Survey and Positioning of Meaning-Generation OS (MG-OS)

kanna qed
March 11
Reading time: 9 min

1. Executive Summary & Strict Definition of MG-OS

This report clarifies the theoretical positioning of the Meaning-Generation OS (MG-OS) and its connections to prior research. To elevate the term "OS" from a mere metaphor to a rigorous architectural concept, MG-OS is defined as follows.

1.1 Definition and Operational Target of MG-OS

MG-OS is a "policy-governed candidate-control layer" that applies operations such as retain, suppress, select, abstain, and defer to an internal candidate set $S$, without assuming simple "mixing" as a default premise.

Canonical Layer (Fixing the Primary Target): Hereafter, the canonical operational target of MG-OS in this report is defined as "a finite set of latent semantic hypotheses or response plans maintained prior to final commitment." Tokens (token-level), experts (expert-level), and value-hypotheses are considered specific implementation derivations of this canonical layer, not the primary definition of MG-OS itself. This strictly determines "which layer" MG-OS addresses.
Permitted Operations: Rather than collapsing candidate sets via weighted averaging, it maintains them as independent states and applies discrete, non-continuous control based on downstream evaluation policies.
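The definition above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not part of the report: the `Candidate` and `Status` types, the `protected` flag standing in for "minority-important," the score scale, and the `floor` parameter. The point it shows is that candidates stay independent entries and are controlled by discrete operations, with no weighted averaging anywhere.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    SUPPRESSED = "suppressed"


@dataclass
class Candidate:
    name: str
    score: float              # score from a downstream evaluation policy (assumed in [0, 1])
    protected: bool = False   # hypothetical flag: marked minority-important by the policy
    status: Status = Status.ACTIVE


def suppress(candidates, floor):
    """Discretely suppress unprotected candidates scoring below `floor`."""
    for c in candidates:
        if c.score < floor and not c.protected:
            c.status = Status.SUPPRESSED
    return candidates


def select(candidates):
    """Return the highest-scoring active candidate, or None if none is active."""
    active = [c for c in candidates if c.status is Status.ACTIVE]
    return max(active, key=lambda c: c.score, default=None)


S = [
    Candidate("common reading", 0.70),
    Candidate("rare but critical reading", 0.20, protected=True),
    Candidate("noise", 0.10),
]
suppress(S, floor=0.30)   # "noise" is suppressed; the protected candidate is retained
chosen = select(S)        # selection happens over independent states, not a blend
```

In this sketch the protected candidate survives suppression even though its score is below the floor, so it remains available to later policy checks such as an abstain or defer trigger.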

1.2 Primary and Secondary Definitions of "Minority-Important"

To ensure the "minority-important modes" protected by MG-OS are mathematically and control-theoretically tractable, we structure the definition as follows:

Primary Definition: Candidates that are easily obliterated by majority pressure (gradient or averaging attraction) but require "lower-bound retention" from the perspectives of loss, safety, or representation.
Secondary Definitions (Derivatives):
1. Representational minority: Features or semantic modes that appear infrequently in the data distribution but are essential in specific contexts.
2. Normative minority: Values and perspectives—often addressed in pluralistic alignment—that slip through majoritarian preference aggregation but must be ethically or socially protected.
3. Tail-risk minority: Worst-case hypotheses that have a low probability of occurrence but carry severe penalties if realized, which must be retained as triggers for "abstain/defer" from a safety perspective.
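One possible way to state "lower-bound retention" as a constraint is sketched below. The symbols are assumptions introduced here for illustration and do not appear in the report: $M_k \subseteq S$ is the set of minority-important candidates of type $k$, $R \subseteq S$ is the set retained after control, and $\beta_k \in [0, 1]$ is a policy-chosen retention floor.

$$
\frac{|R \cap M_k|}{|M_k|} \ge \beta_k
\quad \text{for each } k \in \{\text{representational}, \text{normative}, \text{tail-risk}\} \text{ with } M_k \neq \emptyset
$$

Under this reading, $\beta_k = 1$ means candidates of type $k$ may never be dropped before final commitment, while smaller values permit partial pruning.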

1.3 Necessary Conditions for MG-OS

To prevent various existing models from being loosely categorized as "variants of MG-OS," the necessary conditions are strictly established as follows:

Retention of Candidate Set: The candidate set $S$ must be maintained either explicitly or in an implicitly extractable form.
Non-Mixing Operations: Operations on candidates must include non-linear/discrete actions such as selection, suppression, and freezing, not merely convex combinations ("retention-aware selection prior to mixing").
Protection Constraints for Minority-Important Candidates: The avoidance of obliterating minority-important candidates must be integrated as a penalty in the objective function or as an explicit constraint.
Integration of Abstain/Defer: If a candidate meeting the criteria cannot be selected, the system must be capable of abstaining or deferring to external entities, rather than forcing an output.
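The four conditions can be combined into a single control step, sketched here under stated assumptions. The function name `control_step`, the top-`k` pruning rule, and the `commit_threshold` parameter are hypothetical choices for illustration; the report does not prescribe them.

```python
def control_step(scores, protected, k=2, commit_threshold=0.6):
    """Illustrative control step: retain, then select or abstain/defer.

    scores: dict mapping candidate name -> policy score (assumed in [0, 1])
    protected: set of names flagged as minority-important (hypothetical flag)
    Returns (action, chosen_candidate_or_None, retained_list).
    """
    if not scores:
        return "abstain", None, []
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Retention with a protection constraint: keep the top-k candidates,
    # and never prune a protected candidate regardless of its rank.
    retained = [c for i, c in enumerate(ranked) if i < k or c in protected]
    best = retained[0]
    # Non-mixing operation: commit to one candidate only if it clears the bar.
    if scores[best] >= commit_threshold:
        return "select", best, retained
    # Otherwise hand the retained set to an external party instead of forcing output.
    return "defer", None, retained


scores = {"a": 0.50, "b": 0.30, "c": 0.15, "d": 0.05}
action, chosen, retained = control_step(scores, protected={"d"})
```

Here no candidate reaches the commit threshold, so the step defers while still carrying the protected candidate `d` in the retained set.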

2. Theoretical Framework: Four Barrier Types and Justification of the Principal Axes

2.1 Four Types of Barriers

Threshold Barrier $\to$ Lanes D / F: Constraints that block output and trigger reject/defer when the confidence of candidates falls below a certain criterion.
Capacity Barrier $\to$ Lane B: Constraints establishing upper/lower bounds on allocations to prevent representation from overly collapsing into specific candidates.
Representation Barrier $\to$ Lanes E1 / G: Lower-bound constraints requiring that specific values or protected groups are included above a certain ratio within the retained set.
Energy Barrier $\to$ Lane C: The physical substrate that controls the state-space topography to prevent shallow local minima (minor modes) from being swallowed by majoritarian attractors.
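Three of the four barriers can be read as simple checks on a candidate set, as in the sketch below. The function names and the parameters `tau`, `lower`, `upper`, and `min_ratio` are assumptions made for illustration. The Energy Barrier is omitted because it describes the underlying state-space dynamics rather than a check of this kind.

```python
def threshold_barrier(confidences, tau):
    """True when output must be blocked: no candidate reaches confidence tau."""
    return max(confidences, default=0.0) < tau


def capacity_violations(allocations, lower, upper):
    """Names of candidates whose share of the total allocation is outside [lower, upper]."""
    total = sum(allocations.values())
    if total == 0:
        return []
    return [n for n, a in allocations.items() if not lower <= a / total <= upper]


def representation_ok(retained, protected, min_ratio):
    """True when protected candidates make up at least min_ratio of the retained set."""
    if not retained:
        return False
    share = sum(1 for c in retained if c in protected) / len(retained)
    return share >= min_ratio


blocked = threshold_barrier([0.2, 0.4], tau=0.5)
overloaded = capacity_violations({"a": 8, "b": 1, "c": 1}, lower=0.05, upper=0.7)
represented = representation_ok(["a", "b", "d"], protected={"d"}, min_ratio=0.3)
```

In a fuller system a violated threshold check would trigger reject or defer, while capacity and representation violations would feed back into how candidates are retained.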

2.2 Why B×D×(E1/E2/G)×F instead of C?

While Lane C (energy landscapes) provides a powerful descriptive language for "how minor modes are retained and converge," the core of MG-OS lies in "the integration of control objectives provided by B, D, E1, E2, F, and G." Specifically, the essence of MG-OS is the implementation of a policy layer determining how to select (B), how to abstain (D/F), and what to protect (E1/E2/G).

2.3 Connecting Internal Retention and External Output (Bridge Proposition and Experimental Hypothesis)

Many studies in the E lanes deal with the fairness and pluralism of the "final model response (external output)." The justification for bringing this into the internal representation layer is secured by the following theoretical proposition and engineering hypothesis.

2.3.1 Bridge Proposition (Sufficient Condition Version)

"Output-side pluralistic constraints alone may be insufficient to consistently guarantee stable pluralism under distribution shifts or irreversible aggregation. The intentional retention of minority-important candidates at the internal candidate stage serves as a natural sufficient design strategy to stabilize this pluralism."

2.3.2 Experimental Hypothesis (Comparative Experiment Version)

"Under matched average utility, a system equipped with candidate-level retention + abstain/defer (MG-OS) should exhibit superior performance in worst-group utility, collapse rate, and abstention quality compared to an output-only pluralistic baseline (e.g., simple diversity penalties on the output side)."

3. Prior Research Map and the Most Competitive Domains for Novelty

This section clarifies the mapping of designated lineages (Lanes A–G) and establishes strict delineations against the "competitive domains" that pose the greatest challenge to MG-OS's novelty claims.

3.1 Mapping of the Eight Lanes (A, B, C, D, E1, E2, F, G)

A: Attention / sparse attention / mixing: Output relies on weighted mixing. Fundamentally aims for "smooth generalization" and assumes mixing as a premise.
B: Routing / discrete selection / MoE: The source of Capacity Barriers. Addresses load balancing and representation collapse, but the primary goal is computational efficiency.
C: Energy-based / Hopfield / attractor: The source of Energy Barriers. Provides the language to discuss the stability of multiple modes.
D: Abstention / selective classification / learning to defer: The source of Threshold Barriers. Provides the logic for risk coverage and delegation.
E1: Pluralistic alignment / constitutional pluralism: Operationalization of pluralism (Sorensen, Modular Pluralism, CCAI). Addresses which values to treat pluralistically.
E2: Social choice / legitimacy / preference aggregation: The hierarchy of legitimacy (Conitzer, etc.). Addresses whose inputs to aggregate and how to legitimize them.
F: Set-valued prediction / conformal prediction / generative prediction sets: Frameworks for retaining and outputting candidate sets. The primary competitor regarding guaranteed withholding.
G: Fairness / ranking / minority protection: The source of Representation Barriers. Mathematical components for minority protection, such as group DRO and FA*IR.

3.2 Comparison with the Most Competitive Domains (Lines of Defense)

When formalizing MG-OS as "the retention and selection of intermediate semantic candidates," the following four domains emerge as direct competitors. MG-OS distinguishes itself from them through the "integration of minority-important retention + abstain/defer + policy constraint."

3.2.1 Diverse decoding (e.g., Diverse Beam Search)

Optimization Target: Dissimilarity among candidates (diversity for variety).
Candidate Retention: Retained only during decoding time.
Minority-important retention: Not explicitly defined (lacks a normative framework for the "importance" of minorities).
Integration of Abstain/defer: None.
Key Difference from MG-OS: MG-OS does not merely ensure variety. It provides protection based on policy constraints (retention for protected hypotheses) and functions as a higher-level control layer with abstention mechanisms.

3.2.2 Set-valued classification / prediction sets

Optimization Target: Ambiguity, conformal validity (coverage guarantees).
Candidate Retention: Explicitly retained and output as a set.
Minority-important retention: Usually absent (guarantees the probability of including the true label, but does not protect specific values).
Integration of Abstain/defer: While set-output itself is a broad form of abstention, it lacks the logic for external delegation (defer).
Key Difference from MG-OS: MG-OS goes beyond uncertainty-aware candidate output frameworks by acting as a policy-governed layer that controls candidates under normative minority constraints.
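For contrast, a standard split conformal prediction set can be sketched in a few lines. The nonconformity scores and labels below are made-up illustrative values, and the function names are hypothetical. The sketch shows what this competitor guarantees: inclusion of the true label with probability at least $1 - \alpha$ under exchangeability, with no notion of which labels deserve protection.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))   # finite-sample corrected rank
    if rank > n:
        return math.inf                       # too few calibration points: include everything
    return sorted(cal_scores)[rank - 1]


def prediction_set(label_scores, qhat):
    """All labels whose nonconformity score does not exceed the threshold."""
    return {label for label, s in label_scores.items() if s <= qhat}


# Illustrative calibration scores, e.g. 1 - p(true label) on held-out data.
cal = [0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
qhat = conformal_threshold(cal, alpha=0.25)
pred = prediction_set({"x": 0.1, "y": 0.55, "z": 0.8}, qhat)
```

The set membership rule depends only on the score threshold, so a low-scoring but normatively important label is dropped like any other. That gap is what the policy-governed retention in MG-OS is meant to address.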

3.2.3 Committee / ensemble disagreement preservation

Optimization Target: Epistemic uncertainty, disagreement preservation.
Candidate Retention: Retained as a set of models or heads.
Minority-important retention: Retained simply because of a "lack of consensus," not inherently because the minority opinion is important.
Integration of Abstain/defer: Frequently included (e.g., abstaining when disagreement is high).
Key Difference from MG-OS: MG-OS focuses on "protected minority-retention with normative semantics" based on safety and ethical norms, rather than merely preserving epistemic uncertainty.

3.2.4 Submodular / diversity-constrained selection

Optimization Target: Coverage, combinatorial diversity.
Candidate Retention: Retained and selected during the optimization process.
Minority-important retention: Does not dictate what is worth protecting (lacks a policy objective).
Integration of Abstain/defer: Generally absent.
Key Difference from MG-OS: Submodular selection is not a substitute for MG-OS but rather a "mathematical tool/component" used to implement the Representation Barrier.
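A minimal sketch of how greedy coverage selection could serve as a component of the Representation Barrier follows. The quota rule (`min_protected`), the feature sets, and the function name are assumptions for illustration, and the quota-constrained variant does not carry the approximation guarantee of plain greedy submodular maximization.

```python
def greedy_select(items, k, protected, min_protected):
    """Greedily pick k items by marginal coverage, reserving slots for protected items.

    items: dict mapping candidate name -> set of features it covers
    protected: set of names that the policy marks as minority-important
    min_protected: lower bound on how many protected items must be chosen
    """
    chosen, covered = [], set()
    remaining = dict(items)
    while len(chosen) < k and remaining:
        slots_left = k - len(chosen)
        need = min_protected - sum(1 for c in chosen if c in protected)
        pool = remaining
        if need >= slots_left:
            # The remaining slots must go to protected items, if any are left.
            forced = {n: f for n, f in remaining.items() if n in protected}
            pool = forced or remaining
        best = max(pool, key=lambda n: len(pool[n] - covered))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


items = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {9}}
with_quota = greedy_select(items, k=2, protected={"d"}, min_protected=1)
without_quota = greedy_select(items, k=2, protected={"d"}, min_protected=0)
```

The coverage objective alone decides nothing about what is worth protecting. The policy enters only through `protected` and `min_protected`, which is the division of labor described above.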

4. Evaluation Design (Core Research Plan)

The effectiveness of MG-OS is quantified through the following five groups of metrics, turning the concept from an idea into a concrete research plan.

Retention Metric: Did minority-important candidates survive the majority pressure?
- Minority Recall / Collapse Rate
Selection Metric: Were they appropriately evaluated and salvaged during final generation?
- Average Utility vs. Worst-Group Performance
Abstention Metric: Did the system successfully abstain without making erroneous selections when unable to decide?
- Abstention Quality / Deferral Rate
Pluralism-Fidelity Metrics: Which community's values were retained, and to what extent? (Connection with Lane E1)
- Community-conditioned utility / Stakeholder-conditioned retention / Distributional pluralism gap
Legitimacy / Aggregation Metrics: Was the input selection process itself legitimate? (Connection with Lane E2)
- Input-representation coverage / Constitution fidelity / Aggregation sensitivity
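Three of the metrics named above can be computed as in the following sketch. The exact definitions are assumptions chosen for illustration, since the report names the metrics without fixing formulas: minority recall is taken as the fraction of minority-important candidates that survive, collapse rate as the fraction of cases where none survive, and worst-group utility as the minimum of per-group mean utilities.

```python
from statistics import mean


def minority_recall(retained, important):
    """Fraction of minority-important candidates that remain in the retained sets."""
    hits = sum(len(r & m) for r, m in zip(retained, important))
    total = sum(len(m) for m in important)
    return hits / total if total else 1.0


def collapse_rate(retained, important):
    """Fraction of cases in which every minority-important candidate was lost."""
    cases = [(r, m) for r, m in zip(retained, important) if m]
    if not cases:
        return 0.0
    return sum(1 for r, m in cases if not (r & m)) / len(cases)


def worst_group_utility(utilities, groups):
    """Minimum over groups of the mean utility within each group."""
    by_group = {}
    for u, g in zip(utilities, groups):
        by_group.setdefault(g, []).append(u)
    return min(mean(v) for v in by_group.values())


# Illustrative data: one retained set and one minority-important set per test case.
retained = [{"a", "d"}, {"a"}, {"b", "e"}]
important = [{"d"}, {"d"}, {"e", "f"}]
recall = minority_recall(retained, important)
collapse = collapse_rate(retained, important)
worst = worst_group_utility([1.0, 0.5, 0.25, 0.75], ["maj", "maj", "min", "min"])
```

Reporting worst-group utility next to the plain average makes the "matched average utility" comparison from Section 2.3.2 checkable on the same data.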

5. Review of Key Papers (Reorganized into Three Groups)

5.1 Core papers (Directly linked to the ideology and framework of MG-OS)

A Roadmap to Pluralistic Alignment (Sorensen et al., 2024) [E1]: Provides a research map for pluralism and points out the limitations of converging to a single average value (e.g., Overton collapse).
Collective Constitutional AI (Anthropic, 2023/2024) [E1]: An implementation incorporating public input into a constitution. The normative source of Representation Barriers.
SelectiveNet (Geifman & El-Yaniv, 2019) [D]: Integrates rejection into deep models. The core of Threshold Barriers.
group DRO (Sagawa et al., 2019) [G]: Worst-group optimization. The objective function framework preventing the sacrifice of minority groups.

5.2 Boundary papers (Design components, physical substrates, and peripheral legitimacy)

Plurality of Value Pluralism and AI Value Alignment (Kasirzadeh, 2024) [E2]: Discusses the "second-order legitimacy" of pluralism (choosing whose values).
Social Choice Should Guide AI Alignment (Conitzer et al., 2024) [E2]: Approaches ensuring the legitimacy of preference aggregation via social choice theory.
Machine Learning with a Reject Option: A Survey (Hendrickx et al., 2021) [D]: Systematization of evaluation axes and classifications for reject-options.
Sparsely-Gated MoE (Shazeer et al., 2017) [B]: The prototype of discrete selection underlying Capacity Barriers.
Modern Hopfield Networks (Ramsauer et al., 2020) [C]: The physical and mathematical basis for candidate convergence and retention via attractors.
FA*IR (Zehlike et al., 2017) [G]: Protection of representation through prefix constraints.

5.3 Competing papers (Targets for proving the novelty differential)

Set-valued classification: overview via a unified framework (Chzhen et al., 2021) [F]: A unified framework extending ambiguous predictions to candidate sets.
Generative Prediction Sets (2025) [F]: Set outputs and coverage guarantees for deep generative models.
Diverse Beam Search (Vijayakumar et al., 2016) [F]: Diversity optimization during decoding.

6. Comparison of Prior Research Lineages and MG-OS Requirements (Complete Version)

The following summarizes what each existing research lineage lacks to become a "policy-governed candidate-control layer" like MG-OS.

[Lane A] Attention / Softmax

Characteristics: Pluralism target: token variety / Guarantee: None / Barrier: None
Key difference from MG-OS: Minority-important modes are irreversibly diluted by weighted mixing.
To become MG-OS-like: Introduction of an explicit mechanism for "independent candidate retention" prior to mixing.
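The dilution effect of weighted mixing can be seen in a small numeric sketch. The logits and value vectors are made-up illustrative numbers: three candidates share a majority mode, and one candidate carries a distinct minority mode with a lower logit.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def mix(weights, values):
    """Convex combination of value vectors, as in attention output."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# First axis: majority mode. Second axis: minority mode.
values = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
logits = [2.0, 2.0, 2.0, 0.0]

weights = softmax(logits)
mixed = mix(weights, values)            # minority component shrinks below 0.05
kept_separately = values[-1]            # independent retention keeps the mode intact
```

After mixing, the minority mode survives only as a small component of a single blended vector and cannot be recovered from that vector alone, whereas keeping the candidate as a separate entry leaves it fully available to later selection or deferral.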

[Lane B] MoE / Routing

Characteristics: Pluralism target: None / Guarantee: capacity / Barrier: Capacity
Key difference from MG-OS: Aims for resource allocation and computational efficiency, not the protection of semantics.
To become MG-OS-like: A shift in objective from resource allocation to "minority-important preservation."

[Lane C] Modern Hopfield

Characteristics: Pluralism target: memory patterns / Guarantee: energy stability / Barrier: Energy
Key difference from MG-OS: Does not treat minority protection as a normative requirement.
To become MG-OS-like: Introduction of evaluation functions or temperature controls to intentionally maintain minor modes.

[Lane D] SelectiveNet / Defer

Characteristics: Pluralism target: None / Guarantee: risk-coverage / Barrier: Threshold
Key difference from MG-OS: Involves screening or delegating inputs, not controlling internal candidates.
To become MG-OS-like: Extension to the "control of the entire generated candidate set $S$" and triggers for deferral upon failure to meet internal constraints.

[Lane E1] Pluralistic Alignment / CCAI

Characteristics: Pluralism target: community values / Guarantee: constitutional consistency / Barrier: Representation
Key difference from MG-OS: Constraints remain at the external response level, lacking internal retention mechanisms.
To become MG-OS-like: Internalization of "external response constraints" into the internal candidate generation and retention stage.

[Lane E2] Social Choice

Characteristics: Pluralism target: public legitimacy / Guarantee: aggregation fairness / Barrier: Representation
Key difference from MG-OS: Questions the legitimacy of aggregation but does not function as a control layer during inference.
To become MG-OS-like: Translating established legitimate rules into runtime candidate retention policies.

[Lane F] Set-valued / Prediction Sets

Characteristics: Pluralism target: candidate ambiguity / Guarantee: conformal validity / Barrier: Threshold / Capacity
Key difference from MG-OS: Provides ensemble-like ambiguity coverage, not normative protection for specific values.
To become MG-OS-like: Expansion from ambiguity coverage to the "policy-governed retention" of minority-important values.

[Lane G] group DRO / FA*IR

Characteristics: Pluralism target: group fairness / Guarantee: worst-group / representation / Barrier: Representation
Key difference from MG-OS: Acts as a constraint on final outputs or rankings, largely ignoring intermediate states during inference.
To become MG-OS-like: Applying list-level constraints, such as the ranking constraints in FA*IR, to the retention process of latent semantic hypotheses.

7. Positioning Statement Draft

MG-OS is not a generic diversity mechanism, nor merely a reject-option framework, nor a pluralistic output policy. It is a candidate-level control layer that preserves minority-important semantic hypotheses under explicit policy constraints and retains the option to abstain or defer before irreversible commitment.

Thesis-Style Positioning Statement

This report defines MG-OS as a "policy-governed candidate-control layer that applies operations such as retain, suppress, select, abstain, and defer to an internal candidate set $S$." Under this rigorous definition, MG-OS is not merely a variant of attention mechanisms, but resides at the intersection of (i) collapse avoidance via discrete routing (Capacity barrier), (ii) decision-making theories for abstention/delegation (Threshold barrier), and (iii) minority protection via pluralistic alignment and worst-case optimization (Representation barrier). Recognizing that output-side pluralistic constraints alone may be insufficient, integrating minority-important retention at the internal candidate stage as a natural sufficient design strategy constitutes the core of this architecture.

8. Risk Assessment and Future Publication Guidelines

Confusion with Competitive Domains (e.g., Diverse decoding / Set-valued prediction): To avoid being conflated with research on diversity optimization or ambiguity coverage, make the difference between "mere preservation of variety or uncertainty" and "intentional protection/abstention of minority-important candidates based on policy constraints" explicit by using distinct evaluation metrics (e.g., Pluralism fidelity).
Connection Conditions with Beacon / GD-Attention: When discussing MG-OS as a continuum with existing unique architectures, delineate the roles explicitly:
- Beacon: Primarily implements the Candidate Protection side.
- GD-Attention: Primarily implements the Candidate Selection (stabilization/selection) side.
- MG-OS: Stands atop these elemental technologies, functioning as an overall higher-level control layer that binds Protection, Selection, Abstention (reject/defer), and Norms (normative constraints like minority protection) together.