Cognitive Legitimacy & Algorithmic Legitimacy Shift (ALS): A Strict Minimax-Risk Proof (Version 10.0)
Subject: Statistical Decision Theory / Cognitive Legitimacy
Author: Manny (Ghost Drift Theoretical Group)
Date: 2026-01-21 (The Perfected Proof)
Abstract:
We introduce Algorithmic Legitimacy Shift (ALS): a decision-theoretic phenomenon in which, under structural coverage constraints ($B < J$), the algorithmic decision channel becomes strictly more legitimate than the human decision channel in the minimax-risk sense.
Formally, the Human Channel admits a sharp, non-vanishing minimax error floor on a noiseless subclass, while the Algorithm Channel achieves exponentially decaying minimax risk in the sample size $m$ under a positive margin condition $\Delta>0$.
Version 10.0 is mathematically closed by (1) defining an extended probability space including internal randomization to rigorously construct human risk; (2) giving an explicit uniform $B$-subset generator to witness achievability of the sharp bound; (3) enforcing consistent notation for induced measures; and (4) linking Bayes risk to minimax risk in the information-theoretic lower bound.
These results yield an explicit sufficient condition on $m$ under which $\mathfrak{R}^\star(\mathsf{Ch}_A) < \mathfrak{R}^\star(\mathsf{Ch}_H)$ holds as a theorem, with no remaining ambiguity in the probabilistic model.
This provides a formal criterion for comparing the legitimacy of evaluation channels (e.g., human review vs. third-party AI assessment) under explicit structural constraints.
1. Measure-Theoretic Foundation & Model Definition
Notation and Probability Space
- Let $[J] := \{1, \dots, J\}$ be the set of fields with $J \ge 2$. Let $[m] := \{1, \dots, m\}$.
- Let $\mathcal{X} := [-1, 1]$ be the observation space equipped with Borel $\sigma$-algebra $\mathcal{B}(\mathcal{X})$.
- Let $\mathcal{P}(\mathcal{X})$ be the set of all probability measures on $\mathcal{X}$.
- Throughout, $\log$ denotes the natural logarithm.
- Randomness Source: We fix the internal randomization space $(\mathcal{U}, \mathcal{B}(\mathcal{U}), \lambda) := ([0, 1]^{T+1}, \mathcal{B}([0,1]^{T+1}), \text{Leb}^{\otimes (T+1)})$. Let $U = (U_1, \dots, U_T, U_{T+1}) \sim \lambda$ with independent coordinates.
Definition 1 (Data-Generating Families and Environment)
Fix a margin $\mu_+ \in (0, 1]$ and a contamination rate $\alpha \in [0, 1)$, and define the effective margin $\Delta := (1-\alpha)\mu_+ - \alpha$. All results below that invoke $\Delta$ assume $\Delta > 0$.
- Clean Families:
$\mathcal{P}_1 := \{P \in \mathcal{P}(\mathcal{X}) : \mathbb{E}_P[X] \ge \mu_+\}$,
$\mathcal{P}_0 := \{P \in \mathcal{P}(\mathcal{X}) : \mathbb{E}_P[X] \le -\mu_+\}$.
- Contamination Family: $\mathcal{Q} := \mathcal{P}(\mathcal{X})$.
- Parameter Space: $\Theta := \{0, 1\}^J$. Truth vector $\vartheta = (\vartheta_1, \dots, \vartheta_J) \in \Theta$.
- Environment Space: $\Omega := \{ \omega=(\vartheta, P, Q) : \vartheta \in \Theta, P \in \prod_{j=1}^J \mathcal{P}_{\vartheta_j}, Q \in \mathcal{Q}^J \}$.
- Data Generation: For each $\omega \in \Omega$ and $j \in [J]$, the observation follows the mixture measure:
$$ M_{j,\omega} := (1-\alpha)P_j + \alpha Q_j. $$
- Target: $g(\vartheta) := \bigwedge_{j=1}^J \vartheta_j \in \{0, 1\}$. Loss $L(\hat{g}, \vartheta) = \mathbf{1}\{\hat{g} \neq g(\vartheta)\}$.
Definition 1.1 (Sampling Mechanism)
- Algorithm sampling: The full observation is $X=(X_{j,i})_{j\in[J],\,i\in[m]}\in\mathcal{X}^{Jm}$.
$$ X_{j,i}\stackrel{\text{i.i.d.}}{\sim} M_{j,\omega}, \quad \text{independent across }(j,i). $$
- Human query sampling: Each query to field $j_t$ yields a fresh sample $x_t$.
Conditional on $\omega$ and $U$ (and thus on the query sequence $j_1, \dots, j_T$), the samples $x_1, \dots, x_T$ are independent and satisfy $x_t \sim M_{j_t, \omega}$.
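As a concrete illustration of Definitions 1 and 1.1, the following Python sketch draws $m$ samples from a contaminated mixture $M_{j,\omega}=(1-\alpha)P_j+\alpha Q_j$. It is a minimal sketch, restricted for simplicity to two-point distributions on $\{+1,-1\}$ (Definition 1 allows arbitrary Borel laws on $[-1,1]$); the function name `sample_mixture` is illustrative only.

```python
import numpy as np

def sample_mixture(m, p_plus, q_plus, alpha, rng):
    """Draw m i.i.d. samples from M = (1-alpha)*P + alpha*Q, where P and Q
    live on {+1, -1} with P(+1) = p_plus and Q(+1) = q_plus."""
    contaminated = rng.random(m) < alpha              # which draws come from Q
    heads = np.where(contaminated, q_plus, p_plus)    # per-draw mass at +1
    return np.where(rng.random(m) < heads, 1.0, -1.0)

rng = np.random.default_rng(0)
mu_plus, alpha = 0.5, 0.1
# Field with theta_j = 1 at the margin, worst-case contamination Q = delta_{-1}:
x = sample_mixture(100_000, (1 + mu_plus) / 2, 0.0, alpha, rng)
print(x.mean())  # approx (1-alpha)*mu_plus - alpha = 0.35 = Delta
```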
2. Procedures and Channels as Statistical Experiments
Definition 2 (Human Protocol $\Pi_H$ and Admissible Class)
Fix $T\in\mathbb{N}$ (total queries) and a unique-field budget $B\in\{1,\dots,J-1\}$.
(W.l.o.g. assume $B \le T$: since $|S_T| \le T$ always, a budget $B > T$ is equivalent to $B = T$. We require $B \ge 1$ because any protocol with $T \ge 1$ queries at least one field, so $\mathrm{Adm}_H(T,0)$ would be empty.)
A human procedure $\Pi_H=(\psi_1,\dots,\psi_T,\phi)$ consists of Borel measurable maps
$$\psi_t:([J]\times\mathcal{X})^{t-1}\times[0,1]\to[J],\quad
\phi:([J]\times\mathcal{X})^{T}\times[0,1]\to\{0,1\}.$$
Given $U=(U_1,\dots,U_T,U_{T+1})$, define iteratively
$$j_t=\psi_t(H_{t-1},U_t),\qquad x_t\sim M_{j_t,\omega},$$
where $H_{t-1} := \big((j_1,x_1),\dots,(j_{t-1},x_{t-1})\big)$ is the history after $t-1$ queries (with $H_0$ the empty history), generating $H_T \in \mathcal{Y}_H := ([J]\times\mathcal{X})^T$.
Define the extended outcome space $\tilde{\mathcal{Y}}_H := \mathcal{Y}_H \times [0, 1]$ equipped with the product measure $\tilde{\mathbb{P}}^{\Pi_H}_\omega := \mathbb{P}^{\Pi_H}_\omega \otimes \text{Leb}$, governing the pair $(H_T, U_{T+1})$.
Let $S_T:=\{j_t:1\le t\le T\}$. The admissible class is
$$\mathrm{Adm}_H(T,B):=\{\Pi_H:\ \forall \omega\in\Omega,\ \mathbb{P}^{\Pi_H}_\omega(|S_T|\le B)=1\}.$$
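The sequential mechanism of Definition 2 can be phrased operationally. The sketch below runs one realization of a human protocol and returns the estimate; `psi` and `phi` stand for the paper's measurable maps $\psi_t$ and $\phi$, `env_sample` is a hypothetical helper drawing from $M_{j,\omega}$, and enforcing $|S_T|\le B$ is left to the caller, mirroring the admissibility condition.

```python
import numpy as np

def run_human_protocol(psi, phi, T, env_sample, rng):
    """One realization of Definition 2: build the history
    H_T = ((j_1, x_1), ..., (j_T, x_T)) and return phi(H_T, U_{T+1})."""
    U = rng.random(T + 1)               # internal randomization U_1..U_{T+1}
    history = []
    for t in range(T):
        j_t = psi(history, U[t])        # adaptive query rule psi_{t+1}
        x_t = env_sample(j_t)           # fresh sample x_t ~ M_{j_t, omega}
        history.append((j_t, x_t))
    queried = {j for j, _ in history}   # S_T; admissibility needs |S_T| <= B
    return phi(history, U[T]), queried
```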
Definition 3 (Algorithm Protocol $\Pi_A$)
The algorithm observes $X \in \mathcal{Y}_A := \mathcal{X}^{Jm}$. The induced law is $P_\omega := \bigotimes_{j=1}^J M_{j,\omega}^{\otimes m}$.
A procedure is a Borel measurable map $\delta: \mathcal{Y}_A \to \{0, 1\}$.
The admissible class is $\text{Adm}_A(m)$, the set of all such maps.
Definition 4 (Channels, Estimators, Risk, Minimax Risk)
Let $g(\vartheta)=\bigwedge_{j=1}^J\vartheta_j\in\{0,1\}$.
Human estimator: for $\Pi_H=(\psi,\phi)$, define
$$ \hat{g}_H := \phi(H_T, U_{T+1}) \in \{0, 1\}. $$
Algorithm estimator: for $\delta$, define
$$ \hat{g}_A := \delta(X) \in \{0, 1\}. $$
Define the (0–1) risk:
$$ R_H(\Pi_H;\omega) := \mathbb{E}_{\tilde{\mathbb{P}}^{\Pi_H}_\omega}[\mathbf{1}\{\phi(H_T, U_{T+1}) \neq g(\vartheta)\}], \quad R_A(\delta;\omega) := \mathbb{E}_{P_\omega}[\mathbf{1}\{\hat{g}_A \neq g(\vartheta)\}]. $$
Then the minimax risks are
$$ \mathfrak{R}^\star(\mathsf{Ch}_H(T,B)) := \inf_{\Pi_H\in \mathrm{Adm}_H(T,B)}\ \sup_{\omega\in\Omega} R_H(\Pi_H;\omega), $$
$$ \mathfrak{R}^\star(\mathsf{Ch}_A(m)) := \inf_{\delta\in \mathrm{Adm}_A(m)}\ \sup_{\omega\in\Omega} R_A(\delta;\omega). $$
Lemma IT (Explicit kernel construction)
Fix $\Pi_H$. The measure $\mathbb{P}^{\Pi_H}_\omega$ on $\mathcal{Y}_H$ is uniquely constructed via the Ionescu-Tulcea theorem from the transition kernels $K_t(h_{t-1}, u_t; \cdot)$ that place the query at $j_t = \psi_t(h_{t-1}, u_t)$ and draw $x_t \sim M_{j_t,\omega}$, averaged over $u_t \sim \text{Unif}[0,1]$.
Finally, since $U_{T+1} \sim \text{Unif}[0,1]$ is independent of the history, the product extension yields a unique induced law $\tilde{\mathbb{P}}^{\Pi_H}_\omega$ on $\tilde{\mathcal{Y}}_H$.
Lemma 0 (Restriction Monotonicity)
For any subset $\Omega' \subset \Omega$, $\inf_{\Pi} \sup_{\omega \in \Omega} R(\Pi; \omega) \ge \inf_{\Pi} \sup_{\omega \in \Omega'} R(\Pi; \omega)$.
3. Sharp Lower Bound for Human Channel
Definition 5 (Noiseless Subclass $\Omega_H^0$)
Define $\Omega_H^0\subset\Omega$ by requiring, for every $j\in[J]$,
$$P_j=\delta_{+1}\ \text{if }\vartheta_j=1,\qquad P_j=\delta_{-1}\ \text{if }\vartheta_j=0,$$
and additionally $Q_j=P_j$ for all $j$.
Hence for every $\omega\in\Omega_H^0$, $M_{j,\omega}=P_j$, so observations are deterministic given $\vartheta$.
Lemma H1 (Coverage Bound)
Fix an admissible $\Pi_H$. Let $\omega^{(1)} \in \Omega_H^0$ denote the "all-true" environment ($\vartheta = (1,\dots,1)$) and define $p_j := \mathbb{P}^{\Pi_H}_{\omega^{(1)}}(j \in S_T)$.
Then $\sum_{j=1}^J p_j = \mathbb{E}^{\Pi_H}_{\omega^{(1)}}[|S_T|] \le B$.
Thus $\min_j p_j \le \frac{1}{J}\sum_{j} p_j \le B/J$, so there exists $K \in \arg\min_{j} p_j$ with $\mathbb{P}^{\Pi_H}_{\omega^{(1)}}(K \notin S_T) = 1 - p_K \ge 1 - B/J$.
Lemma H1' (Hitting Probability Invariance)
Let $\omega^{(0,K)} \in \Omega_H^0$ be the environment with $\vartheta_K = 0$ and $\vartheta_j = 1$ for all $j \neq K$.
For any $K$, $\mathbb{P}^{\Pi_H}_{\omega^{(1)}}(K \in S_T) = \mathbb{P}^{\Pi_H}_{\omega^{(0,K)}}(K \in S_T)$.
Proof:
Work on $\Omega_H^0$. Couple the two experiments under $\omega^{(1)}$ and $\omega^{(0,K)}$ by using the same internal randomness $U$.
Until $K$ is first queried, all observations equal $+1$ in both environments, so the histories, and hence the query sequences, coincide.
The event $\{K \in S_T\}$ is determined by the query sequence. Under the coupling, the two query sequences agree up to and including the first query to $K$ (if any), so $K$ is queried in one experiment iff it is queried in the other, and the two probabilities are equal.
Lemma H2 (Joint Event Identity)
On the event $E = \{K \notin S_T\}$, the pair $(H_T, U_{T+1})$ is a pathwise-identical function of $U$ under $\omega^{(1)}$ and $\omega^{(0,K)}$ (same coupling as in Lemma H1').
Thus, for any event $A$ measurable with respect to $(H_T, U_{T+1})$, in particular any event defined by the estimator $\hat{g}_H$, we have $\tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A \cap E) = \tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(0,K)}}(A \cap E)$.
Theorem H0 (Sharp Minimax Value on $\Omega_H^0$)
$$ \inf_{\Pi_H \in \text{Adm}_H(T, B)} \sup_{\omega \in \Omega_H^0} R_H(\Pi_H; \omega) = \frac{1 - B/J}{2 - B/J}. $$
Lemma H0-Red (Two-Point Reduction on $\Omega_H^0$)
For any admissible $\Pi_H$, $\sup_{\omega\in\Omega_H^0} R_H(\Pi_H;\omega) \ge \max\big\{R_H(\Pi_H;\omega^{(1)}),\ \max_{K} R_H(\Pi_H;\omega^{(0,K)})\big\}$, so it suffices to lower-bound the worst risk over the $J+1$ environments $\omega^{(1)}, \omega^{(0,1)}, \dots, \omega^{(0,J)}$.
Proof (Lower Bound & Tightness):
1. Lower Bound:
Define the event $A := \{ \hat{g}_H = 1 \} \subset \tilde{\mathcal{Y}}_H$ and let $p := \tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A)$.
The risk at $\omega^{(1)}$ is $1-p$, and the risk at $\omega^{(0,K)}$ is $\tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(0,K)}}(A)$.
Since $|S_T| \le B$ almost surely, $\sum_{k=1}^J \tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A \cap \{k \in S_T\}) = \tilde{\mathbb{E}}_{\omega^{(1)}}[|S_T|\mathbf{1}_A] \le Bp$.
So there exists $K$ with $\tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A \cap \{K \in S_T\}) \le (B/J)p$, hence $\tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A \cap \{K \notin S_T\}) \ge (1 - B/J)p$.
By Lemma H2, $\tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(0,K)}}(A) \ge \tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(0,K)}}(A \cap \{K \notin S_T\}) = \tilde{\mathbb{P}}^{\Pi_H}_{\omega^{(1)}}(A \cap \{K \notin S_T\}) \ge (1 - B/J)p$.
The worst-case risk is therefore at least $\max\{1-p,\ (1-B/J)p\}$. The two terms balance at $p^\star = \frac{1}{2-B/J}$, so the minimum over $p \in [0,1]$ equals $\frac{1-B/J}{2-B/J}$.
2. Achievability:
Protocol: Let $\Gamma : [0,1] \to \{S \subseteq [J] : |S| = B\}$ be the uniform $B$-subset generator: partition $[0,1]$ into $\binom{J}{B}$ equal intervals, one per subset, and map each interval to its subset. Use $U_1$ to pick $S = \Gamma(U_1)$ and query each $j \in S$, spending the remaining $T - B$ queries on fields already in $S$, so that $|S_T| \le B$ almost surely. If any observation equals $-1$, output $0$; otherwise use $U_{T+1}$ to output $1$ with probability $p^\star = \frac{1}{2-B/J}$ and $0$ otherwise.
At $\omega^{(1)}$ the risk is $1 - p^\star = \frac{1-B/J}{2-B/J}$; at any $\omega \in \Omega_H^0$ with $g(\vartheta) = 0$ the risk is $\mathbb{P}(S \cap \{j : \vartheta_j = 0\} = \emptyset)\, p^\star \le (1 - B/J)\, p^\star = \frac{1-B/J}{2-B/J}$. Hence the protocol achieves the bound exactly; a Monte Carlo sketch follows.
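The following sketch checks the achievability claim numerically on the noiseless subclass, under the uniform $B$-subset protocol above (the helper name `worst_case_risk` is illustrative):

```python
import numpy as np

def worst_case_risk(J, B, n_trials=100_000, seed=0):
    """Simulated risk of the Theorem H0 protocol on Omega_H^0: query a uniform
    B-subset S; output 0 on any -1, else output 1 w.p. p_star = 1/(2 - B/J)."""
    rng = np.random.default_rng(seed)
    p_star = 1.0 / (2.0 - B / J)
    vote_one = rng.random(n_trials) < p_star          # the U_{T+1} coin
    # omega^{(1)}: every query reads +1, so we err exactly when we output 0.
    risk_true = np.mean(~vote_one)
    # omega^{(0,K)} with K = 0 (by symmetry): err iff 0 escapes S and we vote 1.
    missed = np.array([0 not in rng.choice(J, size=B, replace=False)
                       for _ in range(n_trials)])
    risk_false = np.mean(missed & vote_one)
    return max(risk_true, risk_false)

J, B = 5, 2
print(worst_case_risk(J, B), (1 - B / J) / (2 - B / J))  # both approx 0.375
```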
Corollary H (General Lower Bound)
By Lemma 0, $\mathfrak{R}^\star(\mathsf{Ch}_H(T, B)) \ge \frac{1 - B/J}{2 - B/J} \ge \frac{1}{2}\left(1 - \frac{B}{J}\right)$.
4. Algorithm Channel: Upper and Lower Bounds
Theorem A1 (Explicit Upper Bound)
There exists $\delta \in \text{Adm}_A(m)$ such that
$$ \mathfrak{R}^\star(\mathsf{Ch}_A(m)) \le J \exp\left(-\frac{1}{2} m \Delta^2\right). $$
Proof (Explicit rule and bound):
For each field $j$, let $\bar X_j:=\frac{1}{m}\sum_{i=1}^m X_{j,i}$ and define the decision rule
$$ \delta(X)=\mathbf{1}\{\forall j\in[J],\ \bar X_j \ge 0\}. $$
For any $Q \in \mathcal{Q}$, $\mathbb{E}_Q[X] \in [-1, 1]$.
Hence for $\vartheta_j=1$, $\mathbb{E}_{M_{j,\omega}}[X] \ge (1-\alpha)\mu_+ + \alpha(-1) = \Delta$,
and for $\vartheta_j=0$, $\mathbb{E}_{M_{j,\omega}}[X] \le -(1-\alpha)\mu_+ + \alpha = -\Delta$.
Since $X \in [-1, 1]$ (range 2), Hoeffding's inequality gives
$$ \mathbb{P}_{P_\omega}[\bar X_j < 0 \mid \vartheta_j=1] \le \exp\left(-\frac{2(m\Delta)^2}{m \cdot 2^2}\right) = \exp\left(-\frac{1}{2}m\Delta^2\right). $$
If $g(\vartheta)=1$, an error requires $\bar X_j < 0$ for some $j$, so the union bound over the $J$ fields gives risk at most $J\exp(-\frac{1}{2}m\Delta^2)$. If $g(\vartheta)=0$, fix any $k$ with $\vartheta_k=0$; an error requires $\bar X_k \ge 0$, which by the symmetric Hoeffding bound has probability at most $\exp(-\frac{1}{2}m\Delta^2) \le J\exp(-\frac{1}{2}m\Delta^2)$.
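The rule $\delta$ and the Hoeffding bound can be checked numerically. A minimal sketch, assuming the two-point worst-case environments constructed in Lemma A2-Realize below (helper names are illustrative):

```python
import numpy as np

def algo_rule(X):
    """Theorem A1's rule: declare g = 1 iff every field's sample mean is >= 0."""
    return int((X.mean(axis=1) >= 0).all())

def empirical_risk(J, m, mu_plus, alpha, n_trials=20_000, seed=0):
    """Risk at the all-true environment with worst-case contamination
    Q_j = delta_{-1}, so each coordinate has P(+1) = (1 + Delta)/2."""
    rng = np.random.default_rng(seed)
    delta_margin = (1 - alpha) * mu_plus - alpha
    p_plus = (1 + delta_margin) / 2
    errs = sum(algo_rule(np.where(rng.random((J, m)) < p_plus, 1.0, -1.0)) != 1
               for _ in range(n_trials))
    return errs / n_trials, J * np.exp(-0.5 * m * delta_margin**2)

print(empirical_risk(J=5, m=50, mu_plus=0.5, alpha=0.1))
# empirical risk well below the Hoeffding bound 5*exp(-50*0.35^2/2) ~ 0.23
```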
Lemma A2-Realize (Rademacher Subclass Realizability)
Assume $\Delta \in (0, 1)$. The two-point distributions $P_{+,\Delta}(+1) = \frac{1+\Delta}{2}$ and $P_{-,\Delta}(+1) = \frac{1-\Delta}{2}$ on $\{+1,-1\}$ arise as mixtures $M_{j,\omega}$ within $\Omega$: take $P_j(+1)=\frac{1+\mu_+}{2}$ and $Q_j=\delta_{-1}$ (so $M_{j,\omega} = P_{+,\Delta}$), respectively $P_j(+1)=\frac{1-\mu_+}{2}$ and $Q_j=\delta_{+1}$ (so $M_{j,\omega} = P_{-,\Delta}$).
Lemma (Chi-square Tensorization)
For probability measures $P,Q$ with $P\ll Q$,
$$1+\chi^2(P^{\otimes m}\|Q^{\otimes m})=\big(1+\chi^2(P\|Q)\big)^m.$$
Lemma (Binary $\chi^2$ for $P_{-,\Delta}$ vs $P_{+,\Delta}$)
Let $P_{+,\Delta}(+1)=\frac{1+\Delta}{2}$ and $P_{-,\Delta}(+1)=\frac{1-\Delta}{2}$ on $\{+1,-1\}$.
Then
$$\chi^2(P_{-,\Delta}\|P_{+,\Delta})=\frac{4\Delta^2}{1-\Delta^2}.$$
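A quick numeric sanity check of the two $\chi^2$ lemmas above, computed straight from the definition $\chi^2(P\|Q)=\sum_x (P(x)-Q(x))^2/Q(x)$ (illustrative code):

```python
import numpy as np
from functools import reduce

def chi2(P, Q):
    """chi^2(P || Q) over a finite alphabet, straight from the definition."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    return float(np.sum((P - Q) ** 2 / Q))

d = 0.3
P_plus  = [(1 + d) / 2, (1 - d) / 2]   # masses at (+1, -1)
P_minus = [(1 - d) / 2, (1 + d) / 2]
print(chi2(P_minus, P_plus), 4 * d**2 / (1 - d**2))        # both approx 0.3956

# Tensorization check: m-fold product laws vs (1 + chi^2)^m - 1.
m = 4
Pm = reduce(np.outer, [P_minus] * m).ravel()
Qm = reduce(np.outer, [P_plus] * m).ravel()
print(chi2(Pm, Qm), (1 + chi2(P_minus, P_plus)) ** m - 1)  # identical values
```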
Lemma Chi3 (Mixture $\chi^2$ Identity)
Let $P_0 := P_{+,\Delta}^{\otimes Jm}$ and, for $k \in [J]$, let $P_k$ be the law in which field $k$'s block is $P_{-,\Delta}^{\otimes m}$ while every other block is $P_{+,\Delta}^{\otimes m}$. Let $\bar{P} := \frac{1}{J}\sum_{k=1}^J P_k$.
Then $\chi^2(\bar{P} \| P_0) = \frac{1}{J}\chi^2(P_1 \| P_0)$.
Proof: Let $L_k := dP_k/dP_0$ on the common finite sample space. Then
$$ 1+\chi^2(\bar P \| P_0)=\mathbb{E}_{P_0}\left[\left( \frac{1}{J}\sum_{k=1}^J L_k \right)^2\right]
=\frac{1}{J^2}\sum_{k}\mathbb{E}_{P_0}[L_k^2]+\frac{1}{J^2}\sum_{k \neq \ell}\mathbb{E}_{P_0}[L_k L_\ell]. $$
For $k \neq \ell$, $P_k$ and $P_\ell$ differ from $P_0$ on disjoint field-blocks, so under $P_0$ the likelihood ratios factor and
$\mathbb{E}_{P_0}[L_k L_\ell]=\mathbb{E}_{P_0}[L_k]\mathbb{E}_{P_0}[L_\ell]=1\cdot1=1$.
Also $\mathbb{E}_{P_0}[L_k^2]=1+\chi^2(P_k\|P_0)$, independent of $k$ by symmetry.
Thus
$$1+\chi^2(\bar P \| P_0)=\frac{1}{J}\big(1+\chi^2(P_1\|P_0)\big)+\frac{J-1}{J},
\qquad\text{i.e.}\qquad
\chi^2(\bar P \| P_0)= \frac{1}{J}\,\chi^2(P_1\|P_0).$$
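The mixture identity can be verified exactly for small $J$ and $m$ by enumerating the product sample space. A self-contained sketch (the helpers `chi2` and `product_law` are illustrative):

```python
import numpy as np
from functools import reduce

def chi2(P, Q):
    """chi^2(P || Q) over a finite alphabet."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    return float(np.sum((P - Q) ** 2 / Q))

def product_law(coord_masses):
    """Exact joint law of independent coordinates on {+1,-1}^n, given each
    coordinate's (mass at +1, mass at -1)."""
    return reduce(np.outer, coord_masses).ravel()

J, m, d = 3, 2, 0.3
hi, lo = [(1 + d) / 2, (1 - d) / 2], [(1 - d) / 2, (1 + d) / 2]
P0 = product_law([hi] * (J * m))                       # all fields positive
Pk = [product_law([lo if i // m == k else hi for i in range(J * m)])
      for k in range(J)]                               # field k flipped
P_bar = np.mean(Pk, axis=0)
print(chi2(P_bar, P0), chi2(Pk[0], P0) / J)            # Lemma Chi3: they agree
```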
Lemma (Minimax Dominates Bayes Risk)
For any prior $\pi$ on $\Omega$,
$$\inf_{\Pi}\sup_{\omega\in\Omega}R(\Pi;\omega)\ \ge\ \inf_{\Pi}\mathbb{E}_{\omega\sim\pi}[R(\Pi;\omega)].$$
Lemma TV-$\chi^2$ Inequality
For any probability measures $P,Q$ with $P\ll Q$,
$$\mathrm{TV}(P,Q)\le \frac12\sqrt{\chi^2(P\|Q)}.$$
Consequently, for the two-point prior placing mass $\frac12$ on each hypothesis, the Bayes error satisfies
$$R_{\mathrm{Bayes}} = \frac12\big(1-\mathrm{TV}(P,Q)\big).$$
Theorem A2 (Fundamental Lower Bound)
Assuming $\Delta \in (0,1)$, if $m \le \frac{\log(1+J)}{\log(1+4\Delta^2/(1-\Delta^2))}$, then $\mathfrak{R}^\star(\mathsf{Ch}_A(m)) \ge \frac{1}{4}$.
Proof:
Consider the realizable Rademacher subclass (Lemma A2-Realize). Let $P_0 := P_{+,\Delta}^{\otimes Jm}$ correspond to $g(\vartheta)=1$ (all fields positive) and, for $k \in [J]$, let $P_k$ be the law with field $k$'s block flipped to $P_{-,\Delta}^{\otimes m}$, so $g(\vartheta)=0$. Define $\bar P := \frac{1}{J}\sum_{k=1}^J P_k$.
Define the prior $\pi$ with mass $1/2$ on $P_0$ and $1/(2J)$ on each $P_k$.
By Lemma 0 and Lemma (Minimax Dominates Bayes Risk), $\mathfrak{R}^\star(\mathsf{Ch}_A(m))$ is at least the Bayes risk under $\pi$.
Since $g$ is constant on each hypothesis, the Bayes problem is the binary test of $P_0$ against $\bar P$ with equal weights, whose error is $\frac{1}{2}\big(1 - \mathrm{TV}(P_0, \bar{P})\big)$.
By the tensorization and binary $\chi^2$ lemmas (only field $k$'s block of $m$ coordinates differs), $\chi^2(P_k\|P_0) = \left(1+\frac{4\Delta^2}{1-\Delta^2}\right)^m-1$ for every $k$.
By Lemma Chi3, $\chi^2(\bar P\|P_0) = \frac{1}{J}\chi^2(P_1\|P_0)$.
If this is $\le 1$, the TV-$\chi^2$ inequality gives $\mathrm{TV}(P_0, \bar P) \le \frac{1}{2}$, so the minimax risk is at least $\frac{1}{4}$.
The condition $\chi^2(\bar P\|P_0) \le 1$ rearranges exactly to the stated bound on $m$.
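For concreteness, the largest $m$ for which Theorem A2 keeps the algorithmic minimax risk above $1/4$ can be computed directly (illustrative code):

```python
import numpy as np

def a2_threshold(J, delta):
    """Largest m satisfying Theorem A2's condition: below this many samples
    per field, minimax risk is provably >= 1/4."""
    return np.log(1 + J) / np.log(1 + 4 * delta**2 / (1 - delta**2))

print(a2_threshold(J=10, delta=0.35))  # approx 5.4: m <= 5 forces risk >= 1/4
```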
5. Main Result and Legitimacy
Main Theorem (Strict Minimax Dominance)
Assume $B < J$ and $\Delta > 0$. If
$$ m > \frac{2}{\Delta^2} \log\left(\frac{J(2-B/J)}{1-B/J}\right) $$
then
$$ \mathfrak{R}^\star(\mathsf{Ch}_A(m)) < \mathfrak{R}^\star(\mathsf{Ch}_H(T, B)). $$
Proof:
The condition on $m$ ensures $J\exp\left(-\frac{1}{2}m\Delta^2\right) < \frac{1-B/J}{2-B/J}$.
Combining Theorem A1 and Corollary H yields the result.
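The sample-size condition is fully explicit; a short sketch evaluating the sufficient $m$ for example parameter values (the helper name `sufficient_m` is illustrative):

```python
import numpy as np

def sufficient_m(J, B, delta):
    """Smallest integer m meeting the Main Theorem's strict condition, beyond
    which the algorithm channel strictly dominates the human channel."""
    return int(np.floor((2 / delta**2)
               * np.log(J * (2 - B / J) / (1 - B / J)))) + 1

print(sufficient_m(J=10, B=5, delta=0.35))
# 56: with Delta = 0.35, J = 10, B = 5, any m >= 56 yields strict dominance
```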
Definition (Cognitive Legitimacy Preorder)
For channels $\mathsf{Ch}_1, \mathsf{Ch}_2$, define $\mathsf{Ch}_1 \succeq \mathsf{Ch}_2$ iff $\mathfrak{R}^\star(\mathsf{Ch}_1) \le \mathfrak{R}^\star(\mathsf{Ch}_2)$; this is a total preorder, being the pullback of $\le$ on $[0,1]$. Write $\mathsf{Ch}_1 \succ \mathsf{Ch}_2$ for its strict part, i.e. $\mathfrak{R}^\star(\mathsf{Ch}_1) < \mathfrak{R}^\star(\mathsf{Ch}_2)$.
We then call $\mathsf{Ch}_1$ "strictly more legitimate (in the minimax sense)" than $\mathsf{Ch}_2$.
References
- [1] L. Le Cam, Asymptotic Methods in Statistical Decision Theory, Springer (1986).
- [2] P. J. Huber, "Robust Estimation of a Location Parameter," Annals of Mathematical Statistics (1964).
- [3] A. B. Tsybakov, Introduction to Nonparametric Estimation, Springer (2009).
- [4] W. Hoeffding, "Probability Inequalities for Sums of Bounded Random Variables," JASA (1963).
- [5] C. T. Ionescu Tulcea, "Mesures dans les espaces produits," Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Nat. (1949).