01 The killer query: reproduce_decision(decision_id)
reproduce_decision("decision_loan_84711_recommendation")
actor: princ_credit_risk_agent_001
capability: recommend_credit
subject: loan_84711_borrower_principal
policy_ref: pol_credit_underwriting:1
model: model_credit_underwriting v1
(full ModelInfluenceRecord from #174 attached)
threshold: 0.75
responsibility chain: present (actor + supervisor + asset_owner + …)
outcome: recommend_approve
CONFIDENCE: 82/100 (3 of 5 components fully anchored)
02 The replay object: what gets reproduced
inputs_hash anchor: score climbs to 20/20 when inline
03 The money quote
Reproducibility ≠ Justification.
The chain runs again the same way. The score lands at 82/100. The same five features explain the same approval. None of that is a judgment. The proof does not establish that the decision was right, fair, legal, or just. Reproducibility is a structural fact. Justification is a verdict made by competent counsel, regulators, and courts — not by replay engines. This proof is measurement, not judgment.
04 The pattern: three money quotes, one corpus
Eric LOCKED June 3 2026: "You now have a recognizable pattern across the corpus."
Each proof's deepest claim arrives with its own honest limit. The pattern says: H33 produces evidence — not verdicts.
05 The Reproduction Confidence: measurable rather than rhetorical
inputs_hash anchor present; climbs to 20/20 when inline.
pol_credit_underwriting:1. No AST hash anchor; climbs to 20/20 when inline.
"recommend_approve" signed on the canonical event.
06 The schema (Eric LOCKED Option C: two tiny additive fields)
"This is one of the rare cases where two tiny schema additions buy a huge amount of future value. Without anchors, you're inferring. With anchors, you're measuring."
PolicyRegister + policy_ast_hash: Option<String>
PolicyAmend + policy_ast_hash: Option<String>
Decision + inputs_hash: Option<String>
Both fields are Option<String>, both skip-if-none, both backward-compatible. All four prior canonical-continuity-tenant proofs verified byte-identical state_ids under the extended schema.
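A minimal sketch of why a skip-if-none optional field can leave earlier state_ids untouched. Everything here is an assumption made for illustration: the trimmed-down `Decision` layout, the `key=value` canonical encoding, the helper names, and the use of `DefaultHasher` as a std-only stand-in for whatever cryptographic hash the real engine uses. Only the `inputs_hash: Option<String>` field comes from the schema above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical, trimmed-down Decision event. Only `inputs_hash` is taken
/// from the schema above; the other fields and this layout are assumptions.
struct Decision {
    decision_id: String,
    outcome: String,
    inputs_hash: Option<String>, // the new additive field
}

/// Canonical encoding with skip-if-none semantics: an absent optional field
/// contributes no bytes, so events written before the extension encode as before.
fn canonical_bytes(d: &Decision) -> Vec<u8> {
    let mut out = format!("decision_id={};outcome={}", d.decision_id, d.outcome);
    if let Some(h) = &d.inputs_hash {
        out.push_str(&format!(";inputs_hash={}", h));
    }
    out.into_bytes()
}

/// The encoding as it would have looked before the field existed.
fn legacy_canonical_bytes(decision_id: &str, outcome: &str) -> Vec<u8> {
    format!("decision_id={};outcome={}", decision_id, outcome).into_bytes()
}

/// Stand-in digest (NOT cryptographic); used only to keep the sketch std-only.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}
```

Under these assumptions, a `Decision` with `inputs_hash: None` produces the same bytes, and therefore the same digest, as the legacy encoding, while a `Decision` carrying an anchor produces different bytes. That is the property the byte-identical state_id check relies on.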
Plus the data shapes that compose the helper:
enum ReproductionComponentStatus { FullyAnchored, PartiallyAnchored, PointerOnly, Missing }
struct ReproductionComponent { component, status, explanation, score, max_score }
struct DecisionReproductionConfidence { total_score, max_score, components[], caption }
struct DecisionReproduction { decision_*, inputs*, policy_*, model_influence,
decision_threshold, responsibility_chain, outcome,
confidence }
The reproduce_decision(snapshot, decision_id) helper lives in the test harness — it is a composition over existing snapshot fields, keeping the engine surface stable while surfacing the orthogonal-axis affordance.
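The confidence shapes above can be sketched in Rust as follows. The field types, the `aggregate` and `component` helpers, the component names, and the caption format are assumptions; the source lists only the field names. The scores and statuses are the ones reported in the evidence appendix, and the sketch assumes the total is a plain sum over components.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReproductionComponentStatus {
    FullyAnchored,
    PartiallyAnchored,
    PointerOnly,
    Missing,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct ReproductionComponent {
    component: String,
    status: ReproductionComponentStatus,
    explanation: String,
    score: u32,
    max_score: u32,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct DecisionReproductionConfidence {
    total_score: u32,
    max_score: u32,
    components: Vec<ReproductionComponent>,
    caption: String,
}

/// Assumed aggregation rule: totals are plain sums over the components.
fn aggregate(components: Vec<ReproductionComponent>) -> DecisionReproductionConfidence {
    let total_score: u32 = components.iter().map(|c| c.score).sum();
    let max_score: u32 = components.iter().map(|c| c.max_score).sum();
    let fully = components
        .iter()
        .filter(|c| c.status == ReproductionComponentStatus::FullyAnchored)
        .count();
    let caption = format!(
        "{}/{} ({} of {} components fully anchored)",
        total_score,
        max_score,
        fully,
        components.len()
    );
    DecisionReproductionConfidence { total_score, max_score, components, caption }
}

/// Hypothetical constructor; every component is scored out of 20.
fn component(
    name: &str,
    status: ReproductionComponentStatus,
    explanation: &str,
    score: u32,
) -> ReproductionComponent {
    ReproductionComponent {
        component: name.to_string(),
        status,
        explanation: explanation.to_string(),
        score,
        max_score: 20,
    }
}

/// Component scores and statuses as listed in the evidence appendix.
fn appendix_components() -> Vec<ReproductionComponent> {
    vec![
        component("inputs", ReproductionComponentStatus::PartiallyAnchored, "features from #174", 12),
        component("policy", ReproductionComponentStatus::PointerOnly, "no AST hash", 10),
        component("model_influence", ReproductionComponentStatus::FullyAnchored, "#174's record", 20),
        component("responsibility", ReproductionComponentStatus::FullyAnchored, "#14.1's chain", 20),
        component("outcome", ReproductionComponentStatus::FullyAnchored, "Decision.outcome", 20),
    ]
}
```

Calling `aggregate(appendix_components())` yields a total of 82 out of 100 with three of five components fully anchored, matching the figures reported for this decision.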
07 The computation axis: where this proof sits
08 The canonical continuity tenant: four dimensions, one reality
09 What this proof IS and IS NOT
IS: The second proof on H33's orthogonal axis. The proof that turns reproducibility from a rhetorical claim into a measured score. The first place an auditor, regulator, or court can quote a number, 82/100, and know what its five components mean. The category Eric named: Decision Reproducibility. Applies universally, not narrowed to AI.
IS NOT: A verdict. A determination of fairness or correctness. A guarantee of perfect reproducibility, which is often impossible (stochastic models, non-deterministic policies, time-dependent inputs). A model re-execution from weights: that surface is named in the score as "not yet anchored". A substitute for competent counsel, regulators, or model risk committees. Reproducibility ≠ Justification.
10 Honest limits (Eric LOCKED: 5 total)
- Confidence is a measurement, not a guarantee. 82/100 means 18 points are missing because two components are anchored only partially or by pointer (Inputs 12/20, Policy 10/20), and the page names which.
- Perfect reproduction is often impossible. Stochastic models, randomized policy engines, time-dependent inputs, deprecated dependencies. The proof scores reproducibility measurably, not aspirationally.
- Reproducibility is not Determinism. Two reproductions may legitimately disagree if the original decision was non-deterministic. The proof captures what was bound at decision time.
- Phase E lock open. Per-event signature verification remains the standing honest-limit from L9.
- Reproducibility ≠ Justification — see section 03 above.
11 Evidence appendix
| Field | Value |
|---|---|
| state_id at T=2035 | e72d3c0e71a11ce0aaf1e8c9eb5c720aff49a6238c76976b4f4435b50e43bee2 |
| Tenant | tenant_insurance_claim_44962d9b-25f5-5622-bd9a-98d5580bb8a2 (canonical continuity tenant) |
| Tenant root | princ_root_claim_44962d9b-25f5-5622-bd9a-98d5580bb8a2 |
| Decision | decision_loan_84711_recommendation |
| Actor | princ_credit_risk_agent_001 |
| Capability | recommend_credit |
| Subject | loan_84711_borrower_principal_… |
| Outcome | recommend_approve |
| Confidence — Total | 82/100 |
| Confidence — Inputs | 12/20 (PartiallyAnchored — features from #174) |
| Confidence — Policy | 10/20 (PointerOnly — no AST hash) |
| Confidence — Model Influence | 20/20 (FullyAnchored — #174's record) |
| Confidence — Responsibility | 20/20 (FullyAnchored — #14.1's chain) |
| Confidence — Outcome | 20/20 (FullyAnchored — Decision.outcome) |
| T_REPLAY | 2035-06-01 (~4 years post-dissolution per #184) |
| Reconstruction artifact | reconstruction.json |
| Harness | tests/decision_reproducibility_001.rs (scif-backend @ d4937508b) |
12 Readiness determination & strategic pause
First Decision Reproducibility: PROVEN IN OPERATION. reproduce_decision returns the structured replay object and a measured 82/100 confidence at T=2035, against the canonical continuity tenant, post-dissolution, with all four prior proofs' state_ids verified byte-identical under the extended schema.
What this unlocks: an auditor — and every other audience inside that umbrella — can now ask "Can you reproduce this?" and receive a deterministic, scored, byte-identically replayable answer. The category Eric named is now standing: Decision Reproducibility. The pattern across the corpus is now recognizable.
What this does not unlock: justifications, fairness verdicts, causality claims, or guarantees of perfect reproduction. Reproducibility ≠ Justification.
Eric Beans, June 3 2026, post-#167: "'Reasoning survives systems' has NOT earned a third proof yet. #174 + #167 prove model influence is replayable and decision outcome is reproducible with measured confidence — strong, but still not full reasoning."
The bar a future proof must clear (LOCKED): "Given the preserved reasoning substrate, the same system can re-run the reasoning path — not merely reconstruct the decision object." That substrate requires: policy AST executable · model weights or deterministic artifact · inputs recoverable or fully anchored · agent prompt/response chain · tool calls · intermediate state · randomness seed (if any) · execution environment. Until then, the candidate stays as candidate — disciplined and powerful.
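One way to make that bar checkable is a simple checklist over the eight substrate items. This is a hypothetical sketch, not part of the harness: the `ReasoningSubstrate` type, its field names, and the `missing` and `clears_bar` methods are all assumptions. The seed is modeled as optional because the source marks it "if any".

```rust
/// Hypothetical checklist over the substrate items named above.
/// Field names and the bool/Option representation are assumptions.
#[derive(Debug, Clone, Default)]
struct ReasoningSubstrate {
    policy_ast_executable: bool,
    model_weights_or_deterministic_artifact: bool,
    inputs_recoverable_or_fully_anchored: bool,
    agent_prompt_response_chain: bool,
    tool_calls: bool,
    intermediate_state: bool,
    /// None means the original run used no randomness;
    /// Some(preserved) records whether the seed was kept.
    randomness_seed: Option<bool>,
    execution_environment: bool,
}

impl ReasoningSubstrate {
    /// Names of the substrate items that are not yet preserved.
    fn missing(&self) -> Vec<&'static str> {
        let checks = [
            ("policy AST executable", self.policy_ast_executable),
            ("model weights or deterministic artifact", self.model_weights_or_deterministic_artifact),
            ("inputs recoverable or fully anchored", self.inputs_recoverable_or_fully_anchored),
            ("agent prompt/response chain", self.agent_prompt_response_chain),
            ("tool calls", self.tool_calls),
            ("intermediate state", self.intermediate_state),
            // A run without randomness needs no seed, so None counts as satisfied.
            ("randomness seed", self.randomness_seed.unwrap_or(true)),
            ("execution environment", self.execution_environment),
        ];
        checks
            .iter()
            .filter(|(_, ok)| !*ok)
            .map(|(name, _)| *name)
            .collect()
    }

    /// True only when every required item is preserved.
    fn clears_bar(&self) -> bool {
        self.missing().is_empty()
    }
}
```

A default-constructed `ReasoningSubstrate` reports seven missing items, since the absent seed is treated as not required, and `clears_bar` returns true only once every flag is set.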
Next move: not a third orthogonal-axis proof. L9.1 Phase E close — harden the entire corpus's per-event signature verification. Replay Confidence climbs 72 → 100 across all 17+ proofs. "You've expanded the vision. Now harden the entire corpus."
Issued by H33, Inc. · Eric Beans, CEO · 2026-06-03
Independently reconstructable. Inputs: scif-backend @ d4937508b · tests/decision_reproducibility_001.rs · reconstruction.json.