Why Measurement Matters
Governance infrastructure can degrade silently. A signing key expires and nobody rotates it. A policy update fails to propagate to all enforcement points. An attestation chain develops a gap because a detection hook was disabled during a deployment. None of these failures produce errors. They produce absence — missing attestations, stale signatures, inconsistent policy states.
Without measurement, you do not know your governance is degraded until you need it and discover it is broken. The Operational Integrity Score (OIS) is designed to make governance health observable, continuous, and quantitative.
The Six Dimensions
OIS is computed across six dimensions, each measuring a specific property of governance health. Every dimension is scored independently on a 0.0-1.0 scale, and the six scores are then combined into a weighted composite.
1. Decision Reproducibility
Can governance decisions be independently reconstructed from the evidence chain? This dimension measures whether the attestation records contain sufficient information for a third-party verifier to replay any decision and arrive at the same outcome. A score of 1.0 means every decision in the measurement window is fully reproducible. A score below 1.0 means some decisions have incomplete evidence or ambiguous inputs.
Measurement method: sample decisions from the attestation chain, run them through the replay engine, compare outputs against recorded outcomes. Any divergence reduces the score.
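The sample, replay, and compare loop can be sketched in a few lines. This is a minimal illustration, not the OIS implementation: the record layout (`inputs`, `outcome`), the `replay_engine` function, and the choice to score the dimension as the matching fraction of the sample are all assumptions made for the example.

```python
import random
from typing import Callable, Mapping, Sequence


def reproducibility_score(
    decisions: Sequence[Mapping],
    replay: Callable[[Mapping], str],
    sample_size: int,
    seed: int = 0,
) -> float:
    """Fraction of sampled decisions whose replayed outcome matches the record."""
    if not decisions or sample_size <= 0:
        return 1.0  # assumption: nothing sampled means nothing diverged
    rng = random.Random(seed)  # seeded so the sample itself can be reproduced
    sample = rng.sample(list(decisions), min(sample_size, len(decisions)))
    matches = sum(1 for d in sample if replay(d["inputs"]) == d["outcome"])
    return matches / len(sample)


# Hypothetical replay engine: amounts up to 100 are allowed.
def replay_engine(inputs: Mapping) -> str:
    return "allow" if inputs["amount"] <= 100 else "deny"


decisions = [
    {"inputs": {"amount": 50}, "outcome": "allow"},
    {"inputs": {"amount": 500}, "outcome": "deny"},
    {"inputs": {"amount": 80}, "outcome": "allow"},
    {"inputs": {"amount": 900}, "outcome": "allow"},  # diverges on replay
]
score = reproducibility_score(decisions, replay_engine, sample_size=4)
print(score)  # 0.75
```

With all four decisions sampled, the one divergent record lowers the score to 0.75.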
2. Agent Scope Enforcement
Are agent authority boundaries being maintained? This dimension measures whether every agent action in the measurement window has a corresponding scope check attestation. Missing scope checks indicate that the enforcement layer is not operating on every action.
Measurement method: for every action node in the Agent Execution DAG, verify that a scope check node exists as an immediate ancestor. Missing checks, expired scope objects, and scope objects with gaps all reduce the score.
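As a rough sketch of the ancestor check, the DAG can be modeled as two dictionaries: one mapping each node to its kind, one mapping each node to its immediate parents. The node kinds (`"action"`, `"scope_check"`), the dictionary layout, and the coverage-fraction score are assumptions for illustration. Expired scope objects and scope objects with gaps are not modeled here.

```python
from typing import Dict, List


def scope_enforcement_score(
    kinds: Dict[str, str], parents: Dict[str, List[str]]
) -> float:
    """Fraction of action nodes with a scope check among their direct parents."""
    actions = [node for node, kind in kinds.items() if kind == "action"]
    if not actions:
        return 1.0  # assumption: no actions means nothing went unchecked
    covered = sum(
        1
        for action in actions
        if any(kinds.get(p) == "scope_check" for p in parents.get(action, []))
    )
    return covered / len(actions)


# act-1 is preceded by a scope check; act-2 follows act-1 directly, unchecked.
kinds = {"check-1": "scope_check", "act-1": "action", "act-2": "action"}
parents = {"act-1": ["check-1"], "act-2": ["act-1"]}
print(scope_enforcement_score(kinds, parents))  # 0.5
```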
3. Evidence Survivability
Will the evidence remain verifiable over time? This dimension measures whether the cryptographic components of the evidence chain are using non-deprecated algorithms, unexpired keys, and durable storage. Evidence signed with a key that expires next month has lower survivability than evidence signed with a key valid for 10 years.
Measurement method: inspect signing key validity periods, algorithm deprecation status, storage redundancy levels, and on-chain anchor frequency.
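Two of these inspections, key validity and algorithm deprecation, can be illustrated with a small check that returns findings rather than a score, since the text does not specify how findings translate into the dimension score. The key record fields, the 90-day horizon, and the contents of `DEPRECATED_ALGORITHMS` are assumptions chosen for the example. Storage redundancy and anchor frequency are left out.

```python
from datetime import date
from typing import Dict, List, Tuple

# Illustrative list only; a real deployment would track current guidance.
DEPRECATED_ALGORITHMS = {"md5", "sha1"}


def survivability_findings(
    keys: List[Dict], today: date, min_remaining_days: int
) -> List[Tuple[str, str]]:
    """Return (key id, issue) pairs for keys that threaten long-term verifiability."""
    findings = []
    for key in keys:
        remaining = (key["not_after"] - today).days
        if remaining < 0:
            findings.append((key["id"], "expired"))
        elif remaining < min_remaining_days:
            findings.append((key["id"], "expiring_soon"))
        if key["algorithm"].lower() in DEPRECATED_ALGORITHMS:
            findings.append((key["id"], "deprecated_algorithm"))
    return findings


keys = [
    {"id": "key-a", "algorithm": "sha256", "not_after": date(2035, 1, 1)},
    {"id": "key-b", "algorithm": "SHA1", "not_after": date(2025, 2, 1)},
]
findings = survivability_findings(keys, today=date(2025, 1, 15), min_remaining_days=90)
print(findings)
```

Here `key-a` produces no findings, while `key-b` is flagged twice: once for expiring within the horizon and once for its algorithm.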
4. Policy Continuity
Are policies consistently applied without gaps? This dimension measures whether there are any intervals where no valid policy was in effect. A policy gap means the system was operating without defined governance rules — actions taken during a gap cannot be evaluated against any standard.
Measurement method: walk the policy version chain and verify that each policy's end-of-validity falls at or after the next policy's start-of-validity, so that no interval is left uncovered. Any gap reduces the score in proportion to its duration.
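One way to turn gap duration into a score is to subtract the uncovered fraction of the measurement window from 1.0. The sketch below assumes that formula and assumes policies are given as `(start, end)` validity intervals. Neither detail is specified in the text.

```python
from datetime import datetime
from typing import List, Tuple

Interval = Tuple[datetime, datetime]


def policy_continuity_score(
    policies: List[Interval], window_start: datetime, window_end: datetime
) -> float:
    """1.0 minus the fraction of the window during which no policy was valid."""
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window_end must be after window_start")
    covered_until = window_start
    gap_seconds = 0.0
    for start, end in sorted(policies):
        # Clamp each validity interval to the measurement window.
        start = min(max(start, window_start), window_end)
        end = min(end, window_end)
        if end <= covered_until:
            continue  # adds no new coverage
        if start > covered_until:
            gap_seconds += (start - covered_until).total_seconds()
        covered_until = end
    gap_seconds += (window_end - covered_until).total_seconds()
    return 1.0 - gap_seconds / window


policies = [
    (datetime(2025, 1, 1, 0, 0), datetime(2025, 1, 1, 6, 0)),
    (datetime(2025, 1, 1, 12, 0), datetime(2025, 1, 3, 0, 0)),
]
continuity = policy_continuity_score(
    policies, datetime(2025, 1, 1), datetime(2025, 1, 2)
)
print(continuity)  # 0.75
```

The six-hour gap between the two policy versions covers a quarter of the 24-hour window, so the score is 0.75.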
5. Signer Diversity
Are multiple independent signers active? Relying on a single signing key creates a single point of failure and a single point of compromise. This dimension measures the number of independent signing authorities participating in the attestation chain.
Measurement method: count distinct signing keys in the measurement window, verify they are operated by independent parties (different key custodians, different infrastructure), check that no single signer dominates the chain.
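The three checks map onto three simple metrics. The sketch below reports them without combining them into a score, because the text does not say how they are weighted. The attestation layout (`key_id`) and the `custodian_of` lookup are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def signer_diversity_metrics(
    attestations: List[Dict], custodian_of: Dict[str, str]
) -> Dict[str, float]:
    """Distinct keys, distinct custodians, and the share of the busiest signer."""
    counts = Counter(a["key_id"] for a in attestations)
    total = sum(counts.values())
    custodians = {custodian_of[key_id] for key_id in counts}
    top_share = max(counts.values()) / total if total else 0.0
    return {
        "distinct_keys": len(counts),
        "independent_custodians": len(custodians),
        "top_signer_share": top_share,
    }


attestations = [{"key_id": k} for k in ["k1", "k1", "k1", "k2"]]
custodian_of = {"k1": "org-a", "k2": "org-a"}  # both keys held by one party
metrics = signer_diversity_metrics(attestations, custodian_of)
print(metrics)
```

In this example two keys are active, but both belong to the same custodian and one key signs 75% of the chain, so the apparent diversity is weaker than the key count alone suggests.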
6. Verification Independence
Can third parties verify without trusting the operator? This dimension measures whether the evidence chain is structured so that an external party with no relationship to the operator can perform complete verification. Verification that requires operator cooperation (e.g., providing decryption keys, granting API access) reduces independence.
Measurement method: attempt verification using only publicly available information (on-chain anchors, published verification keys, standard algorithms). Any verification step that requires operator assistance reduces the score.
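If each verification step is recorded along with whether it needed operator assistance, the dimension can be approximated as the fraction of steps completed from public information alone. That scoring rule and the step names below are assumptions made for the sketch.

```python
from typing import List, Tuple


def verification_independence_score(steps: List[Tuple[str, bool]]) -> float:
    """steps holds (step name, needed operator assistance) pairs."""
    if not steps:
        return 1.0  # assumption: no steps attempted, nothing required the operator
    independent = sum(1 for _, needed_operator in steps if not needed_operator)
    return independent / len(steps)


steps = [
    ("fetch on-chain anchor", False),
    ("load published verification key", False),
    ("verify signature", False),
    ("decrypt attestation payload", True),  # operator had to supply a key
]
print(verification_independence_score(steps))  # 0.75
```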
The Scoring Algorithm
The composite OIS is a weighted sum of the six dimension scores:
    OIS = w1*decision_reproducibility
        + w2*agent_scope
        + w3*evidence_survivability
        + w4*policy_continuity
        + w5*signer_diversity
        + w6*verification_independence

    where sum(w1..w6) = 1.0
Default weights emphasize decision reproducibility (0.25) and verification independence (0.20) as the most critical dimensions. The remaining four dimensions share the other 0.55 equally (0.1375 each), so the defaults sum to 1.0. Weights are configurable per deployment, but the default profile is recommended for comparability across organizations.
| Dimension | Default Weight | Rationale |
|---|---|---|
| Decision Reproducibility | 0.25 | Core governance property; without it, nothing else matters |
| Verification Independence | 0.20 | Distinguishes real governance from self-reported claims |
| Agent Scope Enforcement | 0.1375 | Critical for AI governance use cases |
| Evidence Survivability | 0.1375 | Long-term value of governance evidence |
| Policy Continuity | 0.1375 | Gaps in policy coverage are gaps in governance |
| Signer Diversity | 0.1375 | Resilience against single-point compromise |
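The weighted sum with the default profile can be written directly from the table. The function name, the dictionary keys, and the validation checks are illustrative choices. The weights are the defaults listed above.

```python
import math
from typing import Dict

DEFAULT_WEIGHTS = {
    "decision_reproducibility": 0.25,
    "verification_independence": 0.20,
    "agent_scope": 0.1375,
    "evidence_survivability": 0.1375,
    "policy_continuity": 0.1375,
    "signer_diversity": 0.1375,
}


def composite_ois(
    scores: Dict[str, float], weights: Dict[str, float] = DEFAULT_WEIGHTS
) -> float:
    """Weighted sum of the six dimension scores."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")
    if any(not 0.0 <= value <= 1.0 for value in scores.values()):
        raise ValueError("dimension scores must lie in [0.0, 1.0]")
    return sum(weights[name] * scores[name] for name in weights)


scores = {name: 1.0 for name in DEFAULT_WEIGHTS}
scores["policy_continuity"] = 0.6  # one degraded dimension
ois = composite_ois(scores)
print(round(ois, 4))  # 0.945
```

With five perfect dimensions and policy continuity at 0.6, the composite loses 0.1375 × 0.4 = 0.055 and lands at 0.945.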
Threshold Classification
The composite OIS maps to three classification levels:
| Score Range | Classification | Meaning |
|---|---|---|
| > 0.95 | Healthy | All governance dimensions functioning normally. Evidence chain is complete, verifiable, and current. |
| 0.80 – 0.95 | Attention | One or more dimensions degraded. Intervention recommended within 24 hours. |
| < 0.80 | Critical | Governance infrastructure has significant gaps. Immediate action required. |
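The table translates into a small classifier. The function name is hypothetical. The boundary handling follows the table as written: exactly 0.95 and exactly 0.80 both fall in Attention.

```python
def classify_ois(score: float) -> str:
    """Map a composite OIS to its classification level."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("OIS must lie in [0.0, 1.0]")
    if score > 0.95:
        return "Healthy"
    if score >= 0.80:
        return "Attention"
    return "Critical"


for value in (0.97, 0.95, 0.80, 0.79):
    print(value, classify_ois(value))
```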
Degradation and Recovery Events
OIS is not a static snapshot. It is recomputed continuously (default: every 60 seconds), so degradation events are detected in near real time, within about one measurement interval.
A degradation event occurs when the composite OIS drops below a classification threshold (0.95 or 0.80) or when any individual dimension drops by more than 0.1 in a single measurement interval. Degradation events generate alerts and are themselves attested: the fact that governance degraded becomes part of the permanent evidence chain.
A recovery event occurs when a previously degraded dimension returns to its normal range. Recovery events are also attested, creating a verifiable record of both the degradation and the remediation.
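Degradation detection can be sketched by comparing two consecutive measurements. The snapshot layout (`ois`, `dimensions`) and the event tuples are assumptions, as is the reading of "a threshold" as the 0.95 and 0.80 classification boundaries. Recovery events are omitted because the text does not define a dimension's normal range.

```python
from typing import Dict, List, Tuple

LEVELS = ["Healthy", "Attention", "Critical"]


def level(ois: float) -> int:
    """Index into LEVELS; a higher index is worse."""
    if ois > 0.95:
        return 0
    if ois >= 0.80:
        return 1
    return 2


def degradation_events(prev: Dict, curr: Dict, max_drop: float = 0.1) -> List[Tuple]:
    """Compare two consecutive snapshots and list the degradation events."""
    events: List[Tuple] = []
    before, after = level(prev["ois"]), level(curr["ois"])
    if after > before:
        events.append(("classification", LEVELS[before], LEVELS[after]))
    for name, old in prev["dimensions"].items():
        drop = old - curr["dimensions"][name]
        if drop - max_drop > 1e-9:  # small tolerance for float rounding
            events.append(("dimension_drop", name))
    return events


prev = {"ois": 0.97, "dimensions": {"policy_continuity": 1.0, "signer_diversity": 0.9}}
curr = {"ois": 0.93, "dimensions": {"policy_continuity": 0.7, "signer_diversity": 0.9}}
events = degradation_events(prev, curr)
print(events)
```

This pair of snapshots yields two events: the composite moved from Healthy to Attention, and policy continuity fell by 0.3 within one interval.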
OIS vs. compliance scores. Compliance scores measure adherence to a specific standard at a point in time. OIS measures the continuous operational health of the governance infrastructure itself. A system can be fully compliant at audit time but have a degraded OIS because its governance mechanisms are silently failing between audits.
Dashboard Integration
The OIS is designed to be consumed programmatically. It is a number, not a narrative. This means it can be integrated into:
- Security dashboards — OIS as a panel alongside other operational metrics
- Insurance platforms — Continuous underwriting signal that updates in real-time
- Compliance automation — Trigger remediation workflows when OIS drops below threshold
- Board reporting — Trend lines showing governance health over time
- API access — Programmatic retrieval of current OIS with per-dimension breakdown
Frequently Asked Questions
What is the Operational Integrity Score (OIS)?
A weighted composite score from 0.0 to 1.0 measuring governance infrastructure health across six dimensions. Computed continuously from cryptographic evidence, not from periodic assessments or self-reported questionnaires.
What are the six dimensions of operational integrity?
Decision reproducibility, agent scope enforcement, evidence survivability, policy continuity, signer diversity, and verification independence. Each measures a specific property of governance health on an independent 0.0-1.0 scale.
What do the OIS threshold classifications mean?
Above 0.95 = Healthy (all dimensions functioning). 0.80-0.95 = Attention (degradation detected, intervention recommended). Below 0.80 = Critical (significant governance gaps, immediate action required).
How does OIS differ from compliance scores?
Compliance scores measure adherence to a standard at a point in time. OIS measures continuous operational health of the governance infrastructure itself. A system can be compliant at audit time but have a low OIS due to degradation between audits.
How is OIS used in cyber insurance?
As a continuous underwriting signal. Instead of annual security questionnaires, OIS provides real-time, cryptographically verifiable governance health. A sustained OIS above 0.95 shows that governance held up between audits, which a point-in-time audit report cannot demonstrate.