
Beyond SIEM: Cryptographic Evidence vs Security Logs

By Eric Beans, CEO, H33.ai, Inc.

A $12 million wire transfer is disputed eighteen months after it cleared. The counterparty claims the AI system that approved the transaction was operating under a policy that should have flagged the beneficiary's jurisdiction. The bank opens an investigation.

Here is what they find. The application logs from that period were rotated and archived, but the archive format changed during a platform migration six months ago and the old logs are no longer parseable by the current tooling. The vendor who operated the compliance engine at the time of the transaction has since been replaced by a different vendor, and the previous vendor's data retention policy only guaranteed twelve months of decision records. The AI model that made the approval has been updated three times since the transaction occurred — the weights, the thresholds, and the feature set have all changed — and no snapshot of the model state at the time of the decision was preserved.

Screenshots of the approval screen were submitted as evidence, but the counterparty's counsel argues they could have been fabricated using any number of widely available tools. The database that recorded the transaction details was modified during the investigation itself, as part of a routine schema migration that the operations team did not realize would alter timestamp precision on historical records.

The institution cannot prove what the AI saw. It cannot prove why the decision occurred. It cannot prove whether the record changed. It cannot prove what policy was in effect. It cannot prove who authorized the policy. It cannot prove that the model running at the time of the decision is the same model running today. Eighteen months of normal operations — no breach, no intrusion, no malware — and the institution has no evidence. Not because the evidence was destroyed. Because evidence was never created in the first place.

This is not a security failure. It is a continuity failure. The system worked perfectly while everything was normal. Every component performed its function. The compliance engine evaluated policies. The AI model scored transactions. The database recorded results. The logs captured events. But the moment someone asks the foundational question — what actually happened? — the entire evidence infrastructure collapses. Each system recorded its own account of itself. Each account is stored in a format controlled by the system that created it, on infrastructure managed by the vendor that operated it, under retention policies that may or may not still apply. None of these records are independently verifiable. None are tamper-evident. None can prove that the record you are reading today is the same record that existed at the time of the event.

Logs are notes — a system's own account of itself. They are not evidence.

This is the gap that H33-Truth closes. Not with better logging. Not with longer retention. Not with more audit hooks or compliance dashboards. H33-Truth turns system behavior into cryptographic evidence. Not attestations in the narrow sense — not hashes dropped into an append-only store. Evidence. Proof of what happened, what policy executed, what model ran, what data entered, what state existed, and whether the record changed. Evidence that is independently verifiable, tamper-evident, vendor-independent, and durable across cryptographic eras. Evidence that survives platform migrations, vendor replacements, model updates, schema changes, and the passage of time.

The difference between a log and a proof is the difference between a claim and a fact. H33-Truth produces facts.

The Category: Provable Autonomous Infrastructure

The industry does not lack tools. It lacks a category. There are AI security products that monitor model behavior. There are tokenization platforms that protect data at rest. There are compliance engines that evaluate policies. There are FHE systems that compute on encrypted data. There are attestation services that hash and sign records. Each solves a real problem. None of them, individually or in combination, solve the problem of provable continuity — the ability to prove, at any future point, that a specific sequence of events occurred in a specific order under specific policies with specific authorizations and that no part of that sequence has been altered.

Provable continuity is not AI security. It is not tokenization. It is not compliance automation. It is not FHE. Those are components. The category is provable autonomous infrastructure: systems that act, decide, classify, govern, route, settle, authorize, and enforce policy — without requiring trust in operators, infrastructure, or vendors. Systems where the proof is embedded in the operation, not appended after the fact. Systems where every state transition produces a cryptographic commitment that is independently verifiable without access to the underlying data, the operating platform, or the original vendor.

This is the category H33-Truth creates. Not a better audit trail. Not a compliance dashboard with cryptographic features. A provable autonomous infrastructure layer that sits beneath every decision, every policy evaluation, every authorization, and every state change in institutional systems — and produces evidence that survives time, vendors, platforms, and cryptographic eras.

What We Built Before H33-Truth

H33-Truth did not emerge from nothing. It is the orchestration layer that unifies six years of cryptographic primitives into a coherent trust operating system. Each primitive was powerful on its own. H33-74 distills any computation — three post-quantum signatures, decision payloads, binding metadata — into a 74-byte attestation: 32 bytes on-chain, 42 bytes off-chain. H33-Upstream binds provenance at creation, producing 202-byte commitments that anchor every downstream decision to a cryptographic origin. H33-Agent-Zero makes binding decisions on encrypted data using TFHE, eliminating plaintext exposure from compliance workflows. H33-Q-Sign converts organizational governance actions into cryptographic proof chains. HATS — the H33 Autonomous Trust Standard — provides a publicly available technical conformance standard for continuous AI trustworthiness, with independently verifiable evidence that a system satisfies the standard's defined controls. Cachee delivers post-quantum attested caching at sub-microsecond latency.

Each of these primitives is in production. Each has been benchmarked on Graviton4 metal. Each is covered by our patent portfolio (7 patents pending, 300+ claims). But the gap between them was not more cryptography. The gap was connection. The question was never whether we could attest a decision, or prove provenance, or compute on encrypted data. The question was: when an institution needs to reconstruct what happened across six months of automated decisions made by AI agents operating under policies set by humans who delegated authority through organizational hierarchies — can the system prove the entire chain? That is what H33-Truth does.

The Six Layers of H33-Truth

H33-Truth is organized into six layers. Each layer proves a different dimension of institutional truth. Together, they provide complete evidentiary reconstruction: the ability to prove, at any future point, exactly what happened, why it happened, who authorized it, what policy governed it, what model executed it, and whether any part of the record changed.

Layer 1: Policy Graph Engine

Every institutional decision is governed by policy. A wire transfer is approved because a policy says transfers under a certain threshold from accounts in good standing in approved jurisdictions can be auto-approved. An insurance claim is routed because a policy says claims above a certain value with specific indicators must go to a senior adjuster. An AI model is deployed because a policy says models that pass validation with accuracy above a given threshold can be promoted to production. The policy is the reason the decision exists.

In every major institutional system today, policy is a configuration file. It is a set of rules in a database table, or a YAML document in a repository, or a settings page in an admin console. When the policy changes — and it always changes — the previous version is overwritten. The new version takes effect immediately. There is no cryptographic record of what the policy said yesterday. There is no verifiable binding between the policy that was in effect at the time of a decision and the decision itself. When a regulator asks why a specific transaction was approved, the institution can show the current policy. It cannot prove what the policy said when the decision was made.

The Policy Graph Engine makes governance cryptographically replayable. Every policy is versioned. Every version is signed. Every version is attestable via H33-74. When a policy changes, the system does not overwrite the previous version. It appends a new version to an append-only chain, produces a diff attestation that captures exactly what changed between versions, and creates a policy-decision binding that links every decision made under that policy version to the specific version that governed it.

The engine supports 14 rule types and 4 enforcement levels. Rule types cover the full taxonomy of institutional policy: threshold rules, jurisdiction rules, time-window rules, velocity rules, counterparty rules, delegation rules, escalation rules, classification rules, routing rules, retention rules, access rules, rate rules, approval rules, and exception rules. Enforcement levels range from advisory (the policy recommends but does not block) through mandatory (the policy blocks non-compliant actions), with escalation and override levels that require additional authorization signatures.

The question “why was this allowed?” becomes mathematically answerable. The answer is not a log entry. It is a cryptographic proof chain: this decision was made at this time under this policy version, which was signed by these parties, which differed from the previous version by these specific changes, and here is the H33-74 attestation that binds the decision to the policy. Think of it as GitHub for machine governance — every change tracked, every version recoverable, every decision traceable to the exact rule set that produced it.
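To make the append-only versioning concrete, here is a minimal Rust sketch of a policy chain in which every new version commits to its predecessor, so rewriting history breaks the chain. This is a hypothetical illustration, not the production engine: the real system commits with SHA3-256 and attests via H33-74, whereas this sketch uses the standard library's `DefaultHasher` purely to stay dependency-free, and the serialized rule format is invented for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in commitment: production uses SHA3-256; DefaultHasher is used
// here only so the sketch compiles with the standard library alone.
fn commit(data: &str, prev: u64) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    prev.hash(&mut h);
    h.finish()
}

pub struct PolicyVersion {
    pub version: u32,
    pub rules: String,   // serialized rule set (hypothetical format)
    pub commitment: u64, // chained commitment over rules + previous head
}

#[derive(Default)]
pub struct PolicyChain {
    versions: Vec<PolicyVersion>, // append-only: versions are never overwritten
}

impl PolicyChain {
    // Appending produces a new version whose commitment binds to the
    // previous head, so any later edit to history breaks the chain.
    pub fn append(&mut self, rules: &str) -> u32 {
        let prev = self.versions.last().map(|v| v.commitment).unwrap_or(0);
        let version = self.versions.len() as u32 + 1;
        self.versions.push(PolicyVersion {
            version,
            rules: rules.to_string(),
            commitment: commit(rules, prev),
        });
        version
    }

    // A decision records its version number; resolving it answers
    // "what did the policy say at decision time?"
    pub fn version_at(&self, version: u32) -> Option<&PolicyVersion> {
        self.versions.get((version as usize).checked_sub(1)?)
    }
}
```

A decision made under version 1 remains bound to version 1's rule set even after later versions are appended, which is the property the policy-decision binding depends on.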

Layer 2: Agent Governance

AI agents are making consequential decisions in production systems today. They are approving transactions, routing claims, classifying documents, scoring risk, generating reports, and communicating with customers. Each of these agents operates with delegated authority — a human or a policy grants the agent permission to act within certain boundaries. But those boundaries are enforced by the same infrastructure the agent operates on. The agent's compliance with its boundaries is self-reported. The system trusts the agent to stay within its lane because the system designed the lane. There is no independent proof.

Layer 2 makes agent behavior provable. Every agent receives a capability token — a cryptographic credential that specifies exactly what the agent is authorized to do, with what data, up to what value, at what rate, during what time window. The capability token is not a configuration setting. It is a signed, attestable, independently verifiable credential that the agent must present for every action. The infrastructure validates the token before executing the action. If the token does not cover the action, the action is denied.

The system tracks 11 denial reasons: capability not granted, value limit exceeded, rate limit exceeded, time window expired, jurisdiction restricted, counterparty blocked, policy version mismatch, delegation chain broken, approval required, confidence threshold not met, and drift detected. Each denial produces a cryptographic record that is as verifiable as each approval. The institution can prove not only what the agent did, but what the agent was prevented from doing and why.

Drift detection is the critical capability. The system maintains a behavioral baseline for each agent — a statistical profile of the agent's actions over a defined period. When the agent's current behavior deviates from its baseline beyond a configurable threshold, the system flags the drift, produces a cryptographic commitment of the deviation, and can trigger escalation, rate limiting, or suspension depending on the enforcement level. The baseline itself is attested, so the institution can prove what “normal” looked like at any historical point.

Memory snapshots capture the agent's full state at configurable intervals, and rollback capability allows the institution to restore an agent to any previously attested state. The result: AI agents that cannot silently drift, hallucinate authority, or hide actions. Every action is proven. Every boundary is enforced. Every deviation is detected and recorded.

Layer 3: Decision Explainability

Proving that a decision happened is necessary but not sufficient. Institutions must also prove why. Not in the sense of a natural-language explanation — a post-hoc rationalization generated by a separate system. In the sense of a causal chain: these specific factors, evaluated in this specific order, with these specific thresholds, produced this specific outcome. The decision must be explainable as a deterministic function of its inputs, and that function must be independently verifiable.

Layer 3 turns AI decisions into evidence of causality. Every decision is decomposed into a factor-based decision graph — a directed acyclic graph where each node represents a factor that contributed to the outcome and each edge represents a causal dependency. The graph is not a post-hoc reconstruction. It is produced at decision time, as part of the decision computation, and is attested alongside the decision itself.

The system supports 8 factor types. Comparison factors evaluate whether a value is above, below, or equal to a reference. Threshold factors check whether a value crosses a defined boundary. Match factors compare a value against a set of acceptable values. Classification factors assign a category based on a model's output. Temporal factors evaluate timing conditions — whether an event occurred within a window, whether a sequence matches a pattern. Jurisdictional factors apply location-based rules. Aggregate factors combine multiple inputs into a single score. Velocity factors measure the rate of change of a value over time.

Four derivation logics govern how factors combine into outcomes. AllMustPass requires every factor in the graph to evaluate to true. AnyTriggers produces a positive outcome if any single factor evaluates to true. WeightedScore assigns a weight to each factor and produces a composite score. ThresholdCount requires a minimum number of factors to evaluate to true.

Critically, Layer 3 supports 4 exposure levels that control how much of the decision graph is revealed to different parties. OutcomeOnly reveals only the final decision — approved or denied — without any factor information. FactorLabelsOnly reveals the names of the factors that contributed but not their values or thresholds. ThresholdDirections reveals whether each factor was above or below its threshold but not the actual values. WithConfidence reveals the full factor graph including values, thresholds, and confidence scores. This allows institutions to prove WHY a decision was made without exposing WHAT data drove it — satisfying both explainability requirements and data protection obligations simultaneously.

The AI decision becomes evidence of causality. Not a narrative. Not an explanation. A provable causal chain, attested at decision time, independently verifiable, with configurable disclosure.

Layer 4: Cross-Organization Trust Fabric

Institutions do not operate in isolation. A bank that approves a wire transfer relies on the receiving bank's sanctions screening. An insurer that underwrites a policy relies on the reinsurer's risk assessment. A custodian that settles a trade relies on the clearinghouse's validation. Each of these dependencies is currently managed through trust: Bank A trusts that Bank B performed its screening. The insurer trusts that the reinsurer assessed the risk correctly. The custodian trusts that the clearinghouse validated the trade.

That trust is implemented through PDFs, emails, phone calls, and screenshots. A compliance officer at Bank A emails a compliance officer at Bank B requesting confirmation that a beneficiary was screened. Bank B replies with a PDF letter on letterhead confirming the screening. Bank A stores the PDF in a folder. Eighteen months later, when the wire is disputed, the PDF is the evidence. A PDF on letterhead. That is the evidentiary foundation of correspondent banking.

Layer 4 replaces inter-organizational trust with inter-organizational proof. When Bank A needs to verify that Bank B screened a beneficiary, Bank B does not send a letter. Bank B sends an attestation bundle — a cryptographic proof that the screening occurred, under a specific policy version (proven by Layer 1), by a specific agent or system (proven by Layer 2), with a specific causal chain (proven by Layer 3), at a specific time, with the result attested via H33-74. Bank A does not trust Bank B. Bank A verifies the proof. The verification is mathematical, not institutional. It does not require a relationship. It does not require a phone call. It does not require faith in the other institution's internal controls.

The Trust Fabric supports bilateral trust establishment — two organizations can establish a cryptographic trust relationship by exchanging public keys and agreeing on a verification protocol. Once established, attestation bundles flow between organizations as machine-verifiable proofs. Each organization can independently verify the other's claims without accessing the other's systems, data, or infrastructure. The verification produces its own attestation, creating a bilateral proof chain that both organizations can reference in the event of a dispute.

This changes the architecture of correspondent banking, reinsurance, custody, clearing, and settlement. Every inter-organizational dependency that currently relies on trust — SWIFT messages, FedNow confirmations, reinsurance certificates, custody reports — can be replaced with a cryptographic proof that is independently verifiable, tamper-evident, and permanent. No PDFs. No emails. No screenshots. Just proofs.

Layer 5: Continuous Trust Score

Trust is not binary. An institution is not simply trusted or untrusted. Trust exists on a spectrum that changes over time as the institution's behavior, governance, infrastructure, and compliance posture evolve. Today, trust assessments are periodic — an annual audit, a quarterly review, a point-in-time certification. Between assessments, the institution's trust posture is assumed to be unchanged. It is not measured. It is not proven. It is assumed.

Layer 5 replaces periodic trust assessment with continuous cryptographic trust scoring. The system evaluates 9 components in real time: exposure score (how much data is protected by encryption), governance score (how completely governance actions are proven by Q-Sign), audit score (how thoroughly the decision history is attested), policy score (how current and complete the policy graph is), signing score (how consistently cryptographic signatures are applied), infrastructure score (how robust the underlying systems are), agent score (how well AI agents comply with their capability boundaries), replay score (how completely historical states can be reconstructed), and provenance score (how thoroughly data origins are bound by Upstream).

Three weight presets allow different parties to emphasize different components. The equal preset weights all 9 components equally. The insurance-weighted preset emphasizes exposure, audit, and infrastructure — the components most relevant to cyber insurance underwriting. The financial-weighted preset emphasizes governance, policy, and agent scores — the components most relevant to financial regulators and counterparties.

The composite score produces a letter grade from A through F with configurable thresholds. An A means the institution's trust posture across all measured dimensions is at or near maximum. An F means critical deficiencies exist in multiple dimensions. The score itself is attested via H33-74 — the trust score is not a dashboard metric. It is a cryptographic proof of institutional trust posture at a specific moment in time.

Three interfaces expose the score to different audiences. The dashboard interface shows the institution's own team their current trust posture across all 9 dimensions, with historical trends and actionable recommendations. The verify interface allows counterparties to independently verify the institution's trust score without accessing the institution's systems. The HATS portal interface presents the score to cyber insurers and auditors in the format defined by the H33 Autonomous Trust Standard, enabling continuous underwriting instead of annual assessments.

The continuous trust score transforms cyber insurance from a point-in-time bet into a continuous measurement. Insurers no longer need to assume that the institution's posture between annual audits is unchanged. They can verify it in real time, adjust premiums dynamically, and underwrite based on cryptographic proof rather than questionnaire responses.

Layer 6: Human Authorization Continuity

Every automated decision traces back to a human authorization. An AI agent approves a transaction because a human granted it the authority to do so. A policy auto-routes a claim because a human authored the routing rule. A model classifies a document because a human promoted the model to production. The human authorization is the root of the decision tree. And in every institutional system today, that root is a log entry.

Layer 6 makes human authorization provable. Not who clicked a button — who approved, under what authority, at which role state. The system tracks 12 role types that cover the full taxonomy of institutional authority: executive, compliance officer, risk officer, auditor, board member, committee member, delegate, administrator, operator, reviewer, approver, and custodian. Each role type carries specific authorization boundaries. A compliance officer can approve policy changes but cannot promote AI models. A board member can authorize offerings but cannot modify transaction thresholds. The role boundaries are themselves attested — they are not configuration settings, they are cryptographic commitments.

Ten authorizable actions define what can be approved: policy creation, policy modification, agent deployment, agent suspension, model promotion, model rollback, threshold change, delegation grant, delegation revocation, and exception approval. Each action requires specific role types and can require multiple signers. The system enforces these requirements cryptographically — an action without the required signatures cannot produce a valid attestation.

Delegation chains track the flow of authority from the original grantor through every intermediate delegate. When a CEO delegates wire approval authority to a VP, who delegates it to a department head, who delegates it to a team lead, the full delegation chain is recorded as a proof chain. Each delegation is signed by the delegating party, bounded by the scope of the delegator's own authority (you cannot delegate more than you have), and time-limited with configurable expiration. Delegation depth is tracked — the system knows that a specific authorization is three levels removed from the original grant. Revocation propagates: if the VP's delegation is revoked, every downstream delegation is automatically invalidated and the invalidation produces its own attestation.

Role state is reconstructable at any past timestamp. The system can answer the question: on March 15 at 2:47 PM, who held the compliance officer role, what was their authorization scope, and was the delegation chain from the board to the compliance officer unbroken? The answer is not a database query. It is a cryptographic proof that the role state existed as stated at the stated time.

The approval chain is not a log. It is a proof chain. Every human authorization, every delegation, every role assignment, every scope boundary — proven, not recorded.

The Trust Stack: How Everything Connects

H33-Truth does not replace the primitives that came before it. It orchestrates them into a unified evidence pipeline. The flow is directional and each layer builds on the ones beneath it.

H33-Upstream binds provenance at the point of origin — the moment data enters the system. H33-Agent-Zero makes decisions on that data without exposing it. H33-Q-Sign proves that the organizational governance actions authorizing those decisions were properly executed. H33-74 distills every state change — every provenance binding, every encrypted decision, every governance proof — into a 74-byte attestation that is independently verifiable. H33-Truth layers policy, agent behavior, decision causality, cross-organizational trust, continuous scoring, and human authorization on top of this foundation, producing a complete evidentiary record that can be reconstructed at any future point. HATS provides the conformance standard against which the entire stack is measured — independently verifiable evidence that the system satisfies defined controls.

The stack proves different things at different layers. Upstream proves where data came from. Agent-Zero proves that decisions were made without seeing the data. Q-Sign proves that governance was properly authorized. H33-74 proves that each state change is tamper-evident. H33-Truth proves the complete causal chain: what happened, why it happened, who authorized it, what policy governed it, and whether any part of the sequence changed. Together, they provide complete evidentiary reconstruction — from raw data origin through organizational authorization through automated decision through cryptographic attestation.

One incident. Five stack components touched. Complete reconstruction. Not from logs. From proofs.

What H33-Truth Does Not Prove

Precision matters. H33-Truth proves specific things, and it is important to be explicit about what falls outside its scope.

H33-Truth does not prove correctness. It does not verify that a decision was the right decision — only that a specific decision was made, by a specific system, under a specific policy, with a specific causal chain. A policy that allows fraud is still a policy. H33-Truth proves the fraudulent decision was made under that policy. It does not prevent the policy from existing.

H33-Truth does not prove truthfulness. If the data entering the system is false — if an investor lies on their accreditation form, if a counterparty submits fabricated documents — H33-Truth proves that the false data entered the system at a specific time and was processed by specific components. It does not verify the accuracy of the input data itself.

H33-Truth does not prove intent. It cannot determine whether a human who authorized a delegation did so knowingly or under coercion, willingly or through deception. It proves the authorization occurred, under what role state, at what time, with what scope. Intent is a legal determination, not a cryptographic one.

H33-Truth does not prove collusion. If two authorized parties collude to approve a fraudulent transaction, both approvals will produce valid attestations. The system proves that the approvals occurred and that both parties had the authority to approve. Detecting collusion requires analysis beyond the scope of provable continuity.

What H33-Truth does prove: what happened, what policy executed, what model ran, what data entered, what state existed, and whether the record changed. That precision — proving exactly these six things and not overclaiming beyond them — is the source of its evidentiary value. Courts, regulators, insurers, and counterparties can rely on H33-Truth proofs precisely because the system does not claim to prove more than it can.

Production Status

H33-Truth is not a whitepaper. It is compiled, tested, and running.

The system comprises 22 Rust files implementing all six layers. 94 tests cover the full surface area — policy graph operations, agent capability enforcement, decision graph construction and verification, trust fabric attestation exchange, trust score computation, and human authorization chain management. Zero test failures.

Concurrency is handled through DashMap sharded concurrent maps throughout all six layers. There are no global locks in the hot path. State transitions are atomic. Multiple agents, multiple policies, multiple trust score computations can execute concurrently without contention.

Every state change across all six layers produces a SHA3-256 commitment. Every commitment is attestable via H33-74. The attestation chain is append-only. Domain separators ensure that commitments from different layers cannot collide or be confused: 0x31 for Policy, 0x32 for Agent, 0x33 for Decision, 0x34 for Trust Fabric, 0x35 for Trust Score, 0x36 for Org Governance. Each domain separator is registered in the H33-74 attestation protocol, ensuring that a policy commitment cannot be misinterpreted as a decision commitment regardless of content.

The Trust Score UI is live in three interfaces: the institutional dashboard for internal teams, the verify page for counterparties, and the HATS insurer portal for continuous underwriting. All three interfaces read from the same attested trust score, ensuring consistency across audiences.

The system is built on the same Graviton4-optimized Rust stack that runs the rest of the H33 production pipeline. Same cryptographic primitives. Same post-quantum signature families — three independent hardness assumptions. Same 74-byte attestation format. Same SHA3-256 commitment scheme. H33-Truth does not introduce new cryptographic dependencies. It orchestrates the existing ones into a complete evidentiary system.

The End of the Evidence Gap

The industry has spent a decade building systems that record what happened. Better logs. Longer retention. More audit hooks. Fancier dashboards. All of it built on the same fragile assumption: that a system's own account of itself constitutes evidence. It does not. A log is a claim. A database record is a claim. A screenshot is a claim. A PDF is a claim. Claims can be fabricated, altered, rotated, migrated, overwritten, and lost. Claims are not evidence.

H33-Truth proves it.

Six layers. Policy provenance, agent governance, decision causality, cross-organizational trust, continuous scoring, and human authorization continuity. Each layer independently verifiable. Each layer cryptographically bound to the others. Each layer attested via H33-74 with three independent hardness assumptions. Together: complete evidentiary reconstruction. Not from logs. Not from claims. From proofs.

One incident. Six layers. The full causal chain from human authorization through organizational governance through policy execution through agent action through decision factors through cross-organizational verification — reconstructed, verified, and proven. At any future point. Regardless of vendor changes, platform migrations, model updates, or the passage of time.

That is what provable continuity means. That is what H33-Truth delivers.

See H33-Truth in Action

We will walk you through complete evidentiary reconstruction — policy graphs, agent governance, decision explainability, cross-organization trust fabric, continuous trust scoring, and human authorization continuity — on your infrastructure, with your regulatory requirements.

Schedule a Demo