An AI agent that cannot prove what it did is an agent you cannot trust in production. H33 makes every agent action -- every tool call, every policy evaluation, every memory checkpoint -- a hash-chained node in an execution DAG attested with three post-quantum signature families. The decision itself becomes an independently verifiable artifact. Any session can be replayed, diffed, and verified by any party without trusting H33, the agent, or the operator.
Each step produces a cryptographic artifact. The execution DAG grows with every action. At session close, the entire DAG is committed as a single H33-74 proof bundle. Any session can be replayed deterministically.
Every agent receives a canonical name following the H33 naming convention: h33.agent.<org>.<domain>.<role>.<env>.<seq>. The registration creates the agent's identity in the governance graph and binds it to the organization's policy hierarchy. The registration itself is attested with H33-74.
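The naming convention can be checked mechanically before registration. The following is a minimal sketch, not part of the @h33/agent SDK: the `parseAgentName` helper is hypothetical, and the allowed characters per segment (lowercase alphanumerics and hyphens, with a numeric sequence) are an assumption that the published spec may define differently.

```typescript
// Hypothetical helper for illustration; not an @h33/agent API.
interface AgentName {
  org: string;
  domain: string;
  role: string;
  env: string;
  seq: string;
}

// Assumption: each segment is lowercase alphanumeric (hyphens allowed after
// the first character) and the sequence is numeric.
const SEGMENT = "[a-z0-9][a-z0-9-]*";
const AGENT_NAME = new RegExp(
  `^h33\\.agent\\.(${SEGMENT})\\.(${SEGMENT})\\.(${SEGMENT})\\.(${SEGMENT})\\.(\\d+)$`
);

// Returns the parsed segments, or null if the name does not fit the pattern.
function parseAgentName(name: string): AgentName | null {
  const match = AGENT_NAME.exec(name);
  if (match === null) return null;
  const [, org, domain, role, env, seq] = match;
  return { org, domain, role, env, seq };
}
```

Keeping the sequence as a string preserves leading zeros, so a name such as h33.agent.acme.finance.analyst.prod.001 round-trips unchanged.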
Opening a session creates the root node of the execution DAG. The root contains the agent identity, the session timestamp, the initial policy snapshot, and the parent session hash (if this is a continuation). Every subsequent action appends a node to the DAG. The DAG is append-only by construction: each node includes the hash of its parent, so modifying any node invalidates every descendant.
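The append-only property follows directly from including the parent hash in each node's hash input. Here is a minimal sketch of that idea, simplified to a linear chain. The `DagNode` shape, the use of SHA-256, and the string payload are assumptions for illustration; the actual node format is defined by the Agent Governance Spec.

```typescript
import { createHash } from "node:crypto";

// Illustrative node shape (assumed); a real node carries more fields.
interface DagNode {
  parentHash: string | null; // null only for the session root
  payload: string;           // serialized action content
  hash: string;              // SHA-256 over parentHash and payload
}

// The parent hash is part of the hashed input, which is what links the nodes.
function hashNode(parentHash: string | null, payload: string): string {
  return createHash("sha256")
    .update(`${parentHash ?? "root"}\n${payload}`)
    .digest("hex");
}

// Appends a node whose parent is the current tip of the chain.
function appendNode(chain: DagNode[], payload: string): DagNode[] {
  const parentHash = chain.length > 0 ? chain[chain.length - 1].hash : null;
  return [...chain, { parentHash, payload, hash: hashNode(parentHash, payload) }];
}

// Returns the index of the first node whose hash or parent link fails,
// or -1 if the whole chain is intact.
function firstInvalidNode(chain: DagNode[]): number {
  for (let i = 0; i < chain.length; i++) {
    const node = chain[i];
    const expectedParent = i === 0 ? null : chain[i - 1].hash;
    if (node.parentHash !== expectedParent) return i;
    if (node.hash !== hashNode(node.parentHash, node.payload)) return i;
  }
  return -1;
}
```

If someone edits a node's payload, that node's stored hash no longer matches. If they also recompute the hash, the next node's parent link breaks instead, so the tampering surfaces one step later rather than disappearing.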
Every external tool call is wrapped in a ToolEnvelope containing the tool name, request payload hash, response payload hash, latency measurement, and the policy scope that authorized the call. The ToolEnvelope is attested and appended to the DAG. An auditor can verify exactly what the agent sent, what it received, and whether the call was authorized.
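To make the envelope contents concrete, the sketch below builds an envelope from a request and response and lets an auditor check payloads against it. The `ToolEnvelopeSketch` field names, SHA-256, and the use of `JSON.stringify` as the serialization are assumptions; a production implementation would need a canonical serialization and the attestation step, both of which are omitted here.

```typescript
import { createHash } from "node:crypto";

// Field names are illustrative, not the published ToolEnvelope schema.
interface ToolEnvelopeSketch {
  tool: string;
  requestHash: string;
  responseHash: string;
  latencyMs: number;
  authorizedBy: string; // policy scope that authorized the call
}

function sha256Hex(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Records hashes of the payloads rather than the payloads themselves.
function buildEnvelope(
  tool: string,
  authorizedBy: string,
  request: object,
  response: object,
  latencyMs: number
): ToolEnvelopeSketch {
  return {
    tool,
    authorizedBy,
    latencyMs,
    requestHash: sha256Hex(JSON.stringify(request)),
    responseHash: sha256Hex(JSON.stringify(response)),
  };
}

// An auditor holding the payloads can confirm they are the ones attested.
function payloadsMatch(
  envelope: ToolEnvelopeSketch,
  request: object,
  response: object
): boolean {
  return (
    envelope.requestHash === sha256Hex(JSON.stringify(request)) &&
    envelope.responseHash === sha256Hex(JSON.stringify(response))
  );
}
```

Because only hashes are stored, the envelope can be shared with an auditor without revealing the request or response until the payloads themselves are disclosed.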
Before every action, the governance engine evaluates five policy scopes: organization, team, project, session, and action. Each scope produces a deterministic boolean. The evaluation is a gate, not an advisory check: if any scope denies the action, the action does not execute. Denials are attested as structured rejection nodes in the DAG -- proving not just what the agent did, but what it was prevented from doing and why.
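The gate behavior can be sketched as a function that evaluates all five scopes and only runs the action when every scope allows it. The types and the `evaluatePolicy` and `gate` helpers are hypothetical. Evaluating every scope instead of stopping at the first denial is a design assumption made here so that the rejection record lists every scope that objected.

```typescript
const SCOPES = ["organization", "team", "project", "session", "action"] as const;
type Scope = (typeof SCOPES)[number];

interface ProposedAction {
  type: string;
  tool?: string;
}

// Each scope rule is a pure function of the proposed action.
type ScopeRule = (action: ProposedAction) => boolean;

interface PolicyDecision {
  authorized: boolean;
  results: Record<Scope, boolean>;
  deniedBy: Scope[];
}

// Evaluates every scope so the decision records all denials, not just the first.
function evaluatePolicy(
  rules: Record<Scope, ScopeRule>,
  action: ProposedAction
): PolicyDecision {
  const results = {} as Record<Scope, boolean>;
  const deniedBy: Scope[] = [];
  for (const scope of SCOPES) {
    results[scope] = rules[scope](action);
    if (!results[scope]) deniedBy.push(scope);
  }
  return { authorized: deniedBy.length === 0, results, deniedBy };
}

type GateOutcome<T> =
  | { executed: true; value: T }
  | { executed: false; deniedBy: Scope[] };

// The action body only runs when the decision is authorized.
function gate<T>(decision: PolicyDecision, run: () => T): GateOutcome<T> {
  if (!decision.authorized) {
    return { executed: false, deniedBy: decision.deniedBy };
  }
  return { executed: true, value: run() };
}
```

The `executed: false` branch is where a rejection node would be attested and appended to the DAG; that step is left out of the sketch.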
At configurable intervals, the agent's working memory is hashed and committed as a checkpoint node in the DAG. The checkpoint captures the agent's state at that moment without exposing the state contents. During replay, the checkpoint hashes serve as intermediate verification points -- if the replayed state hash matches the checkpoint, the execution up to that point is confirmed correct.
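A checkpoint hash is only useful for replay if equal states always hash to the same value, which means the serialization has to be deterministic. The sketch below shows one way to do that by sorting object keys before hashing. The `canonicalize`, `checkpointHash`, and `isCheckpointDue` helpers are hypothetical, and SHA-256 and the JSON-shaped memory are assumptions.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes with object keys sorted so equal states produce equal bytes.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}

// Commits to the state without revealing it: only the digest is recorded.
function checkpointHash(memory: Json): string {
  return createHash("sha256").update(canonicalize(memory)).digest("hex");
}

// True when the action count has reached a multiple of the interval.
function isCheckpointDue(actionCount: number, interval: number): boolean {
  return interval > 0 && actionCount > 0 && actionCount % interval === 0;
}
```

During replay, recomputing `checkpointHash` over the replayed state and comparing it to the recorded digest gives the intermediate verification point described above.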
When the session closes, the entire execution DAG is committed to a final H33-74 proof bundle. The bundle contains the root hash, the final node hash, the total action count, the total tool call count, and the three PQ signatures. The 74-byte bundle is the entire session's audit trail in a single independently verifiable object.
Any session can be replayed using the H33 Verifier CLI. The replay engine re-executes the decision sequence, re-derives every intermediate hash, and compares against the original attestation. If every node matches, the session is confirmed authentic. If any node diverges, the CLI identifies the exact point of divergence.
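The replay logic can be illustrated with a toy model: re-run a deterministic decision function over the recorded inputs, re-derive the hash chain, and report the first index where it departs from the attested hashes. Everything here is an assumption for illustration, including the `Decide` signature, SHA-256, and the string inputs. It is not the H33 Verifier implementation.

```typescript
import { createHash } from "node:crypto";

// The decision function must be deterministic for replay to pass.
type Decide = (input: string) => string;

interface ReplayResult {
  allMatch: boolean;
  iterations: number;             // iterations actually run
  divergenceIndex: number | null; // first mismatching node, if any
}

// Re-executes every decision and chains input, output, and parent hash.
function runOnce(inputs: string[], decide: Decide): string[] {
  const hashes: string[] = [];
  let parent = "root";
  for (const input of inputs) {
    const output = decide(input);
    parent = createHash("sha256")
      .update(`${parent}\n${input}\n${output}`)
      .digest("hex");
    hashes.push(parent);
  }
  return hashes;
}

// Repeats the replay and stops at the first node that diverges.
function replaySession(
  inputs: string[],
  decide: Decide,
  attested: string[],
  iterations: number
): ReplayResult {
  for (let iteration = 0; iteration < iterations; iteration++) {
    const derived = runOnce(inputs, decide);
    const length = Math.max(derived.length, attested.length);
    for (let i = 0; i < length; i++) {
      if (derived[i] !== attested[i]) {
        return { allMatch: false, iterations: iteration + 1, divergenceIndex: i };
      }
    }
  }
  return { allMatch: true, iterations, divergenceIndex: null };
}
```

Running several iterations is what exposes hidden nondeterminism: a decision function that depends on a clock or random source may pass once and fail on a later pass.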
The h33 diff command compares two session receipts node-by-node. This is used for regression testing (did the agent's behavior change between versions?), compliance auditing (did the agent's behavior change between policy updates?), and incident investigation (what exactly changed between the known-good session and the suspected-bad session?).
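A node-by-node comparison can be sketched as follows. The `ReceiptNode` shape and `diffReceipts` helper are hypothetical, not the receipt format used by the CLI. The sketch compares a per-node content hash instead of the chained hash, on the assumption that one is available, because a chained hash differs for every node after the first change and would hide which later nodes actually changed.

```typescript
// Illustrative receipt node (assumed shape).
interface ReceiptNode {
  contentHash: string; // hash of this node's own content, excluding the parent
  description: string;
}

type Change =
  | { index: number; type: "changed"; before: ReceiptNode; after: ReceiptNode }
  | { index: number; type: "added"; after: ReceiptNode }
  | { index: number; type: "removed"; before: ReceiptNode };

// Walks both receipts in order and records every position that differs.
function diffReceipts(before: ReceiptNode[], after: ReceiptNode[]): Change[] {
  const changes: Change[] = [];
  const length = Math.max(before.length, after.length);
  for (let index = 0; index < length; index++) {
    if (index >= before.length) {
      changes.push({ index, type: "added", after: after[index] });
    } else if (index >= after.length) {
      changes.push({ index, type: "removed", before: before[index] });
    } else if (before[index].contentHash !== after[index].contentHash) {
      changes.push({ index, type: "changed", before: before[index], after: after[index] });
    }
  }
  return changes;
}
```

This positional comparison treats an inserted step as a change at every later index; an alignment-based diff would be needed to report insertions more precisely.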
# Replay a session with 100 iterations for statistical confidence
h33 replay session session.json --iterations 100
# Verify a session receipt
h33 verify receipt session-receipt.json
# Diff two session receipts
h33 diff receipt before.json after.json
# Validate conformance vectors
h33 verify conformance agent-vectors-v1.json
The @h33/agent SDK provides the full lifecycle: registration, session management, action attestation, tool envelopes, policy evaluation, memory checkpoints, and replay verification.
import { H33Agent, Session, ToolEnvelope } from '@h33/agent';

// Step 1: Register agent with canonical name
const agent = await H33Agent.register({
  name: 'h33.agent.acme.finance.analyst.prod.001',
  apiKey: process.env.H33_API_KEY,
  endpoint: 'https://api.h33.ai/v1',
  policy: {
    scopes: ['organization', 'team', 'project', 'session', 'action'],
  },
});

// Step 2: Start session
const session: Session = await agent.startSession({
  purpose: 'quarterly-revenue-analysis',
  parentSession: null,           // no continuation
  memoryCheckpointInterval: 10,  // checkpoint every 10 actions
});
console.log(`Session ${session.id} started`);
console.log(`  Root hash: ${session.rootHash}`);

// Step 3: Execute actions with tool calls
const dataResult = await session.action({
  type: 'tool_call',
  tool: 'database.query',
  input: { sql: 'SELECT revenue FROM q1_2026 WHERE region = $1', params: ['EMEA'] },
});
// Every tool call is wrapped in a ToolEnvelope
console.log(`  Tool envelope: ${dataResult.toolEnvelope.hash}`);
console.log(`  Policy authorized: ${dataResult.policyResult.authorized}`);
console.log(`  DAG node: ${dataResult.nodeHash}`);

// Step 4: Agent performs analysis (attested action)
const analysis = await session.action({
  type: 'computation',
  description: 'Calculate YoY growth rate for EMEA revenue',
  input: { currentRevenue: dataResult.output.encrypted, priorRevenue: '...' },
});

// Step 5: Memory checkpoint (automatic at interval, or manual)
const checkpoint = await session.checkpoint();
console.log(`  Checkpoint hash: ${checkpoint.hash}`);
console.log(`  Actions since last checkpoint: ${checkpoint.actionCount}`);

// Step 6: Close session with H33-74 commitment
const receipt = await session.close();
console.log(`Session closed`);
console.log(`  Total actions: ${receipt.actionCount}`);
console.log(`  Total tool calls: ${receipt.toolCallCount}`);
console.log(`  Final DAG hash: ${receipt.dagHash}`);
console.log(`  H33-74 bundle: ${receipt.attestation.bundleSize} bytes`);
console.log(`  Replay-deterministic: ${receipt.replayDeterministic}`);

// Step 7: Replay the session programmatically
const replayResult = await agent.replay({
  sessionReceipt: receipt,
  iterations: 100,
});
console.log(`Replay: ${replayResult.allMatch ? 'PASS' : 'FAIL'}`);
console.log(`  Iterations: ${replayResult.iterations}`);
console.log(`  Divergence point: ${replayResult.divergenceNode ?? 'none'}`);

// Step 8: Diff two sessions
// previousReceipt is a receipt retained from an earlier session; loading it is not shown here
const diff = await agent.diff({
  before: previousReceipt,
  after: receipt,
});
console.log(`Diff: ${diff.identical ? 'identical' : diff.changes.length + ' changes'}`);
for (const change of diff.changes) {
  console.log(`  Node ${change.index}: ${change.type} -- ${change.description}`);
}
# Verify a session receipt via the public API
curl -X POST https://api.h33.ai/v1/verify/session \
  -H "Content-Type: application/json" \
  -d @session-receipt.json
# Replay a session via the public API
curl -X POST https://api.h33.ai/v1/replay/session \
  -H "Content-Type: application/json" \
  -d '{"receipt": "session-receipt.json", "iterations": 100}'
Measured on Graviton4 metal (c8g.metal-48xl, 192 vCPUs). Sustained, not burst. Reproducible via the benchmark suite.
The decision itself becomes an independently verifiable artifact. Not the log of the decision. Not a summary of the decision. The decision -- with its inputs, outputs, policy evaluation, tool interactions, and state transitions -- attested, hash-chained, and replayable. At 24.79 million attestations per second, this is not a theoretical capability. It is production infrastructure.
Every component in the agent governance workflow maps to a published specification, a machine-readable schema, or a conformance vector.
| Component | Specification | Conformance Vector |
|---|---|---|
| Agent registration | Agent Governance Spec | agent-register-v1 |
| Session lifecycle | Agent Governance Spec | session-lifecycle-v1 |
| ToolEnvelope | Agent Governance Spec | tool-envelope-v1 |
| 5-scope policy evaluation | Agent Governance Spec | policy-5scope-v1 |
| Memory checkpoints | Agent Governance Spec | memory-checkpoint-v1 |
| Execution DAG | Agent Governance Spec | execution-dag-v1 |
| Session attestation | H33-74 Proof Bundle Spec | session-attest-v1 |
| Deterministic replay | Governance Replay Spec | replay-agent-v1 |
| Adversarial validation | H33-Chaos Spec | chaos-agent-v1 |
Every claim on this page links to a live demo, a specification, a benchmark, or a CLI command you can run yourself.
Answers to operational questions about agent attestation, execution DAGs, deterministic replay, and conformance validation.
Replayable means that given the same session artifact, any independent party can re-execute the agent's decision sequence and arrive at the identical attestation hash. This is not log replay. It is deterministic cryptographic replay: every action, tool call, policy evaluation, and state transition is hash-chained into an execution DAG. The replay engine re-derives every intermediate hash and compares it against the original attestation. If any node diverges, the replay fails and identifies the exact point of divergence.
Every action produces a node in the execution DAG. The node contains: the action type, the input hash, the output hash, the policy evaluation result (5-scope check), the parent node hash, and a timestamp. The node is signed with three post-quantum signature families and compressed to a 74-byte H33-74 proof bundle. The parent hash creates a hash chain -- modifying any earlier node invalidates every subsequent node in the DAG. This is append-only by construction, not by policy.
A ToolEnvelope wraps every external tool call with a cryptographic attestation. When an agent calls an external API, database, or service, the ToolEnvelope captures: the tool name, the request payload hash, the response payload hash, the latency, the policy scope that authorized the call, and the H33-74 attestation. An auditor can confirm not just that the agent made a tool call, but exactly what it sent, what it received, and whether the call was authorized by the governance policy.
Before every action, the governance engine evaluates the proposed action against five policy scopes: organization, team, project, session, and action. Each scope produces a deterministic boolean result. If any scope denies the action, the action does not execute, and the denial itself is attested as a structured rejection node in the execution DAG. The 5-scope evaluation is hash-chained, meaning the policy decision is as replayable as the action itself.
Conformance vectors are canonical input/output pairs that define correct behavior for the agent governance engine. There are 20 conformance vectors covering policy evaluation, session lifecycle, tool authorization, memory checkpointing, and replay determinism. Each vector specifies an input, the expected output, and the expected attestation hash. Vectors are published at /conformance/agent-vectors/v1/ and are machine-verifiable: h33 verify conformance agent-vectors-v1.json.
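Checking an engine against such vectors amounts to running each input, comparing the output, and comparing the resulting attestation hash. The sketch below shows that loop under stated assumptions: the `Vector` shape, the string-valued inputs and outputs, and the `attest` function (SHA-256 over input and output) are all hypothetical stand-ins for the published vector format.

```typescript
import { createHash } from "node:crypto";

// Illustrative vector shape (assumed); the published vectors define their own schema.
interface Vector {
  id: string;
  input: string;
  expectedOutput: string;
  expectedHash: string;
}

type Engine = (input: string) => string;

// Stand-in attestation: a digest that binds the input to the output.
function attest(input: string, output: string): string {
  return createHash("sha256").update(`${input}\n${output}`).digest("hex");
}

// Returns the ids of vectors whose output or attestation hash does not match.
function failingVectors(vectors: Vector[], engine: Engine): string[] {
  return vectors
    .filter((vector) => {
      const output = engine(vector.input);
      return (
        output !== vector.expectedOutput ||
        attest(vector.input, output) !== vector.expectedHash
      );
    })
    .map((vector) => vector.id);
}
```

An empty result means the engine conforms to every vector supplied; a non-empty result names exactly which vectors to investigate.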
Run the Prove-Agent demo. Replay a session. Diff two receipts. Validate the conformance vectors. Every claim is backed by a system you can test.