Agent Infrastructure Benchmark | Graviton4 Metal

Single-Threaded Performance

Baseline Operations

All measurements were taken on a single core, with no parallelism and no warm-up (cold-start only).

Operation	Scale	Latency	Throughput
Receipt Generation	10K nodes	2.49 µs/node	401K/sec
DAG Insertion	10K nodes	—	243,727 nodes/sec
DAG Insertion	50K nodes	—	237,923 nodes/sec
DAG Insertion	100K nodes	—	237,777 nodes/sec
Replay	10K-node DAG	2.50 ms avg	400 replays/sec
Root Hash	100 nodes	42.15 µs	—
Root Hash	1K nodes	334.91 µs	—
Root Hash	10K nodes	5.36 ms	—
Root Hash	100K nodes	45.70 ms	—
Subgraph Extraction	10K DAG, 10 sessions	392.33 µs avg	—
Integrity Verification	1K nodes	1.31 ms	—
Integrity Verification	10K nodes	15.90 ms	—
Integrity Verification	100K nodes	166.51 ms	—
Session Lifecycle	1K sessions	28.95 µs/session	34.5K/sec
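The latency and throughput columns above are two views of the same measurement: single-threaded throughput is the reciprocal of per-item latency. The short Rust sketch below reproduces that arithmetic from the table's figures and also derives a per-node cost for integrity verification. The function names are illustrative and are not part of the benchmark source.

```rust
/// Converts a per-item latency in microseconds to items per second.
fn throughput_per_sec(latency_us: f64) -> f64 {
    1_000_000.0 / latency_us
}

/// Converts a total run time in milliseconds into a per-item cost in microseconds.
fn per_item_us(total_ms: f64, items: f64) -> f64 {
    total_ms * 1_000.0 / items
}

fn main() {
    // Latencies copied from the table above.
    println!("receipts/sec ~ {:.0}", throughput_per_sec(2.49)); // 2.49 µs per node
    println!("sessions/sec ~ {:.0}", throughput_per_sec(28.95)); // 28.95 µs per session
    println!("replays/sec  ~ {:.0}", throughput_per_sec(2_500.0)); // 2.50 ms per replay

    // Integrity verification: total time divided by node count.
    for (total_ms, nodes) in [(1.31, 1_000.0), (15.90, 10_000.0), (166.51, 100_000.0)] {
        println!("{nodes:>7} nodes: {:.2} µs/node", per_item_us(total_ms, nodes));
    }
}
```

Under these figures, verification cost stays between roughly 1.3 and 1.7 µs per node across two orders of magnitude, which is consistent with approximately linear scaling in DAG size.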

Parallel Scaling

192-Core Throughput Scaling

Thread count was swept from 1 to 192. All operations were measured under sustained load; replay and verification results are reported only up to 64 threads.

Operation	1 thread	4 threads	16 threads	64 threads	192 threads	Scaling (highest reported vs. 1 thread)
Receipt gen	399K/s	1.57M/s	6.29M/s	23.90M/s	24.79M/s	62x
Sessions	33.7K/s	133K/s	514K/s	1.61M/s	2.12M/s	63x
Replays	42.9K/s	100K/s	149K/s	174K/s	—	4x
Verification	440K/s	1.31M/s	2.36M/s	2.37M/s	—	5.4x

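Replay and verification scaling is bounded by shared DAG reads: verification is flat between 16 and 64 threads (2.36M/s vs. 2.37M/s). Operations on independent DAGs (receipts, sessions) scale near-linearly through 16 threads, and receipts through 64 threads; both reach 62x to 63x at 192 threads.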
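The scaling column can be recomputed from the table as the ratio of a multi-threaded rate to the single-thread rate. Dividing that ratio by the thread count gives parallel efficiency, where 1.0 would be perfectly linear scaling. The sketch below does this for every reported point; the helper names are illustrative only.

```rust
/// Speedup of a multi-threaded run over the single-thread baseline.
fn speedup(base_ops: f64, scaled_ops: f64) -> f64 {
    scaled_ops / base_ops
}

/// Parallel efficiency: speedup divided by thread count (1.0 = perfectly linear).
fn efficiency(base_ops: f64, scaled_ops: f64, threads: u32) -> f64 {
    speedup(base_ops, scaled_ops) / f64::from(threads)
}

fn main() {
    // (operation, 1-thread ops/sec, [(threads, ops/sec)]) copied from the table above.
    let rows: Vec<(&str, f64, Vec<(u32, f64)>)> = vec![
        ("Receipt gen", 399e3, vec![(4, 1.57e6), (16, 6.29e6), (64, 23.90e6), (192, 24.79e6)]),
        ("Sessions", 33.7e3, vec![(4, 133e3), (16, 514e3), (64, 1.61e6), (192, 2.12e6)]),
        ("Replays", 42.9e3, vec![(4, 100e3), (16, 149e3), (64, 174e3)]),
        ("Verification", 440e3, vec![(4, 1.31e6), (16, 2.36e6), (64, 2.37e6)]),
    ];
    for (name, base, points) in rows {
        for (threads, ops) in points {
            println!(
                "{name:<13} {threads:>3} threads: {:>5.1}x speedup, {:>5.1}% efficiency",
                speedup(base, ops),
                100.0 * efficiency(base, ops, threads)
            );
        }
    }
}
```

Read this way, receipt generation holds roughly 94% efficiency at 64 threads, and the step from 64 to 192 threads adds comparatively little throughput.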

Deterministic Replay

Replay Validation

Replay determinism verified across timestamps, concurrency, DAG growth, and insertion ordering.

Replay Determinism Score: 100.0% (50-node DAG × 10 timestamps × 100 replays = 50,000 operations)

Concurrent Replay: Deterministic (8 threads replaying the same DAG simultaneously)

Post-Growth Replay: Deterministic (replay after DAG growth produces identical output)

Every replay produced byte-identical output
Concurrent replay (8 threads): deterministic
Replay after DAG growth: deterministic
Insertion-order independence: verified
Methodology: 50-node DAG × 10 timestamps × 100 replays = 50,000 operations
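The four properties listed above can be illustrated with a small, self-contained sketch. It is a toy model, not the scif-backend implementation: the `Dag` type, the `replay` function, and the "include every node at or before a timestamp" semantics are all assumptions made for illustration. The one detail taken from the methodology is the use of a BTreeMap, which iterates in key order regardless of insertion order.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Hypothetical DAG model: node id -> (timestamp, payload). Not the real scif-backend type.
type Dag = BTreeMap<u64, (u64, String)>;

/// Toy replay: serializes every node with timestamp <= `as_of`, in key order.
fn replay(dag: &Dag, as_of: u64) -> Vec<u8> {
    let mut out = Vec::new();
    for (id, (ts, payload)) in dag {
        if *ts <= as_of {
            out.extend_from_slice(&id.to_be_bytes());
            out.extend_from_slice(&ts.to_be_bytes());
            out.extend_from_slice(payload.as_bytes());
        }
    }
    out
}

/// Builds a DAG by inserting nodes in the order the iterator yields them.
fn build(ids: impl Iterator<Item = u64>) -> Dag {
    ids.map(|id| (id, (id * 10, format!("action-{id}")))).collect()
}

fn main() {
    let as_of = 490; // timestamp of the last of the 50 nodes (ts = id * 10)
    let forward = build(0..50);
    let reference = replay(&forward, as_of);

    // Insertion-order independence: reversed insertion yields the same bytes.
    assert_eq!(reference, replay(&build((0..50).rev()), as_of));

    // Repeated replay: every run is byte-identical.
    for _ in 0..100 {
        assert_eq!(reference, replay(&forward, as_of));
    }

    // Concurrent replay: 8 threads read the same DAG simultaneously.
    let outputs: Vec<Vec<u8>> = thread::scope(|s| {
        let handles: Vec<_> = (0..8).map(|_| s.spawn(|| replay(&forward, as_of))).collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    assert!(outputs.iter().all(|o| *o == reference));

    // Post-growth replay: nodes added later do not change a historical replay.
    let mut grown = forward.clone();
    grown.extend(build(50..60));
    assert_eq!(reference, replay(&grown, as_of));

    println!("all determinism checks passed ({} bytes per replay)", reference.len());
}
```

Comparing raw bytes rather than parsed structures is the design choice that makes "byte-identical" a meaningful claim: any difference in ordering or encoding would show up as a mismatch.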

Adversarial Validation

54 Attack Vectors Tested

Full adversarial test suite covering DAG tampering, replay attacks, session exploitation, and identity forgery.

Category	Tests	Attack Vectors
DAG Tampering	19	Forged receipts, payload mutation, deleted/inserted/reordered nodes, circular dependencies, cross-session injection, 1000-node stress, cascade corruption
Replay Attacks	13	1000-iteration determinism, 8-thread concurrent, fork divergence, timestamp sensitivity, growth stability
Session Attacks	12	Expired session, wrong agent, scope violations, race conditions, delegation depth, chain breaks
Identity Attacks	10	Forged ID, revoked key, impersonation, 100-agent stress, collision resistance

0 failures. Every attack detected. Every corruption caught.
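As a rough illustration of why payload mutation, deletion, and reordering are detectable, the sketch below links each node's digest to its parent's digest. It makes several loud assumptions: it uses a linear chain rather than a DAG, the `Node`, `append`, and `first_corrupt` names are hypothetical, and it substitutes the standard library's non-cryptographic `DefaultHasher` for SHA3-256 so that it runs without external crates. Only the `H33_AGENT_V1` domain separator comes from the methodology.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Domain separator named in the methodology section.
const DOMAIN: &str = "H33_AGENT_V1";

/// Hypothetical node: a payload plus a digest covering the parent digest and the payload.
#[derive(Clone)]
struct Node {
    payload: String,
    digest: u64,
}

/// Stand-in digest. NOT cryptographic: DefaultHasher replaces SHA3-256 for illustration only.
fn digest(parent: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    DOMAIN.hash(&mut h);
    parent.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

/// Appends a node whose digest is chained to the previous node (0 for the first node).
fn append(chain: &mut Vec<Node>, payload: &str) {
    let parent = chain.last().map_or(0, |n| n.digest);
    chain.push(Node { payload: payload.to_string(), digest: digest(parent, payload) });
}

/// Returns the index of the first node whose stored digest fails to recompute, if any.
fn first_corrupt(chain: &[Node]) -> Option<usize> {
    let mut parent = 0;
    for (i, node) in chain.iter().enumerate() {
        if digest(parent, &node.payload) != node.digest {
            return Some(i);
        }
        parent = node.digest;
    }
    None
}

fn main() {
    let mut chain = Vec::new();
    for p in ["open", "read", "write", "close"] {
        append(&mut chain, p);
    }
    assert_eq!(first_corrupt(&chain), None);

    let mut mutated = chain.clone();
    mutated[2].payload = "delete".to_string(); // payload mutation
    println!("mutation detected at {:?}", first_corrupt(&mutated));

    let mut deleted = chain.clone();
    deleted.remove(1); // deleted node
    println!("deletion detected at {:?}", first_corrupt(&deleted));

    let mut reordered = chain.clone();
    reordered.swap(1, 2); // reordered nodes
    println!("reorder detected at {:?}", first_corrupt(&reordered));
}
```

This sketch cannot detect truncation of the tail on its own; that requires comparing the final digest against a separately stored root value.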

Implications

What These Numbers Mean

A 100,000-action AI agent session can be fully verified in about 167 ms (166.51 ms measured, single-threaded)
Replay at any historical timestamp takes 2.5 ms on average for a 10K-node DAG
Across 192 cores, receipt generation reaches 24.79M attestations/sec
Every action is independently verifiable via a public proof endpoint
Deterministic replay means any two implementations produce identical output from identical input

Methodology

How We Measured

No warm-up. No cherry-picking. Cold-start measurements. Full methodology published and reproducible.

Hardware: AWS c8g.metal-48xl (Graviton4 Neoverse V2), 192 vCPUs, 377 GiB
Software: Rust 1.95.0, release profile, system allocator (not jemalloc)
Hashing: SHA3-256 with H33_AGENT_V1 domain separator
Ordering: BTreeMap for deterministic key ordering
Timing: std::time::Instant (monotonic clock)
Warm-up: None. Cold-start measurements only.
Selection: No cherry-picking. All runs reported.
Reproducible: cargo run --example agent_bench --release
Source: examples/agent_bench.rs in scif-backend
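For readers who want a feel for the measurement style described above, here is a minimal sketch of a cold-start thread sweep timed with `std::time::Instant`. It is not the contents of `examples/agent_bench.rs`: the `work` function is a hypothetical stand-in workload, the `measure` helper and the iteration counts are arbitrary, and the timed region includes thread spawn cost.

```rust
use std::hint::black_box;
use std::thread;
use std::time::Instant;

/// Hypothetical stand-in workload: a short integer mixing loop, not a real receipt.
fn work(seed: u64) -> u64 {
    let mut x = seed;
    for _ in 0..100 {
        x = x
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
    }
    x
}

/// Runs `ops_per_thread` operations on each of `threads` threads.
/// Returns (total operations, operations per second). No warm-up pass is performed.
fn measure(threads: usize, ops_per_thread: u64) -> (u64, f64) {
    let start = Instant::now(); // monotonic clock
    thread::scope(|s| {
        for t in 0..threads {
            s.spawn(move || {
                for i in 0..ops_per_thread {
                    // black_box keeps the optimizer from discarding the work.
                    black_box(work(t as u64 ^ i));
                }
            });
        }
    }); // the scope joins all threads before returning
    let secs = start.elapsed().as_secs_f64();
    let total = threads as u64 * ops_per_thread;
    (total, total as f64 / secs)
}

fn main() {
    // Same thread counts as the parallel scaling table.
    for threads in [1, 4, 16, 64, 192] {
        let (total, rate) = measure(threads, 1_000);
        println!("{threads:>3} threads: {total} ops, {rate:.0} ops/sec");
    }
}
```

On a machine with fewer than 192 hardware threads the higher counts will oversubscribe the CPU, so the absolute numbers from this sketch are not comparable to the published results.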

Reproduce

Reproduce These Results

Clone, build, run. No configuration required.

        
# Clone the repository
git clone https://github.com/H33ai-postquantum/scif-backend
cd scif-backend

# Run the benchmark (release profile)
cargo run --example agent_bench --release
