The Problem: Simulated Proofs in a Real Pipeline
Every authentication in the H33 pipeline passes through three stages: Fully Homomorphic Encryption for biometric matching on encrypted data, a Zero-Knowledge Proof for verification without data exposure, and a post-quantum Dilithium signature for attestation. Two of those three stages have always been real cryptography. The ZKP stage was not.
Prior to v10.0, the zero-knowledge step computed a SHA3-256 hash of the biometric comparison result and called it a "proof." A verifier would recompute the same hash and compare bits. There were no algebraic constraints, no polynomial commitments, no proximity proofs. The verification was sound in the sense that it detected tampering, but it carried none of the mathematical guarantees that define a zero-knowledge proof system.
H33 v10.0 replaces the simulated ZKP module entirely with a production STARK prover and verifier. The new system generates real proofs with algebraic constraints over the BLS12-381 scalar field, Fiat-Shamir transcript binding, FRI polynomial commitments, and Poseidon hash chains. Every biometric comparison is now provably correct under the STARK security model.
What the STARK Proves
The core operation in biometric authentication is cosine similarity: given two embedding vectors (enrolled and fresh), compute their dot product and normalize by their magnitudes. The STARK proof system enforces that this computation was performed correctly through an Algebraic Intermediate Representation (AIR) with seven columns and five transition constraints.
Execution Trace: 7-Column Layout
Each row of the execution trace represents one step of the biometric computation. For a 128-dimensional embedding, the trace has 131 rows (128 dimension rows plus 3 final assertion rows), padded to the next power of two, which is 256 rows.
| Column | Name | Purpose |
|---|---|---|
| 0 | enrolled_i | Quantized enrolled embedding component |
| 1 | fresh_i | Quantized fresh embedding component |
| 2 | dot_acc | Running dot product accumulator |
| 3 | norm_a_acc | Running squared norm of enrolled vector |
| 4 | norm_b_acc | Running squared norm of fresh vector |
| 5 | poseidon_state | Poseidon hash chain for cryptographic binding |
| 6 | step | Step counter (0 through D-1) |
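The row count for the trace can be sketched in a few lines. This is a minimal illustration, not the H33 code: the function names are hypothetical, and the three assertion rows are taken from the description above.

```rust
/// Rows before padding: one per embedding dimension plus the final
/// assertion rows (3 in the layout described in the text).
fn unpadded_rows(dimension: usize, assertion_rows: usize) -> usize {
    dimension + assertion_rows
}

/// Rows after padding up to the next power of two.
fn padded_rows(dimension: usize, assertion_rows: usize) -> usize {
    unpadded_rows(dimension, assertion_rows).next_power_of_two()
}
```

Calling `padded_rows(128, 3)` gives the trace length for the 128-dimensional case.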
Five Transition Constraints
At every row i where i < D (the embedding dimension), five algebraic constraints must hold simultaneously:
Algebraic Constraints (enforced at every step)
- Dot product accumulation: dot_acc[i+1] = dot_acc[i] + enrolled_i[i] * fresh_i[i]
- Enrolled norm accumulation: norm_a_acc[i+1] = norm_a_acc[i] + enrolled_i[i]²
- Fresh norm accumulation: norm_b_acc[i+1] = norm_b_acc[i] + fresh_i[i]²
- Step counter: step[i+1] = step[i] + 1
- Poseidon binding: poseidon_state[i+1] = Poseidon(poseidon_state[i], enrolled_i[i], fresh_i[i])
The boundary constraints initialize all accumulators to zero at row 0 and verify the final accumulated values at row D against the claimed public inputs. The Poseidon hash chain in column 5 binds every input component into a single commitment, preventing any substitution of embedding values after the proof is generated.
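To make the transition and boundary constraints concrete, here is a rough sketch that builds a seven-column trace and checks it row by row. Several things are assumptions for illustration only: the modulus is a small toy prime rather than the BLS12-381 scalar field, `toy_hash` is a placeholder and not Poseidon, and all function names are hypothetical. A real STARK prover checks these relations on polynomials, not by scanning rows.

```rust
// Toy prime modulus (2^61 - 1), NOT the BLS12-381 scalar field.
const P: u64 = (1 << 61) - 1;

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Placeholder for the Poseidon step: a simple x^5 mix. Not Poseidon.
fn toy_hash(state: u64, a: u64, b: u64) -> u64 {
    let x = add(add(state, mul(a, 3)), add(mul(b, 5), 1));
    let x2 = mul(x, x);
    mul(mul(x2, x2), x)
}

/// Columns: enrolled_i, fresh_i, dot_acc, norm_a_acc, norm_b_acc,
/// poseidon_state, step.
type Row = [u64; 7];

/// Builds D computation rows plus one final row holding the results.
fn build_trace(enrolled: &[u64], fresh: &[u64]) -> Vec<Row> {
    assert_eq!(enrolled.len(), fresh.len());
    let d = enrolled.len();
    let mut trace = Vec::with_capacity(d + 1);
    let (mut dot, mut na, mut nb, mut h) = (0u64, 0u64, 0u64, 0u64);
    for i in 0..d {
        trace.push([enrolled[i], fresh[i], dot, na, nb, h, i as u64]);
        dot = add(dot, mul(enrolled[i], fresh[i]));
        na = add(na, mul(enrolled[i], enrolled[i]));
        nb = add(nb, mul(fresh[i], fresh[i]));
        h = toy_hash(h, enrolled[i], fresh[i]);
    }
    // Row D: final accumulators; the two input columns are unused here.
    trace.push([0, 0, dot, na, nb, h, d as u64]);
    trace
}

/// Checks the five transition constraints between consecutive rows.
fn transitions_hold(trace: &[Row]) -> bool {
    trace.windows(2).all(|w| {
        let (cur, next) = (w[0], w[1]);
        next[2] == add(cur[2], mul(cur[0], cur[1]))
            && next[3] == add(cur[3], mul(cur[0], cur[0]))
            && next[4] == add(cur[4], mul(cur[1], cur[1]))
            && next[6] == add(cur[6], 1)
            && next[5] == toy_hash(cur[5], cur[0], cur[1])
    })
}

/// Checks row 0 starts at zero and the last row matches the public values.
fn boundaries_hold(trace: &[Row], dot: u64, na: u64, nb: u64, commitment: u64) -> bool {
    match (trace.first(), trace.last()) {
        (Some(first), Some(last)) => {
            first[2..=6].iter().all(|&v| v == 0)
                && last[2] == dot
                && last[3] == na
                && last[4] == nb
                && last[5] == commitment
        }
        _ => false,
    }
}
```

Changing any single accumulator cell in the trace makes `transitions_hold` return false, which is the row-level analogue of what the constraint polynomials enforce.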
Seven Public Inputs
The proof exposes seven public values that the verifier checks without seeing any private biometric data:
| # | Public Input | Description |
|---|---|---|
| 1 | match_result | Boolean: did similarity exceed threshold? |
| 2 | threshold_bps | Similarity threshold in basis points |
| 3 | poseidon_commitment | Final Poseidon hash over all input components |
| 4 | dimension | Embedding dimension (128) |
| 5 | final_dot | Final dot product accumulator value |
| 6 | final_norm_a | Final enrolled squared norm |
| 7 | final_norm_b | Final fresh squared norm |
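The match assertion uses cross-multiplication to avoid division and square roots in the field: dot² × SCALE² ≥ threshold_bps² × norm_a × norm_b. Reading SCALE as the basis-point denominator (10,000), this is the squared form of cos ≥ threshold_bps / SCALE. Squaring discards the sign, so the comparison is only meaningful when the dot product is non-negative. All arithmetic is performed over the BLS12-381 scalar field (~255-bit prime).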
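The cross-multiplied comparison can be sketched over plain integers. This is an illustration under stated assumptions: `SCALE` is taken to be 10,000 (the basis-point denominator), the function name and signature are hypothetical, and the check runs on machine integers, not field elements.

```rust
/// Assumed basis-point denominator used as SCALE in the inequality.
const SCALE: u128 = 10_000;

/// Returns Some(true) when dot / sqrt(norm_a * norm_b) >= threshold_bps / SCALE,
/// evaluated without division or square roots. Returns None on overflow.
fn is_match(dot: i64, norm_a: u64, norm_b: u64, threshold_bps: u64) -> Option<bool> {
    // Squaring would hide a negative dot product, so reject it up front.
    if dot < 0 {
        return Some(false);
    }
    let d = dot as u128;
    let lhs = d.checked_mul(d)?.checked_mul(SCALE * SCALE)?;
    let t = threshold_bps as u128;
    let rhs = (t * t)
        .checked_mul(norm_a as u128)?
        .checked_mul(norm_b as u128)?;
    Some(lhs >= rhs)
}
```

For example, `dot = 1`, `norm_a = 1`, `norm_b = 2` corresponds to a cosine of about 0.707, so it passes a 7,000 bps threshold and fails an 8,000 bps one.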
Proof Infrastructure
The STARK proof relies on three cryptographic subsystems that were already present in the H33 codebase and are now wired into the production pipeline:
- FRI (Fast Reed-Solomon IOP of Proximity): Polynomial commitment scheme that proves the constraint composition polynomial has low degree. Uses Merkle tree commitments with SHA3-256 and multiple rounds of folding.
- Fiat-Shamir Transcript: Converts the interactive proof protocol into a non-interactive one. The prover and verifier derive identical random challenges from a shared transcript of commitments.
- Poseidon Hash: A ZK-friendly hash function used for the binding chain in column 5. Efficient inside arithmetic circuits compared to SHA-family hashes.
The entire proof system is post-quantum secure. STARKs rely on collision-resistant hash functions (SHA3-256) rather than elliptic curve pairings or discrete log assumptions. There is no trusted setup ceremony.
Benchmark Methodology
All measurements were taken on March 9, 2026 on a single AWS c8g.metal-48xl instance (192 vCPU, 377 GiB RAM, AWS Graviton4 Neoverse V2). The operating system was Amazon Linux 2023 with the system allocator (not jemalloc). The Rust toolchain was stable with --release optimizations.
Single-thread STARK benchmarks used Criterion.rs v0.5 with 100+ iterations and 5-second measurement windows. Multi-threaded sustained throughput was measured over a continuous 120-second window with per-second granularity. All numbers reported are from the 120-second sustained measurement unless otherwise noted.
STARK Proof Performance (Single-Thread)
| Operation | Latency |
|---|---|
| STARK Generate (128-dim biometric, cold) | 68.093052ms |
| STARK Verify (raw, no cache) | 14.366931ms |
| Cache Cold Miss | 14.400565ms |
| Cache Hot Hit | 1.159µs |
Cold proof generation takes 68.093052ms. This is a one-time cost per unique biometric comparison. Once a proof result is cached, subsequent lookups for the same comparison return in 1.159µs — a speedup of 58,751× over raw generation.
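The cold-versus-hot behavior follows the usual compute-once, look-up-afterwards pattern. The sketch below is a stand-in using only the standard library (`RwLock<HashMap>`) and is not the cache used in the pipeline. The key type, struct name, and method name are all hypothetical, and how H33 identifies a "unique biometric comparison" is not specified here.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical cache key: a 32-byte identifier for one comparison.
type ComparisonId = [u8; 32];

/// Caches the verdict of an expensive prove-and-verify step.
struct VerifiedProofCache {
    entries: RwLock<HashMap<ComparisonId, bool>>,
}

impl VerifiedProofCache {
    fn new() -> Self {
        Self { entries: RwLock::new(HashMap::new()) }
    }

    /// Returns the cached verdict on a hit. On a miss, runs
    /// `prove_and_verify` once, stores the result, and returns it.
    fn get_or_verify<F: FnOnce() -> bool>(&self, id: ComparisonId, prove_and_verify: F) -> bool {
        // The read guard is released at the end of this statement.
        let cached = self.entries.read().unwrap().get(&id).copied();
        if let Some(hit) = cached {
            return hit;
        }
        let verdict = prove_and_verify();
        self.entries.write().unwrap().insert(id, verdict);
        verdict
    }

    fn len(&self) -> usize {
        self.entries.read().unwrap().len()
    }
}
```

Two threads that miss on the same key at the same moment would both run the closure in this version. A production cache would need to decide whether that duplicate work is acceptable.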
Full Pipeline (120-Second Sustained)
| Pipeline Stage | Latency | % of Total |
|---|---|---|
| FHE Batch (32 users, BFV inner product) | 939µs | 76.2% |
| Dilithium Batch Attestation (sign + verify) | 291µs | 23.6% |
| ZKP STARK (DashMap cached, 0.059µs/lookup) | 1.9µs | 0.2% |
| Full Batch Total (32 users) | 1,232µs | 100% |
| Per-Auth (amortized) | 38.5µs | N/A |
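The table's totals, percentages, and amortized figure follow from simple arithmetic on the three stage latencies. A small sketch, with hypothetical function names, for readers who want to recompute them:

```rust
/// Sum of per-stage latencies for one batch, in microseconds.
fn batch_total_us(stages_us: &[f64]) -> f64 {
    stages_us.iter().sum()
}

/// Share of the batch total taken by one stage, as a percentage.
fn share_percent(stage_us: f64, total_us: f64) -> f64 {
    stage_us / total_us * 100.0
}

/// Latency per authentication when the batch cost is spread over its users.
fn per_auth_us(total_us: f64, batch_size: u32) -> f64 {
    total_us / batch_size as f64
}
```

With the stage values 939, 291, and 1.9 and a batch size of 32, these functions reproduce the rounded figures in the table.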
Sustained Throughput
| Metric | Value |
|---|---|
| Sustained (120s) | 2,172,518 auth/sec |
| Peak Second | 2,190,496 auth/sec |
| Low Second | 2,159,776 auth/sec |
| Variance | ±0.71% |
| Total in 60s | 130,351,080 authentications |
| Cache Hit Rate | 100% (after warmup) |
| Cache Entries | 3,072 |
Variance Collapse: ±6% to ±0.71%
The most significant outcome of v10.0 is not the headline throughput number — it is the variance. The previous benchmark (v9.0, March 5, 2026) measured ±6% variance over 120 seconds, with sustained throughput of 1,714,496 auth/sec against a peak of 2,154,351. The gap between peak and sustained was caused by thermal throttling under continuous full-load computation on bare metal.
v10.0 sustained throughput of 2,172,518 auth/sec exceeds the v9.0 peak. The variance collapsed from ±6% to ±0.71%, meaning the system now runs at near-identical throughput from the first second to the last. The peak-to-low spread across the entire 120-second window was 30,720 auth/sec (2,190,496 high, 2,159,776 low).
The sustained throughput improvement is +26.71% (2,172,518 vs 1,714,496). At production scale, this translates to 130,351,080 authentications in 60 seconds versus 102,869,760, an additional 27,481,320 authentications per minute.
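The variance, improvement, and per-minute figures can be recomputed from the raw throughput numbers. The sketch below assumes the ± figure is half the peak-to-low spread divided by the sustained rate. That definition is an inference from the published numbers, not something the benchmark output states, and the function names are hypothetical.

```rust
/// Half of the peak-to-low spread, relative to the sustained rate, in percent.
/// Assumed definition of the reported variance.
fn half_spread_percent(peak: u64, low: u64, sustained: u64) -> f64 {
    (peak - low) as f64 / 2.0 / sustained as f64 * 100.0
}

/// Relative improvement of `new` over `old`, in percent.
fn improvement_percent(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Total authentications at a constant rate over a window.
fn total_over(rate_per_sec: u64, seconds: u64) -> u64 {
    rate_per_sec * seconds
}
```

Feeding in the v10.0 peak, low, and sustained rates, plus the v9.0 sustained rate, reproduces the spread, the relative gain, and the 60-second totals quoted in this section.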
Comparison: H33 v10.0 vs Microsoft SEAL
| Metric | H33 v10.0 | Microsoft SEAL | Ratio |
|---|---|---|---|
| Single-Thread Batch (32 users) | 1.232ms | 2.85ms | 2.3× |
| Per-Auth (amortized) | 38.5µs | ~89µs | 2.3× |
| Sustained (120s) | 2,172,518 | ~92,000 | 23.6× |
| PQ Signatures | Dilithium (ML-DSA-65) | None | Included |
| ZK Proofs | Real STARK | None | Included |
| Variance | ±0.71% | N/A | Production-grade |
H33's full pipeline — FHE, STARK proof, and Dilithium attestation combined — runs 2.3× faster single-threaded and 23.6× faster at production scale than SEAL's FHE-only operation. SEAL does not include zero-knowledge proofs or post-quantum signatures.
What the Proof Guarantees
A verified STARK proof from H33 v10.0 provides the following guarantees to any third party, without exposing any biometric data:
- The dot product of the enrolled and fresh vectors was computed correctly over all 128 dimensions
- The squared norms of both vectors were accumulated correctly
- The cosine similarity threshold comparison used the correct accumulated values
- Every input component is bound to the Poseidon commitment chain — no values were substituted
- The step counter verifies that exactly D dimensions were processed
- The proof was generated non-interactively via Fiat-Shamir (no prover-verifier communication)
- Security is post-quantum (hash-based, no elliptic curve assumptions)
Version History
| Version | Date | Sustained Auth/Sec | Variance | ZKP |
|---|---|---|---|---|
| v7.0 | Feb 14, 2026 | 1,148,018 | ±5% | Simulated (SHA3) |
| v8.0 | Feb 26, 2026 | 1,595,071 | N/A | Simulated (SHA3) |
| v9.0 | Mar 5, 2026 | 1,714,496 | ±6% | Simulated (SHA3) |
| v10.0 | Mar 9, 2026 | 2,172,518 | ±0.71% | Real STARK |
Reproducing the Results
The H33 benchmark suite is deterministic. To reproduce these measurements:
# STARK proof generate + verify (single-thread)
cargo test --release --lib -- zkp::stark::biometric_proof::tests::test_benchmark_stark_proof --nocapture
# Full pipeline (multi-threaded, 120s sustained)
CACHEE_MODE=inprocess cargo run --release --example graviton4_bench
The benchmark requires AWS c8g.metal-48xl (or equivalent 192 vCPU ARM hardware) for the sustained throughput measurement. Single-thread STARK timings are reproducible on any ARM64 system.