You want production-grade biometric authentication with FHE, zero-knowledge proofs, and post-quantum signatures. There are two paths: build it yourself from open-source libraries, or use H33. One path takes 200-900ms per authentication. The other takes 1.28ms.
This post walks through the DIY path, component by component, and explains where the time goes.
The Shopping List
To build what H33's CollectiveAuthority pipeline does, you need six components. Each one is a separate library with its own API, its own data formats, and its own performance characteristics.
The DIY Bill of Materials (6 libraries)
Every library boundary is a cost center. Every time data crosses from one library to another, you pay a serialization tax: encode the output of library A into bytes, decode them into library B's format, validate, allocate new memory. These boundaries add up.
Where the Time Actually Goes
Layer 1: FHE (2.85ms)
SEAL's BFV implementation works. Encrypt a biometric template (680µs), compute Euclidean distance on the ciphertext (1,530µs), relinearize (490µs), decrypt the result (150µs). Total: 2.85ms on server hardware. Multiply + relinearize is 71% of SEAL's time.
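As a quick sanity check, the component numbers above do add up. The figures below are quoted from this post, not re-measured:

```rust
// Sanity-check the SEAL latency budget quoted above (microseconds).
// These figures come from the post, not from a live benchmark.
fn bfv_total_us() -> u32 {
    let (encrypt, multiply, relin, decrypt) = (680, 1_530, 490, 150);
    encrypt + multiply + relin + decrypt // 2,850 µs = 2.85 ms
}

fn mul_relin_share_pct() -> u32 {
    // Multiply + relinearize as a share of the whole pipeline.
    (1_530 + 490) * 100 / bfv_total_us() // 70 (the post rounds to 71%)
}

fn main() {
    println!("total: {} µs", bfv_total_us());
    println!("mul+relin share: ~{}%", mul_relin_share_pct());
}
```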
H33 does the same FHE operations plus threshold decryption, ZK proofs, and PQ signatures in 1.28ms—2.2× faster single-thread. The difference comes from a single modulus Q=56 (vs SEAL's multi-prime Q=109), Montgomery NTT with Harvey lazy reduction (no division in the hot path), no relinearization (the auth circuit is shallow), and batch CBD sampling (5× faster noise generation). At production scale with SIMD batching: 12.5× faster.
Layer 2: Threshold Decryption (~50-100ms)
SEAL doesn't do threshold decryption. The ciphertext decrypts with a single key. For production authentication, that's a non-starter—a single-key architecture means one compromised node exposes every user.
You need Shamir secret sharing with a k-of-n threshold. Building this from scratch means:
- Splitting the secret key into n shares at setup
- Coordinating k decryption parties at runtime
- Combining partial decryptions with Lagrange interpolation
- Validating each share's contribution
A clean implementation in C++ or Rust takes 50-100ms. Most of that time goes to coordination overhead, not the math. H33's integrated threshold runs in ~330µs of partial decryption plus ~75µs to combine, because there's no IPC, no serialization, and share validation is fused with the combination step.
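To make the moving parts concrete, here is a minimal k-of-n Shamir sketch over a small prime field. It is illustrative only: a production threshold-FHE scheme shares an RLWE secret key and combines partial decryptions rather than reconstructing the raw secret, and it needs a cryptographic RNG and constant-time field arithmetic.

```rust
// Minimal Shamir k-of-n sketch over the prime field p = 2^31 - 1.
// Illustrative only: not constant-time, no secure RNG, and it recombines
// the raw secret rather than partial decryptions.
const P: u64 = 2_147_483_647; // 2^31 - 1, prime

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    b %= P;
    while e > 0 {
        if e & 1 == 1 { acc = acc * b % P; }
        b = b * b % P;
        e >>= 1;
    }
    acc
}
fn inv_mod(a: u64) -> u64 { pow_mod(a, P - 2) } // Fermat inverse

// Evaluate the sharing polynomial secret + c1*x + c2*x^2 + ... at x.
fn share(secret: u64, coeffs: &[u64], x: u64) -> u64 {
    let mut acc = 0;
    for &c in coeffs.iter().rev() {
        acc = (acc * x + c) % P;
    }
    (acc * x + secret) % P
}

// Lagrange interpolation at x = 0 recovers the secret from any k shares.
fn combine(points: &[(u64, u64)]) -> u64 {
    let mut secret = 0;
    for (i, &(xi, yi)) in points.iter().enumerate() {
        let (mut num, mut den) = (1, 1);
        for (j, &(xj, _)) in points.iter().enumerate() {
            if i != j {
                num = num * (P - xj % P) % P;        // (0 - xj)
                den = den * ((P + xi - xj) % P) % P; // (xi - xj)
            }
        }
        secret = (secret + yi * num % P * inv_mod(den)) % P;
    }
    secret
}

fn main() {
    let secret = 123_456_789;
    let coeffs = [987_654, 42]; // degree 2 => threshold k = 3
    let shares: Vec<(u64, u64)> =
        (1..=5).map(|x| (x, share(secret, &coeffs, x))).collect();
    assert_eq!(combine(&shares[0..3]), secret); // any 3 of 5 recover it
    assert_eq!(combine(&shares[2..5]), secret);
    println!("recovered: {}", combine(&shares[1..4]));
}
```

The coordination overhead the post describes is everything around this math: getting k parties to produce their contributions and transporting them, which is exactly what a fused in-process pipeline avoids.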
Layer 3: Zero-Knowledge Proof (~50-200ms)
You need to prove the computation was correct without revealing the inputs. Off-the-shelf options:
- Groth16 (snarkjs): ~100-300ms proof generation for a simple circuit
- PLONK (various): ~80-200ms depending on implementation
- RISC Zero: ~500ms-2s for general programs
H33 uses a Circle STARK over the M31 field with Poseidon2 hashing. Proving takes <20ms (async). Verification takes 2.09ns (cached). The proof is ~180KB. This is possible because the circuit is purpose-built for authentication—it's not a general-purpose VM executing arbitrary programs.
Layer 4: Post-Quantum Signatures (~1-5ms)
liboqs provides reference implementations of NIST post-quantum algorithms. Dilithium sign + verify through liboqs typically runs 1-5ms depending on compilation flags and platform.
H33's native Rust Dilithium runs in 238µs end-to-end (7.4µs batch-amortized). That's 4-21× faster than liboqs, primarily because H33's implementation shares the FHE layer's Montgomery NTT infrastructure: shared twiddle tables, shared SIMD paths, shared memory pools.
Layer 5: Attestation Chain (~5-20ms)
You also need a SHA3 attestation chain that ties the FHE result, ZK proof, and PQ signature together into a single verifiable bundle. This is pure glue code, but it involves hashing, serialization, and typically JSON or protobuf encoding for the attestation record.
In a DIY stack, this is where you discover that SEAL outputs seal::Ciphertext objects, your ZK library outputs Proof structs, and liboqs outputs raw byte arrays. Making them talk to each other is the tax you pay for integration.
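The chaining itself is simple. In the sketch below, Rust's std `DefaultHasher` is purely a stand-in for SHA3-256 (the standard library ships no SHA3), and the artifact bytes and fold order are invented for illustration; a real chain needs a collision-resistant hash and a canonical byte encoding.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// One link of the chain: bind the running digest to the next artifact.
// DefaultHasher stands in for SHA3-256 here; do not use it for security.
fn link(prev: u64, artifact: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);     // bind this record to the previous one
    artifact.hash(&mut h); // bind it to the artifact's bytes
    h.finish()
}

fn main() {
    // The three artifacts the post names, as opaque byte blobs.
    let fhe_result: &[u8] = b"fhe-ciphertext-bytes";
    let zk_proof: &[u8] = b"stark-proof-bytes";
    let pq_signature: &[u8] = b"dilithium-signature-bytes";

    // Fold them into one chained head; a verifier replaying the same
    // bytes in the same order reproduces the same head.
    let mut head = 0u64;
    for artifact in [fhe_result, zk_proof, pq_signature] {
        head = link(head, artifact);
    }
    println!("attestation head: {:016x}", head);
}
```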
Layer 6: Biometric Encoding (~50-500ms)
Before any crypto happens, you need to encode the raw biometric signal into a fixed-dimension vector suitable for FHE computation. This includes normalization, quantization, and quality checks. Off-the-shelf face encoding pipelines (FaceNet, ArcFace) run 50-500ms depending on model size and hardware.
H33's encoding pipeline runs in ~300µs because it operates on pre-extracted embeddings with a purpose-built quantization scheme matched to the BFV plaintext space.
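A minimal sketch of that quantization step, with an invented scale factor and a common BFV plaintext prime; H33's real parameters are not published in this post.

```rust
// Hypothetical quantizer mapping a unit-normalized f32 embedding into a
// BFV plaintext space. SCALE (1024) is invented for illustration;
// 65,537 is a commonly used BFV plaintext prime.
const SCALE: f32 = 1024.0;
const PLAIN_MODULUS: i64 = 65_537;

fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    v.iter().map(|x| x / norm).collect()
}

// Each coordinate lands in [-SCALE, SCALE], then is reduced into [0, t)
// so the encrypted Euclidean distance stays inside the plaintext modulus.
fn quantize(v: &[f32]) -> Vec<i64> {
    l2_normalize(v)
        .iter()
        .map(|x| ((x * SCALE).round() as i64).rem_euclid(PLAIN_MODULUS))
        .collect()
}

fn main() {
    let embedding = [0.6f32, 0.8, 0.0];
    println!("{:?}", quantize(&embedding)); // [614, 819, 0]
}
```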
The Integration Tax
The individual library latencies explain maybe half the gap. The other half is integration overhead:
| Overhead Source | DIY Cost | H33 Cost |
|---|---|---|
| Serialization between libraries | 5-15ms | 0 (fused pipeline) |
| Memory allocation / copying | 3-10ms | ~50µs (arena pools) |
| Format conversion | 2-8ms | 0 (native types) |
| Error handling / retry logic | 1-5ms | 0 (single Result chain) |
| Thread coordination | 2-10ms | ~10µs (Rayon work-stealing) |
| Total integration overhead | 13-48ms | <100µs |
In a fused pipeline, data stays in the same memory space, in the same type system, in the same thread pool. There are no serialization boundaries because there's nothing to serialize—the output of BFV encrypt is already in the format that FHE distance expects.
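The point is easiest to see in code. In the toy pipeline below every type is an invented stand-in (this is not H33's API): each stage consumes the previous stage's native output, so there is nothing to serialize, and a single `?` chain replaces per-library error handling.

```rust
// Toy fused pipeline. All types and functions are illustrative stand-ins,
// not H33's API: the point is that each stage's output type IS the next
// stage's input type, so no encode/decode boundary ever appears.
struct Ciphertext(Vec<u64>);
struct DistanceCt(Vec<u64>);
struct Attested(Vec<u64>);

#[derive(Debug)]
struct AuthError(&'static str);

fn encrypt(template: &[u64]) -> Result<Ciphertext, AuthError> {
    Ok(Ciphertext(template.to_vec())) // placeholder for BFV encrypt
}
fn fhe_distance(ct: Ciphertext) -> Result<DistanceCt, AuthError> {
    Ok(DistanceCt(ct.0)) // placeholder: real code squares/sums in the ring
}
fn attest(d: DistanceCt) -> Result<Attested, AuthError> {
    Ok(Attested(d.0)) // placeholder for the attestation chain
}

// One function, one error type, zero serialization boundaries.
fn authenticate(template: &[u64]) -> Result<Attested, AuthError> {
    attest(fhe_distance(encrypt(template)?)?)
}

fn main() {
    let out = authenticate(&[1, 2, 3]).expect("auth failed");
    println!("attested {} coefficients", out.0.len());
}
```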
What You Give Up Going DIY
Beyond performance, the DIY path has structural gaps:
Missing from the DIY stack
- No unified security proof. Each library has its own security model. The composition may have gaps.
- No shared NTT infrastructure. SEAL's NTT, liboqs's NTT, and your ZK library's NTT are three separate implementations doing the same math.
- No production hardening. Side-channel resistance, constant-time operations, and fault injection detection are per-library concerns.
- No upgrade path. When NIST finalizes ML-DSA-87, you update liboqs. When SEAL releases 5.0, you update SEAL. Version compatibility is your problem.
- 6+ dependencies to audit. Each library has its own CVE surface. Supply chain risk multiplies.
The Math
At the best case (200ms DIY, well-optimized):
200ms ÷ 1.28ms = 156.3x slower
Throughput: 5 auth/sec (DIY) vs — (H33)
Daily: 432K auths (DIY) vs billions/day (H33)
Annual server cost at 10M auths/day:
DIY: ~24 servers × $2.40/hr × 24h ≈ $1,382/day
H33: 2 servers × $2.40/hr × 24h ≈ $115/day
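The cost arithmetic checks out; server counts and the $2.40/hr rate are the post's figures, and integer division drops the cents:

```rust
// Reproduce the per-day server-cost arithmetic above.
// Rate is expressed in cents/hour to keep the math in integers.
fn daily_cost_usd(servers: u32, hourly_rate_cents: u32) -> u32 {
    servers * hourly_rate_cents * 24 / 100
}

fn main() {
    println!("DIY: ${}/day", daily_cost_usd(24, 240)); // 24 × $2.40 × 24h
    println!("H33: ${}/day", daily_cost_usd(2, 240));  //  2 × $2.40 × 24h
}
```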
At the worst case (900ms DIY, reference implementations):
900ms ÷ 1.28ms = 703.1x slower
Throughput: 1.1 auth/sec (DIY) vs — (H33)
Daily: 95K auths (DIY) vs billions/day (H33)
Bottom line
H33 is 156-703× faster than a DIY equivalent stack while providing stronger security guarantees (unified security proof, shared constant-time NTT, single audit surface) and costing a fraction of the infrastructure.
When DIY Makes Sense
To be fair: if you only need FHE without ZK proofs, without post-quantum signatures, and without threshold decryption, then SEAL is a fine library. If you're doing research on new FHE schemes, SEAL is the right tool.
But if you're building production authentication—where you need the complete security stack running at production throughput—the 200-900ms penalty of bolting libraries together isn't a performance problem. It's an architecture problem. And architecture problems don't get fixed with faster hardware.
Skip the Integration
One API call. Seven cryptographic operations. 1.28ms. Get your free API key and see the difference.
Get Free API Key