Key Pool Architecture

Post-quantum key generation is expensive. ML-KEM (Kyber) keygen takes 33.3µs. Dilithium keygen takes 36.6µs. For high-throughput applications processing millions of requests per second, these microseconds add up fast: at one million requests per second, a 36.6µs keygen per request consumes about 37 cores' worth of CPU time.

Our solution: pre-generated key pools. By generating keys in advance during idle time, we reduce effective keygen latency to sub-microsecond levels, a 79x to 104x improvement.

Signature Pool Speedup: 104x
Dilithium Pool Speedup: 96x
ML-KEM Pool Speedup: 79x

The Problem with On-Demand Keygen

Consider a typical authentication flow:

User initiates authentication
Generate ephemeral Dilithium keypair (36.6µs)
Sign challenge (45µs)
Verify signature (36.9µs)
Generate ML-KEM keypair for session (33.3µs)
Establish encrypted channel (72.6µs)

Key generation alone accounts for 69.9µs of the 224.4µs total, about 31% of a complete authentication. At scale, this becomes the bottleneck. In H33's production pipeline, every authentication must complete BFV fully homomorphic encryption over 128-dimensional biometric vectors, a ZKP lookup verification, and a Dilithium attestation signature, all in ~42µs per user. There is no budget for on-demand keygen in that hot path. See our production benchmarks for the full throughput analysis.
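The keygen share follows directly from the per-step latencies in the flow above. A small sketch of the arithmetic, where `STEPS_US` and `keygen_share` are hypothetical names chosen for illustration:

```rust
/// Per-step latencies from the flow above, in microseconds.
/// The boolean marks steps that are pure key generation.
const STEPS_US: [(&str, f64, bool); 5] = [
    ("Dilithium keygen", 36.6, true),
    ("Sign challenge", 45.0, false),
    ("Verify signature", 36.9, false),
    ("ML-KEM keygen", 33.3, true),
    ("Establish channel", 72.6, false),
];

/// Returns (keygen microseconds, total microseconds, keygen fraction of total).
fn keygen_share(steps: &[(&str, f64, bool)]) -> (f64, f64, f64) {
    let total: f64 = steps.iter().map(|s| s.1).sum();
    let keygen: f64 = steps.iter().filter(|s| s.2).map(|s| s.1).sum();
    (keygen, total, keygen / total)
}
```

With these figures, `keygen_share(&STEPS_US)` gives 69.9µs of keygen out of 224.4µs in total, a share of roughly 0.31.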

Key Insight

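Post-quantum key generation is dominated by lattice sampling: expanding a public matrix from a seed, sampling short secret vectors (from a centered binomial distribution in ML-KEM, from a bounded uniform distribution in Dilithium), then applying Number Theoretic Transforms. These operations are CPU-intensive but independent of user identity, which means they can be pre-computed without loss of security as long as the pooled keys are protected and used only once.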

How Key Pools Work

Cryptographic key generation doesn't need user-specific input. Keys are derived from fresh randomness alone, so we can generate them in advance.

// Traditional approach: generate on demand
let keypair = dilithium::keygen();  // 36.6µs

// Pool approach: grab pre-generated key
let keypair = key_pool.acquire();   // 0.35µs

The acquire() call is a single atomic pop from a lock-free MPMC (multi-producer, multi-consumer) queue. No mutex contention, no allocation, no lattice sampling. In the uncontended case the cost is one CAS (compare-and-swap) instruction plus a cache-line transfer; under contention a failed CAS is retried, but the call still typically completes in under 400 nanoseconds across 96 Graviton4 cores.
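The acquire path can be sketched with the standard library alone. To stay dependency-free, this sketch swaps the lock-free MPMC queue for a mutex-guarded `VecDeque`, so it shows the interface rather than the lock-free behavior. `Keypair`, `slow_keygen`, and the fall-back to on-demand generation when the pool is empty are assumptions for illustration; the article does not say how an empty pool is handled.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Placeholder for a real keypair type (hypothetical; not a real Dilithium key).
#[derive(Debug, PartialEq)]
struct Keypair(u64);

/// Stand-in for the expensive on-demand keygen path (hypothetical).
fn slow_keygen(seed: u64) -> Keypair {
    Keypair(seed)
}

/// Std-only stand-in for the pool. A production pool would use a
/// lock-free MPMC queue here instead of a mutex-guarded deque.
struct KeyPool {
    queue: Mutex<VecDeque<Keypair>>,
}

impl KeyPool {
    /// Builds a pool from keys that were generated ahead of time.
    fn with_keys(keys: Vec<Keypair>) -> Self {
        KeyPool { queue: Mutex::new(keys.into()) }
    }

    /// Fast path: pop a pre-generated key; None if the pool is empty.
    fn acquire(&self) -> Option<Keypair> {
        self.queue.lock().unwrap().pop_front()
    }

    /// Pop from the pool, or fall back to on-demand keygen if it is empty.
    fn acquire_or_generate(&self, seed: u64) -> Keypair {
        self.acquire().unwrap_or_else(|| slow_keygen(seed))
    }
}
```

Because `acquire` moves the key out of the queue, Rust's ownership rules already guarantee that no two callers receive the same keypair.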

Our pool architecture:

Background generation: Dedicated threads continuously generate keys during idle periods
Lock-free queues: MPMC queues allow concurrent access without blocking
Adaptive sizing: Pool size adjusts based on request rate and key consumption
Secure memory: Keys are stored in mlock'd memory, never swapped to disk

Pool Internals: Replenishment Strategy

A naive pool risks two failure modes: exhaustion under burst traffic, and wasted CPU cycles pre-generating keys that expire unused. H33 solves both with a watermark-based replenishment strategy.

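use std::sync::Arc;
use std::time::Duration;
use crossbeam_queue::ArrayQueue; // assumed source of ArrayQueue (crossbeam-queue crate)

struct KeyPool<K> {
    queue: Arc<ArrayQueue<K>>,  // lock-free bounded queue
    high_watermark: usize,       // refill target: stop generating once reached
    low_watermark: usize,        // trigger urgent replenishment
    ttl: Duration,               // max key age before discard (expiry not shown here)
}

// Background replenisher (per-core pinned thread)
fn replenish_loop(pool: &KeyPool<DilithiumKeypair>) {
    loop {
        if pool.queue.len() < pool.low_watermark {
            // Urgent: generate in a tight loop until the refill target is reached
            while pool.queue.len() < pool.high_watermark {
                let kp = dilithium::keygen(); // 36.6µs each
                if pool.queue.push(kp).is_err() {
                    break; // queue already full (e.g. another replenisher got there first)
                }
            }
        }
        // Sleep briefly before checking the watermark again
        std::thread::park_timeout(Duration::from_millis(1));
    }
}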

On a 96-core Graviton4 instance, H33 dedicates 4 cores to background key generation. Each core produces approximately 27,300 Dilithium keypairs per second (one keygen every 36.6µs), or about 109,000 keys/second across the four cores. With a pool capacity of 50,000 keys per type and a low watermark at 10,000, the pool holds 40,000 keys of headroom: a burst of 109,000 keys/second would take about 370ms to draw a full pool down to its low watermark, even with no replenishment at all. That is far longer than real-world burst durations at 2.17M auth/sec, where most authentications reuse session keys.
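The burst headroom can be checked with a short calculation. This is a sketch under a simple constant-rate model; `seconds_to_low_watermark` and its parameter names are hypothetical.

```rust
/// Seconds until the pool falls from `capacity` to `low_watermark`
/// when consumers take `demand_per_sec` keys and background threads
/// add `supply_per_sec`. Returns None if supply keeps up with demand.
fn seconds_to_low_watermark(
    capacity: u64,
    low_watermark: u64,
    demand_per_sec: f64,
    supply_per_sec: f64,
) -> Option<f64> {
    let net_drain = demand_per_sec - supply_per_sec;
    if net_drain <= 0.0 {
        return None; // the pool never drains
    }
    let headroom = capacity.saturating_sub(low_watermark) as f64;
    Some(headroom / net_drain)
}
```

With a capacity of 50,000, a low watermark of 10,000, demand of 109,000 keys/second and no replenishment, the result is about 0.367 seconds. When four replenishers at 27,300 keys/second each are running, demand at or below 109,200 keys/second never drains the pool in this model.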

Benchmark Results

January 2026 benchmarks on AWS c8g.metal-48xl (AWS Graviton4, 96 cores):

Operation	Direct	Pool	Speedup
ML-KEM Keygen	33.3 µs	0.42 µs	79x
Dilithium Keygen	36.6 µs	0.38 µs	96x
Signature Pool Keygen	36.6 µs	0.35 µs	104x

Throughput Impact

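Converted to operations per second, the same latencies give the following rates. The 64-core figures equal the pooled per-thread rate multiplied by 64: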

Operation	Single Thread	With Pool	64-Core Max
ML-KEM Keygen	30,030 ops/sec	2.38M ops/sec	152M ops/sec
Dilithium Keygen	27,322 ops/sec	2.63M ops/sec	168M ops/sec
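The figures in this table are consistent with a direct conversion from the measured latencies. A minimal sketch of that conversion, assuming perfectly linear scaling across cores (the function names are illustrative):

```rust
/// Operations per second for one thread at a given per-operation latency.
fn ops_per_sec(latency_us: f64) -> f64 {
    1_000_000.0 / latency_us
}

/// Ideal aggregate rate, assuming perfectly linear scaling across cores.
fn aggregate_ops_per_sec(latency_us: f64, cores: u32) -> f64 {
    ops_per_sec(latency_us) * cores as f64
}
```

For example, `ops_per_sec(33.3)` is about 30,030 and `aggregate_ops_per_sec(0.42, 64)` is about 152 million. Real multi-core scaling depends on queue contention and memory traffic, so the aggregate figure is best read as an upper bound.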

152 Million Keys Per Second

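With key pooling and 64 cores, H33 can serve up to 152 million ML-KEM keypairs per second from the pool. This is a peak acquire rate: sustained throughput is still bounded by how fast the background threads generate keys. For bursty authentication workloads, that peak is enough to cover the needs of virtually any application.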

Integration with the H33 Auth Pipeline

Key pools are not a standalone optimization—they are tightly integrated into H33's full authentication stack. Each authentication request flows through three stages: BFV fully homomorphic encryption (N=4096, batching 32 users per ciphertext), a ZKP STARK lookup for identity verification, and a Dilithium attestation signature. The total pipeline completes in ~42µs per user on Graviton4.

Without key pools, the Dilithium keygen for attestation would add 36.6µs—nearly doubling the per-auth latency. With pools, that cost drops to 0.35µs, keeping key generation below 1% of the total pipeline budget. This is what makes 2.17M authentications per second possible on a single c8g.metal-48xl instance.

Key Insight

Key pooling converts a latency-bound operation (36.6µs synchronous keygen) into a throughput-bound operation (background generation on idle cores). The hot path never waits for lattice sampling, NTT transforms, or CSPRNG draws. It simply pops a pre-built keypair off a lock-free queue.
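Because the work becomes throughput-bound, sizing the background generators reduces to a rate calculation. A rough sketch, assuming each core generates one key at a time at the direct keygen latency (`background_cores_needed` is a hypothetical helper):

```rust
/// Background cores needed to sustain `keys_per_sec`, given the
/// direct keygen latency in microseconds (one key at a time per core).
fn background_cores_needed(keys_per_sec: f64, keygen_latency_us: f64) -> u32 {
    let per_core_rate = 1_000_000.0 / keygen_latency_us;
    (keys_per_sec / per_core_rate).ceil() as u32
}
```

At 36.6µs per Dilithium keygen, sustaining 109,000 keys/second needs 4 cores under this model, which lines up with the 4 dedicated cores described earlier.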

Security Considerations

Pre-generating keys doesn't compromise security:

Keys are ephemeral: Pool keys are used once then destroyed
CSPRNG seeding: Each key uses fresh randomness from the system CSPRNG
Memory protection: Keys are stored in secure, non-swappable memory
No key reuse: Each key is acquired atomically and never returned to the pool

Additionally, pool keys carry a time-to-live (TTL). Any key that remains in the pool beyond its TTL window is discarded and replaced. This bounds the exposure window and ensures that even if an attacker could observe pool state at time t, keys generated before t − TTL are already gone. Combined with mlock() to prevent swap-to-disk and madvise(MADV_DONTDUMP) to exclude keys from core dumps, the pool's memory surface is hardened against leakage through swap files and crash dumps.
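One way to enforce the TTL is to check key age at acquire time. The sketch below is an assumption about how that could look, not a description of H33's implementation: the article does not say whether expiry happens on acquire or in a background sweep, and `Pooled` and `acquire_fresh` are hypothetical names. A plain `VecDeque` stands in for the lock-free queue.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// A pooled key tagged with its creation time (hypothetical wrapper).
struct Pooled<K> {
    key: K,
    created: Instant,
}

/// Pops keys oldest-first, dropping any older than `ttl` as of `now`,
/// and returns the first key that is still fresh.
fn acquire_fresh<K>(
    queue: &mut VecDeque<Pooled<K>>,
    ttl: Duration,
    now: Instant,
) -> Option<K> {
    while let Some(entry) = queue.pop_front() {
        if now.saturating_duration_since(entry.created) <= ttl {
            return Some(entry.key);
        }
        // Expired: the entry is dropped here and never handed out.
    }
    None
}
```

Passing `now` as a parameter keeps the function deterministic and easy to test. A real key type would also want to zero its memory when dropped, which this sketch leaves out.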

When to Use Key Pools

Key pools are most beneficial when:

You need ephemeral keys for each request (authentication, key exchange)
Throughput requirements exceed 10,000 ops/second
Latency budgets are tight (<1ms total)
You have available CPU cycles during idle periods

For long-term identity keys that are generated once and used repeatedly, direct generation is fine—the 33-37µs overhead happens only once.

The pattern also generalizes beyond post-quantum primitives. Any cryptographic operation that is stateless and expensive—RSA keygen (orders of magnitude slower), elliptic curve point generation, even FHE parameter setup—benefits from the same pre-computation approach. The lock-free queue is the universal accelerator; the lattice math is just the most compelling use case because the per-key cost is high enough to matter at scale, but low enough that a small number of background threads can keep the pool saturated.

Experience 104x Faster Key Generation

Key pooling is enabled by default on all H33 API endpoints.
