NIST finalized FIPS 203 (ML-KEM) and FIPS 204 (ML-DSA) in August 2024. This practical guide to implementing these post-quantum algorithms covers parameter selection, key management, performance, and common pitfalls.
ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism), formerly CRYSTALS-Kyber, is NIST's standardized post-quantum key encapsulation mechanism, intended to replace ECDH and RSA-KEM. Its security relies on the Module Learning With Errors (MLWE) problem, which is believed to be hard for both classical and quantum computers.
| Parameter | ML-KEM-512 | ML-KEM-768 | ML-KEM-1024 |
|---|---|---|---|
| Security Level | NIST Level 1 | NIST Level 3 | NIST Level 5 |
| Public Key Size | 800 bytes | 1,184 bytes | 1,568 bytes |
| Ciphertext Size | 768 bytes | 1,088 bytes | 1,568 bytes |
| Shared Secret | 32 bytes | 32 bytes | 32 bytes |
For most applications, ML-KEM-768 (Level 3) provides the best balance of security and performance.
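As a rough illustration of how the table above might be used in code, the sketch below encodes the three parameter sets and computes the bytes exchanged in one key establishment (public key in one direction, ciphertext in the other). The names `KemParams`, `handshake_overhead`, and `smallest_set_for_level` are hypothetical and not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KemParams:
    """Sizes for one ML-KEM parameter set, copied from the table above."""
    name: str
    nist_level: int
    public_key_bytes: int
    ciphertext_bytes: int
    shared_secret_bytes: int


ML_KEM_PARAMS = {
    "ML-KEM-512": KemParams("ML-KEM-512", 1, 800, 768, 32),
    "ML-KEM-768": KemParams("ML-KEM-768", 3, 1184, 1088, 32),
    "ML-KEM-1024": KemParams("ML-KEM-1024", 5, 1568, 1568, 32),
}


def handshake_overhead(name: str) -> int:
    """Bytes on the wire for one exchange: public key out, ciphertext back."""
    p = ML_KEM_PARAMS[name]
    return p.public_key_bytes + p.ciphertext_bytes


def smallest_set_for_level(min_level: int) -> KemParams:
    """Pick the lowest-level parameter set that meets a required NIST level."""
    candidates = [p for p in ML_KEM_PARAMS.values() if p.nist_level >= min_level]
    if not candidates:
        raise ValueError(f"no ML-KEM parameter set reaches level {min_level}")
    return min(candidates, key=lambda p: p.nist_level)
```

For example, `handshake_overhead("ML-KEM-768")` adds 1,184 and 1,088 bytes, which is useful when checking whether a handshake still fits in a single packet or message frame.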
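ML-DSA (Module-Lattice-Based Digital Signature Algorithm), formerly CRYSTALS-Dilithium, is NIST's standardized post-quantum replacement for RSA, ECDSA, and Ed25519 signatures. It follows the "Fiat-Shamir with Aborts" paradigm: the signer discards and retries any candidate signature that would leak information about the private key.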
| Parameter | ML-DSA-44 | ML-DSA-65 | ML-DSA-87 |
|---|---|---|---|
| Security Level | NIST Level 2 | NIST Level 3 | NIST Level 5 |
| Public Key | 1,312 bytes | 1,952 bytes | 2,592 bytes |
| Signature | 2,420 bytes | 3,309 bytes | 4,627 bytes |
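Both ML-KEM and ML-DSA require constant-time implementations to prevent timing side-channel attacks: execution time and memory access patterns must not depend on secret data. Critical operations include polynomial multiplication, the secret-dependent arithmetic inside the ML-DSA rejection loop (the number of signing attempts may vary, but each attempt must not leak information about the key), ciphertext comparison in ML-KEM decapsulation, and all other private key operations.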
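The ciphertext comparison in ML-KEM decapsulation can be sketched as follows. The decapsulator re-encrypts, compares the result with the received ciphertext, and returns either the real key or a rejection key without branching on the outcome. This is only a structural illustration: `ct_select` and `decaps_finish` are hypothetical names, and the Python interpreter gives no timing guarantees for the masking arithmetic, so a production implementation would do this in a language where constant-time behavior can be controlled.

```python
import hmac


def ct_select(choose_a: int, a: bytes, b: bytes) -> bytes:
    """Return a if choose_a == 1, else b, using masks instead of a branch."""
    if len(a) != len(b):  # lengths are public, so this check leaks nothing
        raise ValueError("inputs must have equal length")
    mask = -choose_a & 0xFF  # 0xFF when choose_a == 1, 0x00 when 0
    return bytes((x & mask) | (y & ~mask & 0xFF) for x, y in zip(a, b))


def decaps_finish(received_ct: bytes, reencrypted_ct: bytes,
                  real_key: bytes, rejection_key: bytes) -> bytes:
    """Final step of decapsulation: pick the key based on the comparison.

    hmac.compare_digest is the standard-library comparison designed to
    avoid early exit on the first differing byte.
    """
    ok = int(hmac.compare_digest(received_ct, reencrypted_ct))
    return ct_select(ok, real_key, rejection_key)
```

The caller never learns from an error path whether the ciphertext was valid; a tampered ciphertext simply yields a key that the peer does not share.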
Post-quantum keys are larger than classical keys. An ML-DSA-65 public key is 1,952 bytes compared to 32 bytes for Ed25519. Plan for key rotation with clear expiration dates. Implement key versioning so old signatures can be verified with the correct key version. Consider hybrid schemes during transition.
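One way to approach key versioning is a small registry that maps a version number to a public key and its validity window. The sketch below is a minimal example under that assumption; `KeyVersion` and `KeyRegistry` are hypothetical names, and the keys are opaque byte strings rather than output from a real ML-DSA library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class KeyVersion:
    version: int
    algorithm: str          # e.g. "ML-DSA-65"
    public_key: bytes
    not_before: datetime
    not_after: datetime     # explicit expiration date


class KeyRegistry:
    def __init__(self) -> None:
        self._versions: dict[int, KeyVersion] = {}

    def add(self, key: KeyVersion) -> None:
        if key.version in self._versions:
            raise ValueError(f"version {key.version} already registered")
        self._versions[key.version] = key

    def for_verification(self, version: int, signed_at: datetime) -> KeyVersion:
        """Look up the key an old signature claims, checking it was valid then."""
        key = self._versions[version]  # KeyError if the version is unknown
        if not (key.not_before <= signed_at < key.not_after):
            raise ValueError(f"key version {version} was not valid at {signed_at}")
        return key

    def current(self, now: datetime) -> KeyVersion:
        """Newest key that is valid right now, for creating new signatures."""
        valid = [k for k in self._versions.values()
                 if k.not_before <= now < k.not_after]
        if not valid:
            raise ValueError("no valid signing key")
        return max(valid, key=lambda k: k.version)
```

Storing the version number next to each signature lets a verifier call `for_verification` with the right key even after several rotations.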
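During transition, combine classical and post-quantum algorithms so that the result stays secure as long as either one does. For key exchange, concatenate the ECDH and ML-KEM shared secrets and pass the result through a key derivation function rather than using the raw concatenation as a key. For signatures, create nested signatures: sign with Ed25519, then sign the result with ML-DSA. H33 uses this approach with three signature families providing three independent hardness assumptions.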
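The key exchange combination might look like the sketch below, which hashes a domain-separation label, both 32-byte shared secrets, and optional context with SHA3-256. This is a simplified illustration and not a standardized combiner; `LABEL` and `combine_shared_secrets` are hypothetical, and a real protocol would normally also bind the ciphertexts and public keys through the `context` argument.

```python
import hashlib

# Hypothetical fixed-length domain-separation label for this example.
LABEL = b"example-hybrid-kem-v1"


def combine_shared_secrets(ecdh_ss: bytes, mlkem_ss: bytes,
                           context: bytes = b"") -> bytes:
    """Derive one 32-byte session key from both shared secrets.

    Both secrets are required to be exactly 32 bytes, so the
    concatenation is unambiguous without length prefixes.
    """
    if len(ecdh_ss) != 32 or len(mlkem_ss) != 32:
        raise ValueError("both shared secrets must be 32 bytes")
    h = hashlib.sha3_256()
    h.update(LABEL)
    h.update(ecdh_ss)
    h.update(mlkem_ss)
    h.update(context)  # e.g. handshake transcript; variable length, placed last
    return h.digest()
```

Because both secrets feed the same hash, an attacker needs to recover both the ECDH and the ML-KEM secret to learn the session key.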
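ML-KEM and ML-DSA performance is dominated by polynomial arithmetic over the ring Z_q[X]/(X^n + 1), with n = 256 in both schemes and q = 3329 for ML-KEM or q = 8380417 for ML-DSA. Both standards use the number-theoretic transform (NTT) to multiply polynomials in O(n log n) operations. Properly optimized implementations achieve signing speeds of tens of thousands of operations per second. On ARM processors, NEON SIMD accelerates polynomial operations; on x86, AVX2 and AVX-512 provide similar benefits.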
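To make the ring arithmetic concrete, here is a reference multiplication in Z_q[X]/(X^n + 1) using the schoolbook method. It shows the defining rule that X^n wraps around to -1. It is a readable O(n^2) sketch rather than the NTT-based, constant-time routine a real implementation would use, and `negacyclic_mul` is a hypothetical name.

```python
def negacyclic_mul(a: list[int], b: list[int], q: int) -> list[int]:
    """Multiply two polynomials modulo (X^n + 1) and modulo q.

    Polynomials are coefficient lists, lowest degree first, with
    n = len(a) = len(b).
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("polynomials must have the same length")
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:
                # X^n = -1 in this ring, so the wrapped term is subtracted.
                out[k - n] = (out[k - n] - ai * bj) % q
    return out
```

With the ML-KEM modulus q = 3329 and n = 256, multiplying X^255 by X gives the constant polynomial -1, which is stored as 3328.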
H33 uses ML-DSA-65 (FIPS 204) as one of three signature families in production, contributing to the 391-microsecond batch attestation latency for 32 authentications. ML-KEM-768 handles key exchange. All implementations are constant-time Rust with memory zeroization and hardware-specific ARM Graviton4 optimizations.
H33 handles ML-KEM and ML-DSA so you do not have to. One API call. Production-ready.