Nearly every encryption system in production today decrypts data before processing it. TLS terminates at the load balancer. Database encryption decrypts to run a query. Even "encrypted compute" solutions such as SGX and other TEEs decrypt inside an enclave. Fully Homomorphic Encryption eliminates the decryption step entirely. Here is how it works, when it makes sense, and what it actually costs in production.
Modern infrastructure encrypts data in two states: at rest (stored on disk) and in transit (moving over a network). The third state — data in use, actively being computed on — remains unprotected in nearly every production system deployed today.
Consider a typical authentication flow. A user submits a biometric template. TLS encrypts it between the client and the server. The server's load balancer terminates TLS. The biometric is now plaintext in memory. The application decrypts the stored reference template from the database. Also plaintext. It compares the two in the clear. The result is encrypted again for the return trip. During the comparison — the moment that matters most — both templates exist unencrypted in server memory.
This is not a failure of implementation. It is a structural limitation of every encryption system that requires decryption before computation. AES, ChaCha20, RSA, ECC — they all share the same constraint. To compute on the data, you must first decrypt it. To decrypt it, you need the key. To have the key on the server, you must trust the server.
Trusted Execution Environments (SGX, TrustZone, SEV-SNP) attempted to solve this by creating a hardware enclave where data is decrypted but isolated from the rest of the system. The data still exists in plaintext, just inside the enclave. Attacks against SGX have demonstrated repeatedly that "isolated plaintext" is a weaker guarantee than it appears: transient-execution attacks (Foreshadow, and Spectre variants such as SgxPectre), the PLATYPUS power side channel, and the SmashEx exception-handling exploit.
The fundamental question: Can you compute on data that remains encrypted throughout the entire computation? Not encrypted-then-decrypted-in-an-enclave. Encrypted the entire time. The answer is yes. It has been mathematically possible since 2009 and production-viable since approximately 2024.
Fully Homomorphic Encryption is a class of encryption schemes where computation can be performed directly on ciphertext. The result, when decrypted, is mathematically identical (or within a bounded approximation) to what you would get from performing the same computation on the plaintext input.
This is not a metaphor. It is not approximate hand-waving. It is a provable mathematical property. If you encrypt the number 7 and the number 3, an FHE system can multiply the two ciphertexts and produce a ciphertext that decrypts to 21. The server performing the multiplication never learns that the inputs were 7 and 3, never learns the result is 21, and never holds a decryption key.
The mathematical foundation is the Ring Learning with Errors (RLWE) problem. Encryption adds carefully calibrated noise to polynomial representations of the data. Every homomorphic operation increases that noise: additions grow it slowly, multiplications grow it much faster. As long as the accumulated noise stays below a threshold set by the scheme parameters, the result decrypts correctly.
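To make the noise budget concrete, here is a minimal sketch of a toy "somewhat homomorphic" scheme over plain integers, in the style of the DGHV construction. It is not RLWE, it is not any of the schemes described in this article, and at these parameter sizes it is not secure. All names and constants (`T`, `KEY_BITS`, `noise_bits`, and so on) are illustrative assumptions. It only shows the mechanism: a ciphertext is the message plus noise plus a multiple of the secret key, and decryption works while the noise stays below the key-dependent threshold.

```python
import secrets

T = 256           # plaintext modulus: messages live in [0, T)
KEY_BITS = 256    # size of the secret integer p (toy value, NOT secure)
NOISE_BITS = 16   # size of the fresh noise term r
MASK_BITS = 512   # size of the random multiple q that masks the ciphertext


def keygen() -> int:
    """Return an odd secret integer p with its top bit set."""
    return secrets.randbits(KEY_BITS) | (1 << (KEY_BITS - 1)) | 1


def encrypt(p: int, m: int) -> int:
    """c = m + T*r + p*q: message, plus noise, plus a multiple of the key."""
    r = secrets.randbits(NOISE_BITS) | (1 << (NOISE_BITS - 1))
    q = secrets.randbits(MASK_BITS)
    return (m % T) + T * r + p * q


def _centered(p: int, c: int) -> int:
    """Reduce c modulo p into the range (-p/2, p/2]."""
    x = c % p
    return x - p if x > p // 2 else x


def noise_bits(p: int, c: int) -> int:
    """Bit length of message-plus-noise; must stay below about KEY_BITS - 1."""
    return abs(_centered(p, c)).bit_length()


def decrypt(p: int, c: int) -> int:
    """Strip the multiple of p, then strip the noise (a multiple of T)."""
    return _centered(p, c) % T


key = keygen()
c7, c3 = encrypt(key, 7), encrypt(key, 3)

c_sum = c7 + c3    # homomorphic addition: noise terms add
c_prod = c7 * c3   # homomorphic multiplication: noise terms multiply

print(decrypt(key, c_sum), decrypt(key, c_prod))      # 10 21
print(noise_bits(key, c7), noise_bits(key, c_prod))   # noise roughly doubles in bit length
```

With these toy sizes, each multiplication roughly doubles the bit length of the noise, so the budget covers about three successive squarings before decryption becomes unreliable. Production schemes manage the same constraint with far larger parameters and with techniques such as modulus switching and bootstrapping.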
No trusted hardware. No enclaves. No secure channels between computation parties. No key escrow. Pure mathematics. The security guarantee is that breaking the encryption requires solving a lattice problem that no known algorithm — classical or quantum — can solve efficiently.
There is no single FHE scheme that handles all operations. The field has converged on three complementary schemes, each optimized for a different class of computation. Understanding which scheme to use for which operation is the difference between a system that works in production and one that remains a research curiosity.
The Brakerski/Fan-Vercauteren scheme operates on exact integers modulo a plaintext modulus. Addition and multiplication produce exact results — no approximation, no rounding, no precision loss. This makes BFV the right choice for authentication (comparing encrypted biometric templates), counting (encrypted tallies, votes, aggregations), and any operation where the answer must be exactly correct.
BFV supports SIMD batching via the Chinese Remainder Theorem: a single ciphertext can hold thousands of independent integer values, and a single encrypted operation processes all of them simultaneously. H33 batches 32 biometric authentications into one ciphertext. One FHE inner product evaluates all 32 in parallel.
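The batching idea can be sketched on plaintexts alone, without any encryption. The example below assumes tiny illustrative parameters (ring degree `N = 4`, plaintext modulus `T = 17`) that are far smaller than anything used in practice. It packs four slot values into one polynomial in Z_T[x]/(x^N + 1) and shows that a single polynomial multiplication multiplies all four slots at once. In BFV the same polynomial is what gets encrypted, so one ciphertext operation acts on every slot.

```python
T = 17    # plaintext modulus: prime with T = 1 (mod 2N), so x^N + 1 splits fully
N = 4     # ring degree; the plaintext ring is Z_T[x] / (x^N + 1)
PSI = 2   # primitive 2N-th root of unity mod T (2**4 % 17 == 16, i.e. -1)

# The N roots of x^N + 1 modulo T. By the CRT, each root is one "slot".
ROOTS = [pow(PSI, 2 * i + 1, T) for i in range(N)]


def encode(slots: list[int]) -> list[int]:
    """Find the polynomial whose value at ROOTS[i] is slots[i]."""
    n_inv = pow(N, -1, T)
    return [
        n_inv * sum(v * pow(r, -j, T) for v, r in zip(slots, ROOTS)) % T
        for j in range(N)
    ]


def decode(poly: list[int]) -> list[int]:
    """Evaluate the polynomial at every root to read the slots back."""
    return [sum(c * pow(r, j, T) for j, c in enumerate(poly)) % T for r in ROOTS]


def poly_add(a: list[int], b: list[int]) -> list[int]:
    return [(x + y) % T for x, y in zip(a, b)]


def poly_mul(a: list[int], b: list[int]) -> list[int]:
    """Multiply modulo x^N + 1: terms that wrap past degree N-1 flip sign."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + x * y) % T
            else:
                out[k - N] = (out[k - N] - x * y) % T
    return out


a = encode([1, 2, 3, 4])
b = encode([5, 6, 7, 8])
print(decode(poly_add(a, b)))   # slot-wise sums mod 17
print(decode(poly_mul(a, b)))   # slot-wise products mod 17
```

Real deployments use ring degrees in the thousands, which is where the "thousands of independent values per ciphertext" figure comes from.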
The Cheon-Kim-Kim-Song scheme operates on approximate real and complex numbers. It trades exact precision for the ability to handle real-valued arithmetic, the core requirement for machine learning inference, statistical analysis, and scoring models. CKKS encodes real numbers as scaled fixed-point values with a configurable scale factor, and a rescale operation after each multiplication brings the scale back down and keeps precision loss bounded.
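The scale-factor bookkeeping can be illustrated without any encryption. The sketch below models only the fixed-point encoding side of CKKS on ordinary integers: values are stored as `round(x * SCALE)`, a product of two encoded values carries `SCALE` squared, and `rescale` divides one factor back out. The constant `SCALE` and the function names are assumptions chosen for the example, not parameters from any library.

```python
SCALE = 1 << 30   # the scale factor (often written as delta); illustrative value


def encode(x: float) -> int:
    """Store a real number as a scaled integer."""
    return round(x * SCALE)


def decode(v: int) -> float:
    """Recover the approximate real number from a value at scale SCALE."""
    return v / SCALE


def multiply(a: int, b: int) -> int:
    """The product of two scaled values carries scale SCALE**2."""
    return a * b


def rescale(v: int) -> int:
    """Rounded division by SCALE: brings a SCALE**2 value back to SCALE."""
    return (v + SCALE // 2) // SCALE


features = [0.5, -1.25, 3.0]
weights = [2.0, 0.4, -0.1]

# Dot product: multiply element-wise at scale SCALE**2, sum, then rescale once.
acc = sum(multiply(encode(x), encode(w)) for x, w in zip(features, weights))
result = decode(rescale(acc))

print(result)   # close to 0.2, not exactly 0.2
```

The small rounding error in the output is the "approximate" in approximate arithmetic. In real CKKS the encryption noise adds to it, and each rescale also consumes one level of the ciphertext modulus, which is what bounds the multiplicative depth.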
A 64-dimensional encrypted dot product — the fundamental operation of encrypted ML inference — executes in 333ms on Graviton4. A full encrypted dense neural network layer (64 inputs, 4 outputs) completes in 1.56 seconds. These are the actual costs of encrypted ML in production, not isolated primitive benchmarks.
The Torus FHE scheme operates on encrypted individual bits. It enables operations that BFV and CKKS cannot perform efficiently: comparisons, thresholds, conditional branching. "Is this encrypted value greater than this encrypted threshold?" is a TFHE operation. BFV can tell you if two values are equal. TFHE can tell you which one is larger.
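A greater-than test on bit-encrypted values is just a Boolean circuit. The sketch below runs that circuit on plaintext bits as a stand-in: in a TFHE deployment every `&`, `|`, and `^` would be a homomorphic gate applied to encrypted bits. The function names are illustrative, and the circuit shown is one common textbook comparator, not necessarily the one any particular implementation uses.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Unsigned integer to a list of bits, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def greater_than(a_bits: list[int], b_bits: list[int]) -> int:
    """Return 1 if a > b, else 0, using only AND / OR / XOR / NOT on bits."""
    gt = 0   # becomes 1 once a is known to be larger
    eq = 1   # stays 1 while all higher bits have been equal
    for a, b in zip(a_bits, b_bits):
        a_wins_here = a & (b ^ 1)        # a = 1 and b = 0 at this position
        gt = gt | (eq & a_wins_here)     # first differing bit decides
        eq = eq & ((a ^ b) ^ 1)          # XNOR: still equal so far?
    return gt


print(greater_than(to_bits(200, 8), to_bits(199, 8)))   # 1
print(greater_than(to_bits(42, 8), to_bits(42, 8)))     # 0
```

The gate count grows linearly with the bit width, which is consistent with 16-bit comparisons costing roughly twice as much as 8-bit ones.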
H33's 96-channel TFHE implementation on Graviton4 achieves 768 TPS for 8-bit greater-than comparisons and 372 TPS for 16-bit operations. The multi-bit GPU variant reaches 1,129 TPS on an A10G.
| Scheme | Best For | Precision | Headline Number |
|---|---|---|---|
| BFV | Authentication, matching, counting | Exact | 2,209,429 auth/sec |
| CKKS | ML inference, statistics, scoring | Approximate | 333ms dot product |
| TFHE | Comparisons, thresholds, decisions | Exact (bitwise) | 768 TPS (8-bit GT) |
No single scheme does everything. A production encrypted compute platform needs all three, with intelligent routing between them. This is exactly what the FHE-IQ routing engine does: given a computation request, it selects the appropriate engine automatically.
Abstract descriptions of FHE obscure the simplicity of the actual data flow. Here is a concrete example: encrypted biometric authentication using BFV, running in production today.
Step 1: Client-side encryption. The client device captures a biometric template — a numerical vector representing the user's fingerprint, face, or iris. The client encrypts this template using the user's public FHE key. The ciphertext leaves the device. The plaintext biometric never does.
Step 2: Server-side encrypted computation. The server holds an encrypted reference template (encrypted under the same key during enrollment). The server computes an FHE inner product between the two encrypted vectors. This is a sequence of encrypted multiplications and additions — all operating on ciphertext. The server produces an encrypted similarity score. At no point does the server hold a decryption key. At no point does any biometric template exist in plaintext on the server.
Step 3: Encrypted result return. The server returns the encrypted similarity score to the client. The client decrypts it locally with the private key that never left the device. The client evaluates the threshold: match or no match.
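The three steps can be walked through end to end with a toy scheme. The sketch below is an illustration under loud assumptions: it uses an insecure integer-based somewhat homomorphic scheme rather than BFV, it is symmetric-key (the client encrypts with its own secret rather than a public key), and the `Client` class, `server_inner_product` function, templates, and threshold are all hypothetical. What it does preserve is the division of roles: the server function only ever touches ciphertexts.

```python
import secrets

T = 1 << 16                                  # plaintext modulus, large enough for the score
KEY_BITS, NOISE_BITS, MASK_BITS = 256, 16, 512   # toy sizes, NOT secure


class Client:
    """Holds the secret key. Nothing in this class runs on the server."""

    def __init__(self) -> None:
        self._p = secrets.randbits(KEY_BITS) | (1 << (KEY_BITS - 1)) | 1

    def _encrypt(self, m: int) -> int:
        r = secrets.randbits(NOISE_BITS)
        q = secrets.randbits(MASK_BITS)
        return m + T * r + self._p * q

    def encrypt_template(self, template: list[int]) -> list[int]:
        """Step 1: encrypt every component before it leaves the device."""
        return [self._encrypt(v) for v in template]

    def decrypt_score(self, c: int) -> int:
        x = c % self._p
        if x > self._p // 2:
            x -= self._p
        return x % T

    def is_match(self, enc_score: int, threshold: int) -> bool:
        """Step 3: decrypt locally and apply the threshold on the client."""
        return self.decrypt_score(enc_score) >= threshold


def server_inner_product(enc_probe: list[int], enc_reference: list[int]) -> int:
    """Step 2: multiply and add ciphertexts. No key is available here."""
    return sum(a * b for a, b in zip(enc_probe, enc_reference))


client = Client()
reference = [3, 1, 4, 1, 5, 9, 2, 6]                    # enrolled template (made up)
enc_reference = client.encrypt_template(reference)      # stored server-side

genuine = client.encrypt_template([3, 1, 4, 1, 5, 9, 2, 6])
impostor = client.encrypt_template([1, 0, 1, 0, 1, 0, 1, 0])

print(client.is_match(server_inner_product(genuine, enc_reference), threshold=150))   # True
print(client.is_match(server_inner_product(impostor, enc_reference), threshold=150))  # False
```

A production system would batch many templates per ciphertext and use a normalized similarity measure; the raw inner product and threshold here are only placeholders.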
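Each authentication takes about 42 microseconds (1,345 microseconds for a 32-user batch, divided by 32). That is faster than a human blink (roughly 300,000 microseconds). The server processes 2.2 million of these per second, and it never sees a single biometric.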
This is not a demo. It is the production pipeline running on AWS Graviton4, measured under sustained load for 120 seconds. Every authentication is post-quantum attested via the H33-74 substrate — 74 bytes containing a three-family post-quantum signature that commits the computation result permanently.
FHE has a reputation for being slow. That reputation was earned — early implementations from 2011-2020 were orders of magnitude too slow for production use. The field has moved. Dramatically.
Three architectural advances changed the performance equation:
The result: a full production pipeline — FHE computation, post-quantum attestation, and ZKP verification — executes in 1,345 microseconds for a 32-user batch. That is 1.3 milliseconds for 32 complete encrypted authentications, each post-quantum signed and zero-knowledge verified.
| Stage | Latency | % Pipeline |
|---|---|---|
| FHE batch (32 users, BFV inner product) | 943 μs | 70% |
| Batch attestation (SHA3 + Dilithium sign + verify) | 391 μs | 29% |
| ZKP cached verification | 0.358 μs | <1% |
| Total (32-user batch) | 1,345 μs | 100% |
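The per-authentication and throughput figures quoted in this article can be cross-checked with a few lines of arithmetic on the table above. The inputs are the article's own numbers; the "implied parallelism" is an inference from those numbers, not something the article states, and the variable names are illustrative.

```python
BATCH_SIZE = 32
TOTAL_US = 1345.0                      # total pipeline latency per batch
STAGES_US = {
    "fhe_batch": 943.0,
    "attestation": 391.0,
    "zkp_cached": 0.358,
}
REPORTED_AUTH_PER_SEC = 2_209_429      # headline BFV throughput

# Amortized latency of one authentication inside a batch.
per_auth_us = TOTAL_US / BATCH_SIZE    # 42.03125

# Throughput of a single pipeline processing batches back to back.
single_pipeline_rate = BATCH_SIZE / (TOTAL_US * 1e-6)

# How many such pipelines must run concurrently to reach the headline rate.
implied_parallelism = REPORTED_AUTH_PER_SEC / single_pipeline_rate

# Time in the total that the three listed stages do not account for.
unitemized_us = TOTAL_US - sum(STAGES_US.values())

print(f"per authentication: {per_auth_us:.2f} us")
print(f"one pipeline:       {single_pipeline_rate:,.0f} auth/sec")
print(f"implied pipelines:  {implied_parallelism:.1f}")
print(f"not itemized:       {unitemized_us:.1f} us")
```

Read this way, the 42-microsecond figure is an amortized per-user cost within a batch, and the 2.2 million per second figure describes many batches in flight at once rather than a single sequential pipeline.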
FHE introduces a key architecture fundamentally different from traditional encryption. Understanding it is critical to understanding why the security model is stronger.
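There are three types of FHE keys: the secret key, which decrypts and never leaves the client; the public key, which encrypts; and the evaluation key, which sits on the server and lets it carry out operations on ciphertexts without being able to decrypt them.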
This is the core insight: the server holds keys that let it compute but not read. The compute capability and the read capability are cryptographically separated. No traditional encryption system offers this property. In AES, if you can compute on the data, you can read the data. In FHE, computation and decryption are independently controlled.
For key exchange, H33 uses ML-KEM (FIPS 203), the NIST-standardized post-quantum key encapsulation mechanism based on Module Learning with Errors. For signatures attesting computation results, three post-quantum signature families operate in parallel: ML-DSA (FIPS 204, module-lattice-based), FALCON (NTRU-lattice-based), and SLH-DSA (FIPS 205, hash-based). Forging an attestation requires breaking all three simultaneously. ML-DSA and FALCON both rest on lattice problems, though structurally different ones, while SLH-DSA depends only on hash-function security, so a lattice breakthrough alone would not be enough.
Keys never travel with the data. The encrypted biometric and the encrypted result travel over the network. The evaluation key sits on the server. The secret key sits on the client. The data and the capability to read it are never in the same place.
FHE is not a universal replacement for traditional encryption. It solves a specific problem — computation on data you cannot see — and it solves it at a specific cost. Knowing when that cost is justified is as important as understanding the technology itself.
The decision heuristic: If a breach of the server would compromise sensitive data because the server decrypts to compute, FHE eliminates that risk. If the server never computes on the data (only stores and retrieves it), traditional encryption is simpler and faster.
The H33 platform exposes encrypted computation through a single API surface. You do not select BFV, CKKS, or TFHE manually. You submit a computation request, and the FHE-IQ routing engine selects the appropriate scheme, parameters, and execution path automatically.
One API call. The router examines the operation type (integer arithmetic, real-number ML, Boolean comparison), the precision requirements, the depth budget, and the throughput target. It selects the engine, configures the parameters, and executes the computation. The developer sees an encrypted input, an encrypted output, and a 74-byte post-quantum attestation receipt proving the computation was performed correctly.
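The routing decision can be pictured as a dispatch on operation type. The sketch below is a deliberately simplified illustration and not the FHE-IQ implementation: the `Scheme` enum, the operation labels, and the `route` function are all hypothetical, and it ignores the precision, depth-budget, and throughput inputs that the article says the real router also weighs.

```python
from enum import Enum


class Scheme(Enum):
    BFV = "bfv"     # exact integer arithmetic
    CKKS = "ckks"   # approximate real-number arithmetic
    TFHE = "tfhe"   # bitwise Boolean operations


# Hypothetical operation labels mapped to the scheme the article recommends.
ROUTES = {
    "integer_arithmetic": Scheme.BFV,
    "real_number_ml": Scheme.CKKS,
    "boolean_comparison": Scheme.TFHE,
}


def route(operation_type: str) -> Scheme:
    """Pick a scheme from the operation type alone."""
    try:
        return ROUTES[operation_type]
    except KeyError:
        raise ValueError(f"no scheme registered for {operation_type!r}") from None


print(route("integer_arithmetic").name)   # BFV
print(route("boolean_comparison").name)   # TFHE
```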
The encrypted biometric authentication pipeline (the one that runs at 2.2 million operations per second) is a single `POST /api/v1/substrate/attest` call. Encrypt on the client, send the ciphertext, receive the encrypted result and the attestation. Decrypt locally. That is the entire integration.
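As a rough sketch of what the client side of that call might look like, the snippet below builds (but does not send) the HTTP request with Python's standard library. Only the method and path come from this article. The host name, the JSON field name `ciphertext`, the base64 encoding, and the bearer-token header are assumptions made for illustration, so consult the actual API documentation for the real request schema.

```python
import base64
import json
import urllib.request

# Placeholder host: ".invalid" never resolves. Substitute the real base URL.
BASE_URL = "https://api.example.invalid"
ATTEST_PATH = "/api/v1/substrate/attest"   # path as given in the article


def build_attest_request(ciphertext: bytes, api_key: str) -> urllib.request.Request:
    """Build the POST request carrying a client-encrypted payload.

    The body layout and the Authorization header are hypothetical.
    """
    body = json.dumps(
        {"ciphertext": base64.b64encode(ciphertext).decode("ascii")}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ATTEST_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_attest_request(b"\x00\x01example-ciphertext-bytes", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and decrypting the returned ciphertext locally with the secret key.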
For teams evaluating encrypted compute platforms, the question to ask every vendor is not "how fast is your multiply?" It is: "What is the end-to-end latency for my actual workload on your production hardware, and how do you attest the result?" If they cannot answer with a measured number on named hardware, the benchmark is not real.
Ready to compute on encrypted data? Schedule a demo to see the full pipeline running live on Graviton4. Or start with the documentation and the H33-74 substrate overview to understand the attestation model.