TFHE: Fast Fully Homomorphic Encryption for Boolean Circuits

TFHE (Torus Fully Homomorphic Encryption) takes a different approach than BFV or CKKS. Instead of operating on batches of large integers or approximate real numbers, TFHE computes on individual encrypted bits. This makes it well suited to computations built from comparisons and branching rather than bulk arithmetic.

The Boolean Approach

Traditional FHE schemes batch many values and perform parallel operations. TFHE instead:

Encrypts individual bits
Evaluates logic gates (AND, OR, XOR, NOT)
Enables any computation expressible as a circuit
Achieves very fast gate evaluation

This gate-level granularity means that any function you can represent as a Boolean circuit—from simple comparisons to full ALU operations—can run entirely on encrypted data. The tradeoff is that operations which are naturally parallel (like vector addition across thousands of elements) are more efficient in batched schemes, while branching logic and bit-level decisions are where TFHE dominates.
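To make the "anything expressible as a circuit" point concrete, here is a minimal sketch of an 8-bit adder built only from AND, OR, and XOR. The gate functions are plaintext stand-ins with hypothetical names, not calls into any TFHE library; in a real deployment each call would take and return ciphertexts.

```python
# Plaintext stand-ins for homomorphic gates (hypothetical names, no encryption).
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b

def full_adder(a, b, carry_in):
    """One-bit full adder built from 5 gates."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out

def to_bits(value, width):
    """Little-endian bit list (index 0 is the least significant bit)."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_add(a_bits, b_bits):
    """Add two equal-width bit lists; the final carry is dropped (wraps mod 2**width)."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out

print(from_bits(ripple_add(to_bits(100, 8), to_bits(55, 8))))  # 155
```

Every gate in the loop depends on the previous carry, so this adder is a purely sequential chain of gate evaluations, which is the kind of structure where per-gate latency matters most.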

TFHE Gate Speed

A single TFHE gate evaluation takes approximately 10–20 milliseconds on CPU. With GPU acceleration, many independent gates can be evaluated in parallel, which can bring the amortized cost down toward microseconds per gate. By contrast, H33's production BFV pipeline processes a full 32-user biometric batch in ~1,109 µs, or about 35 µs of batch time per user (~42 µs per authentication for the full pipeline), because BFV amortizes cost across SIMD slots rather than evaluating gates individually.
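A rough back-of-the-envelope calculation shows how these figures scale. The sketch below uses only the numbers quoted above; the 8-bit adder at 5 gates per bit is a hypothetical workload chosen for illustration, and the model assumes strictly sequential gate evaluation on CPU.

```python
GATE_MS_LOW, GATE_MS_HIGH = 10.0, 20.0  # per-gate CPU latency range quoted above
BATCH_US = 1109.0                       # BFV batch latency quoted above
USERS_PER_BATCH = 32

def sequential_circuit_ms(gate_count, gate_ms):
    """Latency if every gate waits for the previous one to finish."""
    return gate_count * gate_ms

# Hypothetical workload: 8-bit ripple-carry adder at 5 gates per bit.
adder_gates = 8 * 5
low = sequential_circuit_ms(adder_gates, GATE_MS_LOW)
high = sequential_circuit_ms(adder_gates, GATE_MS_HIGH)
print(f"8-bit add, sequential CPU: {low:.0f} to {high:.0f} ms")

# Share of one BFV batch attributable to each user in the batch.
per_user_us = BATCH_US / USERS_PER_BATCH
print(f"BFV batch share per user: {per_user_us:.1f} us")
```

The per-user share is just the batch latency divided by the batch size; it covers the BFV step only and need not equal an end-to-end per-authentication figure.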

How TFHE Works

TFHE builds on the Learning With Errors (LWE) problem over the torus, the group of real numbers modulo 1. Key innovations:

Programmable bootstrapping: Refreshes ciphertext while computing a function
Efficient key switching: Fast conversion between encryption formats
Small keys: More practical key sizes than some alternatives

The programmable bootstrapping is particularly powerful—it simultaneously reduces noise and computes a lookup table function. In traditional FHE, bootstrapping is a pure maintenance operation: you pay a large latency cost solely to reduce noise so you can keep computing. TFHE turns that cost into productive work by encoding an arbitrary function into the bootstrapping step itself.

Bootstrapping Under the Hood

The core mechanism relies on a blind rotation of a test polynomial. During bootstrapping, the encrypted input selects a coefficient from this polynomial without revealing which coefficient was chosen. The test polynomial can encode any function from a small domain (typically a few bits) to an output, effectively giving you a free lookup-table evaluation with every noise refresh. This is why TFHE can chain arbitrary non-linear functions without depth limitations—each gate evaluation includes its own bootstrap.
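The selection step can be illustrated with a plaintext toy model. Nothing here is encrypted: in real TFHE the rotation is carried out homomorphically, whereas this sketch rotates in the clear purely to show why the constant coefficient ends up holding f(message) regardless of small noise. The names `negacyclic_rotate` and `toy_bootstrap`, the parameters p = 4 and n = 64, and the half-block offset in the encoding are all illustrative assumptions.

```python
def negacyclic_rotate(poly, shift):
    """Multiply poly by X^(-shift) in Z[X]/(X^n + 1), where n = len(poly)."""
    n = len(poly)
    out = [0] * n
    for j, coeff in enumerate(poly):
        k = (j - shift) % (2 * n)   # X^(2n) = 1 in this ring
        if k < n:
            out[k] = coeff
        else:
            out[k - n] = -coeff     # X^n = -1 flips the sign
    return out

def make_test_polynomial(func, p, n):
    """Each message value owns a block of n // p consecutive coefficients."""
    block = n // p
    return [func(j // block) for j in range(n)]

def toy_bootstrap(message, noise, func, p=4, n=64):
    block = n // p
    assert 0 <= message < p and abs(noise) < block // 2
    # Noisy encoding of the message, centered inside its block.
    phase = message * block + block // 2 + noise
    rotated = negacyclic_rotate(make_test_polynomial(func, p, n), phase)
    return rotated[0]               # constant coefficient = func(message)

def square_mod_4(m):
    return (m * m) % 4

print([toy_bootstrap(m, noise=5, func=square_mod_4) for m in range(4)])  # [0, 1, 0, 1]
```

Because every coefficient in a block holds the same value, any noise smaller than half a block is absorbed by the lookup. That is the sense in which one operation both refreshes noise and evaluates a function.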

Programmable bootstrapping is TFHE's defining innovation: it transforms the most expensive operation in FHE (noise management) into a vehicle for computation itself.

TFHE vs BFV/CKKS

Different strengths for different use cases:

Property	TFHE	BFV	CKKS
Data type	Individual bits	Batched integers	Approximate reals
Best for	Comparisons, conditionals	Integer arithmetic, inner products	Neural networks, float math
Noise management	Per-gate bootstrapping	Modulus switching	Rescaling
SIMD batching	Limited	N slots (e.g., 4,096 slots)	N/2 slots
Non-linear ops	Native (any gate)	Costly (high depth)	Polynomial approx.

TFHE excels when you need operations that are hard to express as polynomials—like comparisons and max/min functions. BFV, on the other hand, is ideal for workloads that can exploit SIMD parallelism. H33's production stack uses BFV with 4,096 slots divided into 128-dimensional biometric vectors, packing 32 users per ciphertext and sustaining 2,172,518 authentications per second on Graviton4 hardware. That kind of throughput comes from amortizing a single NTT-accelerated inner product across many users in parallel—something TFHE's per-bit model cannot match for linear algebra workloads.
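The packing arithmetic is easy to check in plaintext. The sketch below models the slot layout described above (4,096 slots, 128-dimensional vectors, 32 users) with ordinary Python lists. It is not BFV code: in a real scheme the per-user sums would typically be computed with slot rotations and additions rather than slicing, and the function names are hypothetical.

```python
SLOTS = 4096
DIM = 128
USERS = SLOTS // DIM  # 32 users per ciphertext

def pack(vectors):
    """Lay out one DIM-length vector per user, back to back, in the slot array."""
    flat = []
    for vector in vectors:
        assert len(vector) == DIM
        flat.extend(vector)
    assert len(flat) == SLOTS
    return flat

def batched_inner_products(packed_probe, packed_template):
    # One slot-wise multiply covers all users at once (the SIMD step).
    products = [a * b for a, b in zip(packed_probe, packed_template)]
    # Sum each user's DIM-slot segment to get that user's inner product.
    return [sum(products[u * DIM:(u + 1) * DIM]) for u in range(USERS)]

# Toy data: user u's probe is all u's, every template is all 1's.
probes = pack([[u] * DIM for u in range(USERS)])
templates = pack([[1] * DIM for _ in range(USERS)])
scores = batched_inner_products(probes, templates)
print(scores[:4])  # [0, 128, 256, 384]
```

A single multiply over the packed array does the work for all 32 users, which is where the amortization comes from.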

Use Cases

Private Comparisons

Comparing encrypted values is natural in TFHE. Comparison is inherently bitwise: a ripple comparator carries a single decision bit from the least significant bit up to the most significant. TFHE expresses this directly as gates, while BFV and CKKS need expensive high-degree polynomial evaluations or approximations:

// Compare two encrypted 8-bit integers
encrypted_a_greater = compare(encrypted_a, encrypted_b);
// Returns encrypted bit: 1 if a > b, 0 otherwise
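One way the `compare` call above could be decomposed into gates is a ripple comparator, sketched here on plaintext bits. The gate functions are hypothetical stand-ins rather than a library API, and real implementations may use a different, more parallel circuit.

```python
# Plaintext stand-ins for homomorphic gates (hypothetical names, no encryption).
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b
def NOT(a):    return a ^ 1

def to_bits(value, width=8):
    """Little-endian bit list (index 0 is the least significant bit)."""
    return [(value >> i) & 1 for i in range(width)]

def greater_than(a_bits, b_bits):
    """Return 1 if a > b, else 0, scanning from LSB to MSB."""
    gt = 0
    for a, b in zip(a_bits, b_bits):
        differ = XOR(a, b)
        # If the bits differ, this position decides; otherwise keep the
        # decision made by the less significant bits.
        gt = OR(AND(differ, a), AND(NOT(differ), gt))
    return gt

print(greater_than(to_bits(200), to_bits(100)))  # 1
print(greater_than(to_bits(100), to_bits(200)))  # 0
print(greater_than(to_bits(77), to_bits(77)))    # 0
```

More significant bits are processed later, so they override earlier decisions, which is what makes the final bit correct.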

Conditional Logic

If-then-else on encrypted conditions is a single multiplexer gate in TFHE—a primitive that would require multiplicative depth in BFV:

// Encrypted multiplexer
result = mux(encrypted_condition, encrypted_if_true, encrypted_if_false);
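The multiplexer's truth table matches the gate decomposition sketched below, shown on plaintext bits with hypothetical gate names. Libraries that expose MUX as a primitive may implement it more cheaply than this four-gate version.

```python
# Plaintext stand-ins for homomorphic gates (hypothetical names, no encryption).
def AND(a, b): return a & b
def OR(a, b):  return a | b
def NOT(a):    return a ^ 1

def mux(condition, if_true, if_false):
    """Select if_true when condition is 1, if_false when it is 0."""
    return OR(AND(condition, if_true), AND(NOT(condition), if_false))

def mux_word(condition, true_bits, false_bits):
    """Apply the same condition bit to every position of a multi-bit value."""
    return [mux(condition, t, f) for t, f in zip(true_bits, false_bits)]

print(mux_word(1, [1, 0, 1, 1], [0, 0, 0, 1]))  # [1, 0, 1, 1]
print(mux_word(0, [1, 0, 1, 1], [0, 0, 0, 1]))  # [0, 0, 0, 1]
```

Both branches are always computed; the condition only decides which result survives. That is how branching works when the condition itself is never revealed.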

Private Database Queries

TFHE enables encrypted SQL-like operations with range queries and comparisons. A WHERE clause like age > 30 AND salary < 100000 translates directly into a chain of comparison and AND gates, each operating entirely on ciphertext.
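As a sketch of that translation, the example below evaluates the predicate row by row using only gate-style operations on plaintext bits. The 32-bit width, the sample rows, and all function names are assumptions made for illustration; a real system would run the same circuit over encrypted bits and return encrypted match flags.

```python
# Plaintext stand-ins for homomorphic gates (hypothetical names, no encryption).
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b
def NOT(a):    return a ^ 1

WIDTH = 32  # assumed column width, wide enough for the salary threshold

def to_bits(value):
    return [(value >> i) & 1 for i in range(WIDTH)]

def greater_than(a_bits, b_bits):
    """Ripple comparator, LSB first: returns 1 if a > b."""
    gt = 0
    for a, b in zip(a_bits, b_bits):
        differ = XOR(a, b)
        gt = OR(AND(differ, a), AND(NOT(differ), gt))
    return gt

def where_clause(age_bits, salary_bits):
    """age > 30 AND salary < 100000, returned as a single match bit."""
    age_ok = greater_than(age_bits, to_bits(30))
    salary_ok = greater_than(to_bits(100000), salary_bits)  # salary < 100000
    return AND(age_ok, salary_ok)

rows = [(25, 50000), (35, 90000), (40, 120000), (31, 99999), (30, 1000)]
matches = [where_clause(to_bits(age), to_bits(salary)) for age, salary in rows]
print(matches)  # [0, 1, 0, 1, 0]
```

The server evaluates the predicate on every row and never learns which rows matched; only the key holder can decrypt the match flags.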

Implementation Libraries

Several high-quality TFHE implementations exist:

TFHE-rs: Zama's Rust implementation with excellent documentation and GPU support
Concrete: Zama's compiler for FHE programs, which compiles Python to optimized TFHE circuits
tfhe-lib: The original C++ reference implementation from the academic paper authors
OpenFHE: A broader C++ library that includes TFHE alongside BFV and CKKS

Production Note

H33's production pipeline uses BFV (not TFHE) because biometric authentication is an inner-product workload—exactly the kind of batched linear algebra where BFV's SIMD slots shine. The stack pairs BFV FHE with in-process DashMap ZKP lookups at 0.085 µs per lookup and Dilithium post-quantum attestation, delivering a fully post-quantum-secure pipeline at ~42 µs per auth.

Combining with Other Schemes

Advanced applications sometimes combine schemes to exploit the strengths of each:

CKKS for neural network layers—matrix multiplications and activations via polynomial approximation
TFHE for comparison operations—argmax, thresholding, and conditional branching
Scheme switching between them—converting ciphertext formats at layer boundaries

This hybrid approach gets the best of each scheme's strengths. For example, a private machine learning inference pipeline might run dense layers in CKKS, then switch to TFHE for a ReLU activation (which is a simple comparison against zero), then switch back to CKKS for the next dense layer. The switching cost is non-trivial but often less expensive than approximating non-linear functions with high-degree polynomials in a single scheme.
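The TFHE side of that ReLU step can be sketched at the bit level. On a two's-complement value, ReLU amounts to clearing every bit when the sign bit is set. The example uses plaintext bits, an assumed 8-bit width, and hypothetical gate names, and it does not model the scheme switching itself.

```python
# Plaintext stand-ins for homomorphic gates (hypothetical names, no encryption).
def AND(a, b): return a & b
def NOT(a):    return a ^ 1

WIDTH = 8  # assumed fixed-point width for the activation values

def to_twos_complement(value):
    """Little-endian two's-complement bits; the last bit is the sign bit."""
    return [(value >> i) & 1 for i in range(WIDTH)]

def from_twos_complement(bits):
    unsigned = sum(bit << i for i, bit in enumerate(bits))
    return unsigned - (1 << WIDTH) if bits[-1] else unsigned

def relu_bits(bits):
    """Keep the value when the sign bit is 0, output zero otherwise."""
    keep = NOT(bits[-1])
    return [AND(keep, bit) for bit in bits]

for x in (-100, -1, 0, 1, 100):
    print(x, from_twos_complement(relu_bits(to_twos_complement(x))))
```

The circuit costs one NOT and one AND per bit and is exact for every input, whereas a polynomial approximation of ReLU trades accuracy against multiplicative depth.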

Performance Optimization

To maximize TFHE performance:

Minimize gate count and circuit depth where possible: every gate pays for a bootstrap, so the gate count sets total work, while depth sets the latency floor when gates run in parallel
Use GPU acceleration for parallel gate evaluation; modern GPUs can evaluate thousands of gates concurrently
Consider multi-threading for independent subcircuits that share no data dependencies
Profile to identify bottleneck gates—a single critical path determines total latency
Batch bootstrapping when evaluating the same function on multiple inputs to amortize key-switching overhead
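The difference between gate count and depth is easy to see on a small example. The sketch below builds two hypothetical circuits for an 8-input AND, one as a chain and one as a balanced tree, and estimates latency under an assumed 15 ms per gate (the midpoint of the CPU range quoted earlier) with unlimited parallelism. It is a scheduling model only and evaluates no gates.

```python
GATE_MS = 15.0  # assumed per-gate latency, midpoint of the 10-20 ms CPU range

def chain_circuit(n):
    """n-input AND as a left-to-right chain: gate name -> input names."""
    circuit = {"g1": ["x0", "x1"]}
    for i in range(2, n):
        circuit[f"g{i}"] = [f"g{i - 1}", f"x{i}"]
    return circuit

def tree_circuit(n):
    """n-input AND as a balanced reduction tree."""
    circuit, layer, counter = {}, [f"x{i}" for i in range(n)], 0
    while len(layer) > 1:
        next_layer = []
        for i in range(0, len(layer) - 1, 2):
            counter += 1
            circuit[f"t{counter}"] = [layer[i], layer[i + 1]]
            next_layer.append(f"t{counter}")
        if len(layer) % 2:
            next_layer.append(layer[-1])  # odd element passes through
        layer = next_layer
    return circuit

def critical_depth(circuit):
    """Longest chain of dependent gates; primary inputs have depth 0."""
    memo = {}
    def depth(node):
        if node not in circuit:
            return 0
        if node not in memo:
            memo[node] = 1 + max(depth(i) for i in circuit[node])
        return memo[node]
    return max(depth(gate) for gate in circuit)

for name, circuit in (("chain", chain_circuit(8)), ("tree", tree_circuit(8))):
    d = critical_depth(circuit)
    print(f"{name}: {len(circuit)} gates, depth {d}, parallel latency {d * GATE_MS:.0f} ms")
```

Both circuits use 7 gates, so the total bootstrapping work is identical, but the tree's depth of 3 against the chain's depth of 7 cuts the modeled parallel latency from 105 ms to 45 ms.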

TFHE's gate-by-gate approach offers unique advantages for computations requiring comparisons and conditionals. It is a powerful tool in the FHE toolkit, one that complements batched schemes like BFV and CKKS rather than replacing them. The right choice depends on your workload: if you need branching logic on encrypted bits, TFHE is unmatched; if you need high-throughput linear algebra on encrypted vectors, BFV with SIMD batching delivers the kind of scale behind H33's production authentication throughput cited earlier.
