AI Security · Encryption · 5 min read

Is Encryption Enough to Protect AI Data?

No. Standard encryption (AES, TLS) protects data at rest and in transit. But every AI model requires data to be decrypted before processing. That decryption creates a plaintext window where data exists unprotected in memory, cache, and logs. Encryption stops exactly where AI risk begins—at the point of computation. The solution is encryption that persists during computation: fully homomorphic encryption.

| Data State | Status |
|---|---|
| At Rest | Protected |
| In Transit | Protected |
| In Use | Exposed |
| With FHE | All three protected |

Encryption is the foundation of data security. Every organization with sensitive data encrypts it—AES-256 for storage, TLS 1.3 for network connections, KMS for key management. These are solved problems and they work. Data sitting on a disk or moving across a network is cryptographically protected. An attacker intercepting encrypted data at rest or in transit gets nothing useful.

But AI does not operate on data at rest or in transit. AI operates on data in use—loaded into memory, fed into model weights, transformed through layers of computation. And every encryption scheme deployed in production today requires decryption before that computation can happen. The data must be plaintext for the model to process it. That requirement creates the exposure window that encryption was supposed to prevent.

The Encryption Gap

Think of data protection as a chain with three links:

| Data State | Standard Encryption | Where It Exists | Risk Level |
|---|---|---|---|
| At Rest | AES-256, dm-crypt, LUKS | Disk, database, object storage | Protected |
| In Transit | TLS 1.3, mTLS, WireGuard | Network, API calls, replication | Protected |
| In Use | None (plaintext required) | GPU VRAM, CPU cache, RAM, logs | Exposed |

The third link is broken. Data is unprotected precisely when AI is processing it. This is not a configuration error or a deployment mistake. It is a fundamental limitation of traditional encryption: the mathematical operations that AES and TLS perform are not compatible with the mathematical operations that AI models need. You cannot run a matrix multiplication on AES ciphertext and get a meaningful result.
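To see why conventional ciphertext is incompatible with computation, here is a minimal Python toy. A XOR stream cipher stands in for AES (real AES is far more complex, but shares the relevant property): arithmetic performed directly on the ciphertext does not survive decryption.

```python
# Toy illustration (NOT real AES): a single-byte XOR cipher stands in
# for a conventional symmetric cipher.
KEY = 0x5A

def encrypt(p: int) -> int:
    return p ^ KEY

def decrypt(c: int) -> int:
    return c ^ KEY

p1, p2 = 12, 30
c1, c2 = encrypt(p1), encrypt(p2)

# Naively "compute" on ciphertext: add the two ciphertexts, then decrypt.
naive = decrypt(c1 + c2)

# The result is garbage: ciphertext arithmetic has no plaintext meaning.
assert decrypt(c1) + decrypt(c2) == 42   # decrypt first, then add: correct
assert naive != 42                       # add ciphertexts first: broken
```

The same failure occurs, for the same structural reason, if you try to multiply matrices of AES blocks: the cipher is designed so that ciphertext bits carry no exploitable algebraic relationship to plaintext bits.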

Where Plaintext Leaks in AI Systems

The exposure window during AI computation is not a single point. Plaintext data spreads across multiple locations in the compute stack during every inference request.

GPU VRAM

Model inference loads input data directly into GPU memory. For a biometric matching request, the user's face embedding sits in VRAM as a plaintext vector. For a medical imaging model, the patient's scan exists unencrypted in GPU buffers. GPU memory is not cleared between allocations by default—a subsequent process may read residual data from a previous inference call.
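The uncleared-allocation problem can be sketched in a few lines of Python. This is a toy buffer pool, not a real GPU allocator, but it models the behavior: freed memory is returned to the next caller without being zeroed.

```python
# Toy model of an allocator that hands out buffers without zeroing them,
# analogous to GPU VRAM allocations returning uncleared memory.
# The pool and all names here are illustrative, not a real GPU API.
_pool: list[bytearray] = []

def alloc(size: int) -> bytearray:
    if _pool:
        return _pool.pop()    # reused buffer: previous contents intact
    return bytearray(size)

def free(buf: bytearray) -> None:
    _pool.append(buf)         # no zeroing on free

# Tenant A runs an inference with a plaintext embedding in "VRAM".
buf = alloc(32)
buf[:16] = b"face-embedding-A"
free(buf)

# Tenant B allocates next and can read tenant A's residual data.
buf2 = alloc(32)
assert b"face-embedding-A" in bytes(buf2)
```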

KV Cache

Large language models maintain key-value caches that store intermediate representations of input data. These caches persist for the duration of a session and often longer (for efficiency). The KV cache for a single conversation can contain reconstructable representations of every input the user provided. In multi-tenant GPU environments, KV cache isolation failures have been demonstrated in research settings.
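A minimal sketch of the persistence problem, using a plain dictionary as a stand-in for a real KV cache (all names hypothetical). The point is lifecycle, not fidelity: intermediate state keyed by session outlives the response it was created for.

```python
# Toy session-scoped "KV cache": here it stores raw tokens rather than
# key/value tensors, but the persistence behavior is the same.
kv_cache: dict[str, list[str]] = {}

def infer(session_id: str, tokens: list[str]) -> str:
    # Cache "intermediate representations" for reuse on later turns.
    kv_cache.setdefault(session_id, []).extend(tokens)
    return "ok"

infer("session-1", ["my", "record", "number", "is", "123-45-6789"])

# The response has been returned, but the user's input is still resident
# in the cache for the lifetime of the session, or longer.
assert "123-45-6789" in kv_cache["session-1"]
```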

CPU Cache and System Memory

Data moves between GPU and CPU during preprocessing, postprocessing, and orchestration. Input validation, tokenization, output formatting—all happen on the CPU with plaintext data in RAM. Side-channel attacks (Spectre, Meltdown, and their successors) have demonstrated the ability to read data from CPU caches across process boundaries.

Observability and Logging

Production AI systems are instrumented. Request logs capture input payloads for debugging. Metrics pipelines record latency distributions with sample data. Error logs dump request context when failures occur. Tracing systems propagate request data across microservices. Every observability layer is a potential plaintext leak. A logging library that captures input tensors for debugging purposes has the same practical effect as a data breach—sensitive data stored in plaintext on a logging backend.
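The logging leak is easy to reproduce with Python's standard `logging` module. In this sketch the "logging backend" is just an in-memory stream standing in for a real log aggregation service; the failure and payload are simulated.

```python
import io
import logging

# In-memory stream standing in for a remote logging backend.
log_backend = io.StringIO()
log = logging.getLogger("inference")
log.addHandler(logging.StreamHandler(log_backend))
log.setLevel(logging.DEBUG)

def handle_request(payload: dict) -> str:
    try:
        raise RuntimeError("model backend timeout")   # simulated failure
    except RuntimeError:
        # Common debugging pattern: dump request context on error.
        log.exception("request failed, payload=%r", payload)
        return "error"

handle_request({"patient_id": "P-1042", "scan": "<raw image bytes>"})

# The sensitive payload now sits in plaintext on the logging backend.
assert "P-1042" in log_backend.getvalue()
```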

The Core Problem

Adding more encryption at rest or in transit does not help. Encrypting the database with a stronger cipher, adding another TLS layer, or rotating keys more frequently—none of these address the fundamental gap. The data is decrypted at the compute layer because the compute layer requires plaintext. The solution must operate at the compute layer, not around it.

Why More Encryption Does Not Help

Organizations that recognize the in-use exposure gap often try to solve it with more of the same technology:

- Stronger ciphers or double encryption for data at rest
- Additional TLS or mTLS layers on every network hop
- More frequent key rotation and stricter key management

Each of these measures improves security at the storage or network layer. None of them addresses the compute layer. They apply the right kind of solution to the wrong problem.

FHE: The Missing Layer

Fully homomorphic encryption closes the gap by making decryption unnecessary for computation. FHE encodes data into polynomial rings where addition and multiplication on ciphertext correspond exactly to addition and multiplication on plaintext. The server computes on encrypted data and returns an encrypted result. The plaintext never exists outside the key holder's device.
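The defining property can be illustrated with a toy additively homomorphic scheme: a one-time additive mask modulo M. This is emphatically not real FHE (schemes like BFV work over polynomial rings, support multiplication, and manage ciphertext noise), but it shows the property that matters: addition on ciphertexts decrypts to addition on plaintexts.

```python
import secrets

M = 2**32  # toy modulus; real FHE parameters are lattice-based

def keygen() -> int:
    return secrets.randbelow(M)

def encrypt(p: int, k: int) -> int:
    return (p + k) % M

def decrypt(c: int, k: int) -> int:
    return (c - k) % M

k1, k2 = keygen(), keygen()
c1, c2 = encrypt(120, k1), encrypt(80, k2)

# The "server" adds the ciphertexts without holding any key.
c_sum = (c1 + c2) % M

# The key holder decrypts with the combined key and recovers the true sum.
assert decrypt(c_sum, (k1 + k2) % M) == 200
```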

This is not partial encryption. It is not tokenization with a lookup table. It is mathematically provable: the ciphertext is computationally indistinguishable from random noise to anyone without the decryption key, including the server performing the computation. A complete server compromise—root access, memory dumps, disk images—yields nothing.

With FHE, the encryption chain is complete:

| Data State | With FHE | Risk Level |
|---|---|---|
| At Rest | Encrypted (AES-256 + FHE ciphertext) | Protected |
| In Transit | Encrypted (TLS 1.3 + FHE ciphertext) | Protected |
| In Use | Encrypted (FHE computation on ciphertext) | Protected |

GPU VRAM, KV caches, CPU memory, and logs all contain only ciphertext. There is no plaintext to leak because plaintext never exists on the server.
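The client/server split looks like this end to end. The sketch below reuses a toy additive-masking scheme in place of a real FHE library (all function names are hypothetical): the key never leaves the client, and the server's only view of the data is ciphertext.

```python
import secrets

M = 2**32  # toy modulus standing in for real lattice-based parameters

# --- Client side: holds the key, never sends plaintext ---
key = secrets.randbelow(M)

def client_encrypt(p: int) -> int:
    return (p + key) % M

def client_decrypt(c: int, n_addends: int) -> int:
    # Each homomorphic addend carries one copy of the mask.
    return (c - n_addends * key) % M

# --- Server side: sees only ciphertext, holds no key ---
def server_sum(ciphertexts: list[int]) -> int:
    return sum(ciphertexts) % M

readings = [41, 37, 22]                      # sensitive values, client-only
encrypted = [client_encrypt(r) for r in readings]
result_ct = server_sum(encrypted)            # server computes blind
assert client_decrypt(result_ct, len(readings)) == 100
```

A memory dump of the server process in this model captures only masked integers; the same holds, with real cryptographic guarantees, for an FHE server.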

H33: Production-Speed FHE

The historical objection to FHE has been performance. Early implementations were millions of times slower than plaintext computation. That objection no longer holds. H33's optimized BFV scheme processes encrypted operations at 38.5 microseconds each—fast enough for real-time inference at scale.

The full pipeline: FHE encryption, ZK-STARK proof of correct computation, and Dilithium post-quantum signature attestation. One API call. 2,172,518 operations per second sustained on a single AWS Graviton4 instance. No GPU. Per-operation cost below $0.000001.

Encryption is necessary for AI data protection. It is not sufficient. FHE makes it sufficient—by extending cryptographic protection to the one place traditional encryption cannot reach: the computation itself.

Key Takeaway

Standard encryption protects data everywhere except where AI processes it. FHE protects data everywhere including where AI processes it. That distinction is the difference between encrypting the container and encrypting the contents.

Close the Encryption Gap

Protect data at rest, in transit, and in use. FHE + ZK-STARK + Dilithium in a single API call.

Explore Encrypted Compute → · Read the Docs · Live Demo
Free tier · 1,000 operations/month · No credit card required