Fully Homomorphic Encryption for AI Inference

The Model Never Sees Your Data. The Results Are Proven.

Every AI API call processes your data in plaintext. The model sees the patient record. The provider logs the financial document. The inference server stores the privileged communication. Every single time.

H33 wraps AI inference in Fully Homomorphic Encryption. The model computes on ciphertext and produces ciphertext. It is mathematically incapable of seeing what it processes. H33-74 attestation proves correct execution.

Try Encrypted Inference · How It Works
2.29M
Auth/sec on Graviton4
38µs
Per-operation latency
3
Purpose-built FHE engines
74 bytes
Attestation per inference
The Problem

Every AI API call is a data exposure event

You send plaintext to the model. The model processes it. The provider logs it. The inference server caches it. Your data has now been exposed to every layer of the stack. Compliance says "encrypted in transit." That's TLS. The model still sees everything.

📡

API Providers See Everything

When you call GPT-4, Claude, or any hosted model, your prompt arrives in plaintext at the provider's inference server. TLS protects the wire. It does not protect the endpoint. The model, the logging system, and every middleware layer between you and the GPU have full access to your data.

🔍

Self-Hosted Doesn't Fix It

Running models on your own infrastructure moves the problem — it doesn't solve it. The model still processes plaintext. Your inference servers become a target. A breach of the GPU cluster exposes every input ever processed. The data surface is the same.

No Proof It Was Private

Even if you trust the provider, you cannot prove to an auditor, regulator, or court that the model never accessed your data in plaintext. Trust is not evidence. Compliance requires proof. Today, nobody has it.

Three Engines

The right encryption for every AI workload

Different AI models need different arithmetic. Neural networks need floating-point. Decision trees need exact integers. Classifiers need boolean gates. H33 provides a purpose-built FHE engine for each.

CKKS

Neural Network Inference

Approximate arithmetic with SIMD slot packing

CKKS encodes floating-point vectors into polynomial rings with SIMD slots, enabling parallel computation across thousands of values in a single ciphertext. Neural network layers — matrix multiplications, activations, normalization — execute on encrypted data with controlled precision loss.

SIMD slots for parallel float ops
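
To make the slot-packing idea concrete, here is a minimal sketch using the open-source TenSEAL library rather than H33's SDK; the parameters and feature values are illustrative only.

import tenseal as ts

# CKKS context: ring dimension and coefficient modulus sizes (illustrative parameters)
ctx = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

# Pack a whole feature vector into the SIMD slots of a single ciphertext
features = [0.5, 1.2, -0.3, 2.1]
enc_features = ts.ckks_vector(ctx, features)

# One encrypted dot product: a dense-layer neuron evaluated on ciphertext
weights = [0.1, -0.2, 0.4, 0.05]
enc_score = enc_features.dot(weights)

# Only the secret-key holder can decrypt; the result is approximate by design
print(enc_score.decrypt())  # roughly [-0.205]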
BFV

Exact Integer Inference

Exact arithmetic on encrypted integers

BFV encrypts integers and computes on them without approximation. Decision trees, scoring models, and rule-based classifiers evaluate on ciphertext and return bit-perfect results, identical to the plaintext computation.

Exact integer ops, no precision loss
TFHE

Boolean Classification

Gate-level operations at 768 TPS

TFHE evaluates boolean circuits on encrypted bits. Binary classification, pass/fail determinations, flag-or-clear decisions, and bitwise comparisons run at gate level with programmable bootstrapping. Each gate refreshes noise, enabling arbitrary circuit depth.

768 TPS (16-bit equality)
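
For a feel of gate-level FHE in practice, here is a minimal sketch using Zama's open-source Concrete compiler rather than H33's engine; it compiles an encrypted 16-bit equality check like the benchmark above (the inputset and example values are illustrative).

from concrete import fhe

# Compile an encrypted equality check into a TFHE circuit
@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def equal(x, y):
    return x == y

# Example inputs spanning the 16-bit range so the compiler sizes the circuit
inputset = [(0, 0), (65535, 65535), (1234, 4321)]
circuit = equal.compile(inputset)

# Both operands stay encrypted end to end; only the 0/1 verdict is decrypted
assert circuit.encrypt_run_decrypt(4242, 4242) == 1
assert circuit.encrypt_run_decrypt(4242, 4243) == 0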
Flagship Application

Agent-Zero: Encrypted document classification

Agent-Zero classifies documents — contracts, medical records, financial statements, legal filings — without ever seeing the plaintext. The document is FHE-encrypted before it reaches the classification model. The model processes ciphertext, returns an encrypted classification, and the client decrypts locally.

📄

Document In

Client encrypts the document using H33's FHE SDK. The plaintext never leaves the client's boundary. The encrypted representation is a lattice ciphertext indistinguishable from random noise.

🧠

Classification on Ciphertext

Agent-Zero's classification model processes the encrypted document. Feature extraction, embedding computation, and classification scoring all execute on ciphertext. The model is mathematically incapable of seeing the document content.

Proven Result

The encrypted classification result returns to the client for local decryption. An H33-74 attestation proves the computation was correct: input hash committed, model version committed, output hash committed, authority signed with Dilithium.
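
As a rough illustration of what checking such an attestation could involve on the client, the sketch below assumes a hypothetical field layout and a caller-supplied verify_signature helper; it is not H33's published attestation format.

import hashlib

def verify_attestation(att, encrypted_input, encrypted_output,
                       expected_model_hash, authority_key, verify_signature):
    # Recompute the commitments the attestation claims to bind
    # (field names on `att` are illustrative assumptions)
    input_commit = hashlib.sha256(encrypted_input).digest()
    output_commit = hashlib.sha256(encrypted_output).digest()
    signed_message = input_commit + expected_model_hash + output_commit
    # Check the committed hashes and the authority's Dilithium signature over them
    return (att.input_commitment == input_commit
            and att.model_version == expected_model_hash
            and att.output_commitment == output_commit
            and verify_signature(authority_key, signed_message, att.signature))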

How It Works

Five steps. Zero plaintext exposure.

From client encryption to verified result, the plaintext never exists outside the client's boundary.

Step 01
Client Encrypts
Data is FHE-encrypted on the client using H33's SDK. The ciphertext is a lattice polynomial indistinguishable from random noise.
Client-side
Step 02
Server Computes
The AI model processes encrypted ciphertext. Homomorphic operations (add, multiply, rotate) execute the model's computation graph on encrypted data.
943µs / 32-user batch
Step 03
Client Decrypts
The encrypted result returns to the client. Only the client's private key can decrypt. H33, the model, and the infrastructure never see the plaintext result.
Client-side
Step 04
H33-74 Attests
A 74-byte attestation is generated: input commitment, model version, output commitment, and Dilithium signature. Proves correct execution without revealing data.
391µs attestation
Step 05
Cachee Caches
The attestation and encrypted result are cached in Cachee for sub-microsecond replay. Repeated queries skip FHE computation entirely.
0.358µs cached lookup
Integration

Encrypt your inference in three lines

Your existing AI call, wrapped. FHE encrypts inputs before they touch the model.

inference.py — encrypted AI inference
from h33 import EncryptedInference

# Initialize with your preferred FHE engine
engine = EncryptedInference(engine="bfv")  # or "ckks", "tfhe"

# Your data never leaves your boundary in plaintext
encrypted_input = engine.encrypt(patient_record)

# Model computes on ciphertext — never sees plaintext
encrypted_result = engine.infer(
    model="classification-v3",
    input=encrypted_input
)

# Decrypt locally — only your key can read the result
result = engine.decrypt(encrypted_result)

# result.classification  — the AI's output
# result.attestation     — 74-byte H33-74 proof of correct execution
# result.model_version   — committed model hash
# result.verify_url      — h33.ai/verify/<proof_id>

Production performance. Not a research prototype.

2,293,766
Auth/sec (Graviton4)
38µs
Per operation
0.358µs
Cached (Cachee)
Why Not Alternatives

The model still sees your data. Unless it's FHE.

TEEs, differential privacy, and federated learning each address a piece of the problem. None of them prevent the model from processing plaintext.

Traditional AI Inference
Client data (plaintext) → API → Model (sees plaintext) → Result (logged plaintext)
Provider retains data access
  • Model processes plaintext inputs
  • Provider infrastructure has full data access
  • Breach exposes all historical inputs
  • No cryptographic proof of privacy
H33 Encrypted Inference
Client data → FHE encrypt (client-side) → Ciphertext → Model (computes on noise) → Encrypted result → Client decrypt
H33-74 attestation proves it
  • Model processes ciphertext only
  • Breach exposes random noise, not data
  • Auditor verifies without system access
  • 74-byte proof per inference
Alternatives Compared

Why each alternative falls short

Each approach has legitimate uses. None of them solve the core problem: the model processing plaintext data.

Trusted Execution Environments

VERDICT: Side-channel vulnerable

TEEs (Intel SGX, AMD SEV, ARM TrustZone) create hardware enclaves where code runs in isolation. But the data is still plaintext inside the enclave. Spectre, Meltdown, PLATYPUS, and LVI have repeatedly demonstrated that side-channel attacks can extract secrets from enclaves. TEEs protect against software attacks. They do not protect against hardware-level side channels.

Differential Privacy

VERDICT: Accuracy loss, no data separation

Differential privacy adds calibrated noise to outputs to prevent reconstruction of individual inputs. This is a statistical guarantee, not a cryptographic one. The model still processes plaintext data — it just perturbs the output. Accuracy degrades with stronger privacy guarantees. And there is no proof that specific data was never accessed.

Federated Learning

VERDICT: Model still sees local data

Federated learning distributes training across devices without centralizing raw data. But each local model still processes local plaintext data during training. Gradient attacks can reconstruct training inputs. And at inference time, the model processes plaintext regardless — federated learning is a training technique, not an inference protection.

Use Cases

Encrypted inference across industries

Every industry that uses AI on sensitive data needs encrypted inference. Here's where it matters most.

Healthcare

Encrypted Diagnostic AI

Medical imaging analysis, diagnostic classification, and treatment recommendation models run on FHE-encrypted patient records. The AI produces results without accessing PHI. HIPAA compliance is cryptographic, not contractual.

Finance

Encrypted Credit Scoring

Credit models, risk assessments, and fraud detection run on encrypted financial data using BFV exact arithmetic. The model scores applicants without seeing income, debt ratios, or account balances. Results are bit-perfect.

Legal

Encrypted Document Review

Contract analysis, due diligence, and litigation support AI processes encrypted privileged documents. Attorney-client privilege is maintained because the model is cryptographically incapable of reading the documents it classifies.

Defense

Encrypted Intelligence Analysis

Classification models process encrypted intelligence reports. Analysts receive classifications without exposing source material to the AI system. Compartmentalization is enforced by mathematics, not by policy.

Insurance

Encrypted Claims Triage

Claims adjudication AI processes encrypted policyholder data. The model triages claims, flags anomalies, and recommends actions without accessing personal health information or financial details in plaintext.

Sanctions

Encrypted Sanctions Screening

Transaction screening against sanctions lists runs on encrypted transaction data. The model returns match/no-match on ciphertext. Wire transfer details, beneficiary names, and account numbers are never exposed to the screening system.

Explore the H33 AI platform

See encrypted inference in 10 minutes

Connect your model endpoint. H33 wraps it in FHE. The model processes encrypted data and returns proven results. No refactoring required.

Try Encrypted Inference · Schedule Demo