The Problem: Plaintext Fields, Paper-Trail Compliance
A typical document validation pipeline works like this: a document is uploaded, OCR or AI extracts the fields, an operator reviews and approves each value, and the system records "who approved what, when" in a database row. The compliance team calls it an audit trail. From a cryptographic standpoint, it proves nothing.
There is no binding between the operator's identity and their decision. An admin with database access can rewrite any validation record retroactively. The extracted SSN sits in a VARCHAR column accessible to every service with a connection string. The "audit log" is a table that anyone with write access can falsify. When regulators or forensic auditors arrive, they are trusting the honesty of the system, not the mathematics.
In mortgage processing alone, the FBI estimates document fraud accounts for over $1 billion in annual losses. Insurance claim fraud adds another $80 billion. The tools meant to prevent this — checkbox compliance, role-based access, audit logs — have not changed meaningfully in two decades. H33-Vault is a fundamentally different architecture.
If a validation decision is not cryptographically bound to the operator who made it, and the field value is not encrypted while being validated, then your compliance posture is a policy document — not a mathematical guarantee.
Architecture: Three Layers, One API
H33-Vault is a joint product of two systems: Cachee.ai (the speed layer) and H33.ai (the trust layer). Every API call passes through both. Speed without trust is a fast database. Trust without speed is unusable in production. H33-Vault delivers both.
Sensitivity Tiers: Not All Fields Are Equal
H33-Vault classifies every extracted field into one of four sensitivity tiers. Each tier determines the encryption, authentication, and proof requirements.
| Tier | Examples | FHE Encrypted | Step-Up Auth | Validation Proof | Signed Audit |
|---|---|---|---|---|---|
| Critical | SSN, bank account, tax ID, biometric | Yes | Yes (15-min fresh) | Yes | Yes |
| High | DOB, passport, medical record, insurance policy | Yes | No | Yes | Yes |
| Medium | Full name, address, phone, email | No | No | No | Yes |
| Standard | Property type, loan term, employer | No | No | No | No |
Classification happens automatically at extraction using pattern matching on field keys and values. An SSN (###-##-####) detected in any field value — even one misclassified as "notes" — is immediately elevated to Critical. There is no configuration to turn this off.
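A minimal sketch of value-first classification, assuming a simple regex for the SSN pattern and a hypothetical table of key hints (the names `KEY_HINTS` and `classify_field` are illustrative, not part of H33-Vault's published API):

```python
import re

# SSN shape from the text above: ###-##-####
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical key hints per tier, loosely based on the examples in the tier table.
KEY_HINTS = {
    "critical": ("ssn", "bank_account", "tax_id", "biometric"),
    "high": ("dob", "passport", "medical_record", "insurance_policy"),
    "medium": ("name", "address", "phone", "email"),
}


def classify_field(key: str, value: str) -> str:
    """Return the sensitivity tier for one extracted field."""
    # Value patterns win over key names: an SSN anywhere forces Critical,
    # even if the field key is something harmless like "notes".
    if SSN_PATTERN.search(value):
        return "critical"
    lowered = key.lower()
    for tier in ("critical", "high", "medium"):
        if any(hint in lowered for hint in KEY_HINTS[tier]):
            return tier
    return "standard"
```

Checking the value before the key is what makes the elevation rule unconditional: a mislabeled field cannot lower the tier of the data it contains.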
The Validation Proof: SHA3-256 Commitment Chains
When an operator confirms, corrects, or flags a field, H33-Vault generates a cryptographic proof that binds four things together: the operator's identity, the document, the specific field, and the decision.
// 1. Operator Commitment
SHA3-256(operator_secret) // commits to the operator's identity without revealing the secret
// 2. Validation Commitment
SHA3-256(SHA3-256(field_value) || decision_byte) // binds decision to field value
// 3. Fiat-Shamir Challenge
SHA3-256(op_commit || doc_hash || field_hash || val_commit || timestamp)
// 4. Response
SHA3-256(operator_secret || challenge) // proves knowledge of secret
The Fiat-Shamir transform makes this non-interactive: the proof is generated and verified without any back-and-forth between prover and verifier. The challenge is deterministically derived from all public inputs, so any attempt to tamper with any component — the operator commitment, the document hash, the field hash, the decision — produces a mismatched challenge and the proof fails verification.
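Traditional zero-knowledge proof systems such as Groth16 and PLONK are typically instantiated over pairing-friendly elliptic curves like BN254. They require a trusted setup (per-circuit for Groth16, universal for PLONK), rely on elliptic-curve assumptions that a large quantum computer would break, and take 200–500ms to generate a proof. H33-Vault's SHA3-256 commitment scheme generates proofs in under 5 microseconds, requires no trusted setup, and rests only on the security of SHA3-256, which retains roughly 128-bit security against known quantum attacks. The scheme is not a general-purpose zero-knowledge proof, and full verification requires the operator's secret, but for the validation use case it is the better fit.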
For audit scenarios, the full proof can be verified with the operator's secret key, proving that the specific operator who claims to have validated a field actually did so. Without the secret, structural verification still confirms the challenge derivation and integrity of all commitments.
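The four steps and both verification modes can be sketched with Python's standard `hashlib`. This is an illustration under stated assumptions, not H33-Vault's implementation: the dictionary layout, the one-byte decision code, and the 8-byte big-endian timestamp encoding are all hypothetical choices.

```python
import hashlib
import hmac
import struct


def sha3(*parts: bytes) -> bytes:
    """SHA3-256 over the concatenation of the given byte strings."""
    return hashlib.sha3_256(b"".join(parts)).digest()


def make_proof(operator_secret: bytes, doc_hash: bytes, field_value: bytes,
               decision: int, unix_time: int) -> dict:
    """Build the four-part proof (dict layout is an assumption)."""
    decision_byte = bytes([decision])          # hypothetical code, e.g. 1 = confirm
    timestamp = struct.pack(">Q", unix_time)   # assumed 8-byte big-endian encoding
    op_commit = sha3(operator_secret)                       # 1. operator commitment
    field_hash = sha3(field_value)
    val_commit = sha3(field_hash, decision_byte)            # 2. validation commitment
    challenge = sha3(op_commit, doc_hash, field_hash,       # 3. Fiat-Shamir challenge
                     val_commit, timestamp)
    response = sha3(operator_secret, challenge)             # 4. response
    return {"op_commit": op_commit, "doc_hash": doc_hash, "field_hash": field_hash,
            "val_commit": val_commit, "timestamp": timestamp,
            "challenge": challenge, "response": response}


def verify_structure(proof: dict) -> bool:
    """Secret-free check: the challenge must match all public inputs."""
    expected = sha3(proof["op_commit"], proof["doc_hash"], proof["field_hash"],
                    proof["val_commit"], proof["timestamp"])
    return hmac.compare_digest(expected, proof["challenge"])


def verify_full(proof: dict, operator_secret: bytes) -> bool:
    """Audit check: also confirms the commitment and response use this secret."""
    return (verify_structure(proof)
            and hmac.compare_digest(sha3(operator_secret), proof["op_commit"])
            and hmac.compare_digest(sha3(operator_secret, proof["challenge"]),
                                    proof["response"]))
```

Because every hashed input except the secret has a fixed length in this sketch, plain concatenation is unambiguous; a production encoding would likely add explicit length prefixes or domain separation.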
FHE Velocity Counters: Encrypted Behavioral Monitoring
Rate limiting is a critical fraud defense. If an operator is viewing 200 SSNs per hour, something is wrong. But traditional rate limiting stores the count in plaintext — which means anyone with database access can see (or reset) the count.
H33-Vault uses BFV Fully Homomorphic Encryption to maintain velocity counters that are never decrypted during normal operation. The system performs homomorphic addition on encrypted ciphertexts: incrementing a counter from 7 to 8 happens entirely in the encrypted domain. The number "8" is never computed, never stored, never logged in plaintext.
When a threshold check is needed (e.g., "has this operator exceeded 50 Critical field views this hour?"), the system decrypts only the comparison result — not the actual count. Administrative decryption of the raw count requires explicit authorization and produces its own audit entry.
Counter types tracked per operator:
- Validations per hour — total field confirmations/corrections
- Edits per hour — corrected values (high edit rate = flag for review)
- Critical field views per hour — SSN, bank account, tax ID access
- Rejections per hour — flagged values (high rejection rate = training issue or adversarial behavior)
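To illustrate what "incrementing in the encrypted domain" means, the sketch below swaps in textbook Paillier encryption, which is additively homomorphic and fits in a few lines of standard-library Python. It is not BFV, it is not lattice-based or post-quantum, and the primes are far too small to be secure; it also does not cover the encrypted threshold comparison. All names here are illustrative.

```python
import math
import secrets

# Toy parameters: insecure, chosen only so the arithmetic is easy to follow.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)  # decryption constant for generator g = N + 1


def encrypt(m: int) -> int:
    """Encrypt a small non-negative integer under the toy public key."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def add_encrypted(c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N_SQ


def decrypt(c: int) -> int:
    """Administrative decryption; requires the secret LAMBDA and MU."""
    return ((pow(c, LAMBDA, N_SQ) - 1) // N * MU) % N


# The counter service only ever handles ciphertexts.
counter = encrypt(7)
counter = add_encrypted(counter, encrypt(1))  # 7 -> 8 without decrypting
```

The party performing `add_encrypted` needs only the public modulus, so the service that maintains the counter never sees the count; only a holder of the secret key can call `decrypt`.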
Document Finalization: Merkle Root + Dilithium
Once all fields in a document are validated, the operator triggers finalization. This is the irreversible, cryptographically sealed commitment that the document review is complete.
The finalization pipeline executes three steps in sequence:
- Merkle Root Computation — SHA3-256 hashes of all field ciphertext hashes (or raw values for non-encrypted fields) are assembled into a binary Merkle tree. The root hash represents the integrity of the entire document's validated state. Changing any single field value, even by one byte, produces a completely different root.
- Finalization Proof — A SHA3-256 commitment proof (identical structure to validation proofs) binds the operator, the document, and the Merkle root into a single non-interactive proof.
- Dilithium Signature — The entire finalization payload (document ID, Merkle root, proof, timestamp) is signed with CRYSTALS-Dilithium (ML-DSA-65), the NIST FIPS 204 standardized post-quantum signature algorithm. The keypair is generated per-session and bound to the operator's biometric authentication.
The result is a DocumentFinalizationRecord that includes the Merkle root, the proof, the Dilithium signature, the public key, operator ID, biometric session hash, field count, and timestamp. This record is independently verifiable by any party with the public key — no access to H33 systems required. Optional Solana on-chain attestation provides an immutable public timestamp.
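The Merkle root step can be sketched as follows. The leaf encoding and the handling of an odd number of nodes (duplicating the last node) are assumptions, since the text does not specify them; the Dilithium signature is omitted because it needs a post-quantum library outside the standard library.

```python
import hashlib
from typing import Sequence


def sha3(data: bytes) -> bytes:
    """SHA3-256 digest of the given bytes."""
    return hashlib.sha3_256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Compute a binary Merkle root over field ciphertext hashes or raw values."""
    if not leaves:
        raise ValueError("at least one field is required")
    level = [sha3(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        # Hash each adjacent pair to form the next level up.
        level = [sha3(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical field values for one document.
fields = [b"ciphertext-hash-of-ssn", b"ciphertext-hash-of-dob", b"Jane Doe"]
root = merkle_root(fields)
```

Any change to a single leaf propagates up every level, which is why one altered byte in one field yields a different root.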
Biometric Step-Up Authentication
Critical fields (SSN, bank account, tax ID) require a biometric step-up before the decrypted value is displayed. Standard session authentication is not sufficient. The operator must re-verify their biometric within a 15-minute freshness window.
When step-up authentication succeeds, the decrypted value is displayed in a "vault mode" overlay that auto-hides after 30 seconds. The value is never written to the DOM's persistent state, never cached in the browser, and the display component unmounts itself when the timer expires. Every vault access is tracked by the FHE velocity counter.
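The freshness rule reduces to a timestamp comparison. A minimal sketch, assuming the server stores the time of the operator's last successful biometric check (function and constant names are hypothetical):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STEP_UP_WINDOW = timedelta(minutes=15)   # freshness window from the text
VAULT_DISPLAY_SECONDS = 30               # auto-hide timer from the text


def step_up_is_fresh(last_verified: datetime, now: datetime) -> bool:
    """True if the biometric check happened within the last 15 minutes."""
    age = now - last_verified
    return timedelta(0) <= age <= STEP_UP_WINDOW


def step_up_satisfied(tier: str, last_verified: Optional[datetime],
                      now: datetime) -> bool:
    """Only Critical fields demand a fresh step-up before display."""
    if tier != "critical":
        return True
    return last_verified is not None and step_up_is_fresh(last_verified, now)
```

Rejecting negative ages guards against a verification timestamp that lies in the future, for example from clock skew or a forged value.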
Performance: Cachee Makes It Real-Time
Cryptographic proofs and FHE encryption are meaningless if operators cannot do their jobs. A mortgage processor reviewing 50 documents per day with 30–100 fields per document needs sub-second field loads. This is where Cachee changes the game.
H33-Vault Performance Benchmarks
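For a mortgage processor doing 50 documents a day with 60 fields each, that is 3,000 field loads per day, assuming each field is loaded once. Replacing 200ms database queries with Cachee's sub-15ms retrieval saves about 185ms per load, or roughly 9 minutes of dead time per operator per day; at 100 fields per document, the saving exceeds 15 minutes. Across a 500-person operations floor, 15 minutes per operator is 125 hours per day returned to productive work.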
Target Verticals
What Competitors Do Not Have
The document validation market is crowded with tools that do OCR and field extraction. None of them have a cryptographic trust layer. The comparison is not between features — it is between architectures.
| Capability | H33-Vault | Competitors |
|---|---|---|
| Field encryption at rest | BFV FHE (compute on ciphertext) | AES-256 or plaintext |
| Validation proofs | SHA3-256 + Fiat-Shamir (instant) | Audit log rows |
| Behavioral monitoring | FHE-encrypted counters | Plaintext counters |
| Document signing | Dilithium ML-DSA-65 (FIPS 204, PQ-safe) | RSA / ECDSA |
| Field retrieval | Cachee L1+L2 (<15ms) | SQL queries (50–200ms) |
| Critical field access | Biometric step-up + vault display | RBAC + password |
| Document integrity | SHA3-256 Merkle tree | Checksums or none |
| On-chain attestation | Solana (optional) | None |
The API
H33-Vault exposes 8 endpoints under /api/v1/contracts, covering the full lifecycle from session creation to document finalization.
| Method | Path | Purpose |
|---|---|---|
| POST | /session/start | Biometric auth + Dilithium keypair |
| POST | /session/step-up | Re-verify biometric (15-min window) |
| DELETE | /session/{id} | Cleanup session + keys + cache |
| POST | /documents | Multipart upload + AES-256-GCM encrypt |
| GET | /documents/{id}/fields | L1 -> Cachee -> DB cascade |
| GET | /documents/{doc_id}/fields/{fid}/secure-value | FHE decrypt + vault display |
| POST | /documents/{id}/validations | Record + proof + velocity counter |
| POST | /documents/{id}/finalize | Merkle root + proof + Dilithium sign |
Session management, document upload, field retrieval, secure value access, validation submission, and finalization — all behind a single API surface with consistent session tokens and biometric proof headers. The entire system runs on Axum (Rust), with zero Node.js or Python in the hot path.
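The L1 -> Cachee -> DB cascade behind the field retrieval endpoint is a read-through cache. The sketch below uses plain dictionaries as stand-ins for both cache tiers and a callable for the database; the class and its return shape are hypothetical and say nothing about how Cachee is actually implemented.

```python
from typing import Callable, Dict, Optional, Tuple


class FieldCascade:
    """Read-through lookup: in-process L1, then Cachee, then the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]]):
        self.l1: Dict[str, dict] = {}       # in-process cache
        self.cachee: Dict[str, dict] = {}   # stand-in for the Cachee tier
        self.load_from_db = load_from_db

    def get(self, doc_id: str) -> Tuple[Optional[dict], str]:
        """Return the fields plus the tier that served them."""
        if doc_id in self.l1:
            return self.l1[doc_id], "l1"
        if doc_id in self.cachee:
            self.l1[doc_id] = self.cachee[doc_id]   # promote to L1
            return self.l1[doc_id], "cachee"
        fields = self.load_from_db(doc_id)
        if fields is not None:
            # Populate both tiers so the next read avoids the database.
            self.cachee[doc_id] = fields
            self.l1[doc_id] = fields
        return fields, "db"
```

Only the first read of a document pays the database latency; later reads are served from whichever tier still holds the entry.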
Built Different
H33-Vault is not a feature bolted onto an existing document processing tool. It is a ground-up cryptographic architecture where every sensitive operation is mathematically verifiable, every sensitive field is encrypted with lattice-based FHE, which is believed to resist attacks by quantum computers, and every finalized document carries a post-quantum signature designed to remain verifiable for decades.
The speed layer (Cachee) ensures that all of this cryptography is invisible to the operator. Fields load in under 15 milliseconds. Proofs generate in under 5 microseconds. The Dilithium signature at finalization takes 191 microseconds. The operator clicks "Confirm" and the entire cryptographic pipeline — commitment, proof, verification, velocity counter increment — completes before their finger lifts off the mouse button.
That is the product. Speed that operators love. Trust that auditors can verify. Cryptography that quantum computers cannot break.