April 13, 2026

Zero-Knowledge Proofs Over a 74-Byte Substrate Tag

Test 4: Six ZK constraints verified over the 58-byte private witness given only the 32-byte public signing message. A Fiat-Shamir challenge binds the public input. This establishes the constraint structure for a future full STARK proof.

Eric Beans CEO, H33.ai

The first three tests in the substrate series established what the primitive is (a 74-byte signed commitment binding a computation result to a three-family post-quantum signature), how it compresses (21 KB of raw signature material down to 74 bytes persistent), and how it chains (receipt-to-receipt linkage forming a tamper-evident sequence). This post describes Test 4, which asks a different question entirely: can a verifier who sees only the 32-byte public signing message on a blockchain learn something meaningful about the substrate that produced it, without the prover revealing the substrate itself?

The answer is yes. The mechanism is a zero-knowledge proof over the substrate's internal structure. The proof establishes six specific constraints about the 58-byte canonical commitment (its binding to the public signing message, its version, its computation type, its content hash binding, its timestamp freshness, and its nonce liveness) without revealing the commitment bytes to the verifier. The verifier learns that the prover knew a valid substrate satisfying those constraints. The verifier does not learn the hidden field values, such as the exact timestamp or the nonce.

This is the fourth post in the substrate proof series. It describes the test, the constraint structure, the Fiat-Shamir transcript, and three practical applications that become possible once you can prove things about a substrate without revealing it.

The test

A Bitcoin UTXO proof-of-reserves computation result was substrated. The word "substrated" here means what it has meant in the previous posts: the computation output was canonically encoded, hashed with SHA3-256, packed into a 58-byte canonical commitment with a version byte, a computation type byte, a 32-byte content hash, an 8-byte millisecond timestamp, and a 16-byte nonce, and then signed under the three-family post-quantum signature bundle (ML-DSA-65, FALCON-512, SLH-DSA-SHA2-128f). The compact receipt — 42 bytes committing to the concatenated signature bundle — was produced. The 32-byte signing message was derived from SHA3-256 over the canonical encoding of the substrate. That signing message is the public artifact. It is what goes on-chain. It is the only thing the verifier sees.
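To make the layout concrete, here is a minimal Python sketch of the packing and hashing steps described above. It is not the substrate crate's code. The big-endian timestamp encoding and the function names `pack_commitment` and `signing_message` are assumptions made for illustration, and the signature bundle and compact receipt are left out.

```python
import hashlib
import os
import struct
import time

VERSION = 0x01
TYPE_BITCOIN_UTXO = 0x06  # type value quoted in this post


def pack_commitment(output: bytes, timestamp_ms: int, nonce: bytes,
                    version: int = VERSION,
                    ctype: int = TYPE_BITCOIN_UTXO) -> bytes:
    """Pack version | type | content hash | timestamp | nonce into 58 bytes."""
    if len(nonce) != 16:
        raise ValueError("nonce must be 16 bytes")
    content_hash = hashlib.sha3_256(output).digest()  # 32 bytes
    # Assumption: the timestamp is an unsigned 64-bit big-endian integer.
    return struct.pack(">BB32sQ16s", version, ctype, content_hash,
                       timestamp_ms, nonce)


def signing_message(commitment: bytes) -> bytes:
    """SHA3-256 over the 58-byte commitment: the 32-byte public artifact."""
    return hashlib.sha3_256(commitment).digest()


commitment = pack_commitment(b"example reserves output",
                             int(time.time() * 1000), os.urandom(16))
message = signing_message(commitment)
print(len(commitment), len(message))  # 58 32
```

The field widths (1 + 1 + 32 + 8 + 16) are the ones given in this post; everything else in the sketch is a stand-in.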

The test then constructed a zero-knowledge constraint system over the 58-byte canonical commitment, treating it as the private witness, with the 32-byte signing message as the sole public input. Six constraints were verified:

(a) Hash binding. SHA3-256 of the 58-byte witness equals the 32-byte public input. This is the fundamental binding constraint. It proves that the prover actually knows a pre-image of the on-chain hash, not just the hash itself. Without this constraint, a prover could claim to know any substrate at all — the hash binding pins the proof to a specific canonical commitment.

(b) Version check. The first byte of the witness equals 0x01. This proves the substrate was minted under the current version of the canonical commitment format. A verifier who trusts version 1 semantics can accept the proof without worrying about legacy format ambiguities.

(c) Type check. The second byte of the witness equals 0x06, which is the computation type for BitcoinUtxo in the substrate's append-only type registry. This proves the substrate attests a Bitcoin UTXO computation specifically, not a biometric match or a fraud score or an AI inference. The computation type is part of the private witness — the verifier learns that the type is BitcoinUtxo from the proof, but does not learn it from the on-chain data.

(d) Content hash match. The 32-byte content hash field (bytes 2 through 33 of the witness) matches the SHA3-256 hash of the claimed proof-of-reserves computation output. This constraint binds the substrate to a specific computation result. A prover who substrated a different computation — say, a fraud score instead of a proof-of-reserves — cannot satisfy this constraint even if they have a valid substrate of the correct type.

(e) Timestamp freshness. The 8-byte timestamp field (bytes 34 through 41 of the witness) represents a millisecond Unix timestamp within the last hour relative to the verification time. This proves the substrate was minted recently. A prover holding a year-old substrate cannot use it to satisfy this constraint. The specific window — one hour — is a parameter of the constraint system and can be adjusted per application.

(f) Nonce liveness. The 16-byte nonce field (bytes 42 through 57 of the witness) is non-zero. This is a liveness constraint. It proves the substrate was produced by a system that generated a fresh random nonce at minting time, not by a degenerate system that zeroed out the nonce field. The nonce value itself remains hidden.
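The six constraints can be sketched as direct evaluations over the witness bytes. The Python below is an illustration under stated assumptions, not the test harness itself: the function name `check_constraints`, the result keys, and the big-endian timestamp encoding are hypothetical. Note that direct evaluation needs the witness in the clear, so this shows what the constraints say, not how a proof hides the witness.

```python
import hashlib
import struct

HOUR_MS = 60 * 60 * 1000


def check_constraints(witness: bytes, public_input: bytes,
                      claimed_output: bytes, now_ms: int,
                      window_ms: int = HOUR_MS) -> dict:
    """Evaluate constraints (a) through (f) directly on a 58-byte witness."""
    if len(witness) != 58:
        raise ValueError("witness must be 58 bytes")
    # Assumption: big-endian unsigned 64-bit millisecond timestamp.
    (timestamp_ms,) = struct.unpack(">Q", witness[34:42])
    return {
        "a_hash_binding": hashlib.sha3_256(witness).digest() == public_input,
        "b_version": witness[0] == 0x01,
        "c_type": witness[1] == 0x06,
        "d_content_hash":
            witness[2:34] == hashlib.sha3_256(claimed_output).digest(),
        "e_freshness": 0 <= now_ms - timestamp_ms <= window_ms,
        "f_nonce_nonzero": any(witness[42:58]),
    }


# Build a sample witness minted five seconds before verification time.
now_ms = 1_776_000_000_000
output = b"example reserves output"
witness = (bytes([0x01, 0x06]) + hashlib.sha3_256(output).digest()
           + struct.pack(">Q", now_ms - 5_000) + bytes(range(1, 17)))
public_input = hashlib.sha3_256(witness).digest()

results = check_constraints(witness, public_input, output, now_ms)
stale = check_constraints(witness, public_input, output, now_ms + 2 * HOUR_MS)
print(all(results.values()), stale["e_freshness"])  # True False
```

Checking the same witness two hours later fails only the freshness constraint, which is the behavior constraint (e) is meant to give.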

A Fiat-Shamir challenge was derived from a transcript binding the 32-byte public input. The transcript initialization absorbs the public input, the constraint identifiers, and a domain separator for the substrate ZK protocol. The resulting challenge is used to construct a random linear combination of the six constraint evaluations, collapsing them into a single check that the verifier evaluates against the committed values. This is the standard Fiat-Shamir heuristic applied to the substrate's constraint system: it converts what would be an interactive proof (verifier sends random challenges, prover responds) into a non-interactive one (prover derives the challenges deterministically from a transcript that includes the public input).
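A rough sketch of the challenge derivation and the random linear combination follows. The domain separator string, the constraint identifier strings, the length-prefixed absorption, and the field modulus 2^61 - 1 are all assumptions chosen for illustration; the actual transcript format is not specified in this post. Each constraint is represented by a residual that is zero when the constraint holds.

```python
import hashlib
from typing import Sequence

P = 2**61 - 1  # assumed prime modulus (a Mersenne prime), illustration only
DOMAIN = b"substrate-zk-sketch-v0"  # hypothetical domain separator
CONSTRAINT_IDS = [b"hash_binding", b"version", b"type",
                  b"content_hash", b"freshness", b"nonce"]  # hypothetical


def derive_challenge(public_input: bytes) -> int:
    """Absorb domain separator, public input and constraint ids, then squeeze."""
    h = hashlib.sha3_256()
    for part in (DOMAIN, public_input, *CONSTRAINT_IDS):
        h.update(len(part).to_bytes(4, "big"))  # length prefix avoids ambiguity
        h.update(part)
    return int.from_bytes(h.digest(), "big") % P


def combine(residuals: Sequence[int], challenge: int) -> int:
    """Random linear combination: sum of challenge^i * residual_i mod P."""
    acc = 0
    for i, r in enumerate(residuals):
        acc = (acc + pow(challenge, i, P) * (r % P)) % P
    return acc


public_input = hashlib.sha3_256(b"demo commitment").digest()
c = derive_challenge(public_input)
print(combine([0, 0, 0, 0, 0, 0], c))  # 0: every constraint satisfied
print(combine([0, 0, 1, 0, 0, 0], c) == pow(c, 2, P))  # True
```

The point of the combination is that a single nonzero residual makes the combined value nonzero unless the challenge happens to be a root of the resulting polynomial, which the prover cannot arrange because the challenge is fixed by the transcript.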

All six constraints passed. The test establishes the constraint structure — the specific algebraic relations over the witness that a future full STARK proof will enforce. The Fiat-Shamir transcript establishes the challenge derivation. Together, they define the ZK-over-substrate protocol at the constraint level, independent of the specific proof system that will eventually arithmetize and prove them.

What the verifier learns

A verifier who receives this proof and the 32-byte public signing message on-chain learns the following composite statement: "The prover knew a valid BitcoinUtxo substrate binding to a specific computation result, minted within the last hour, with a fresh nonce." The verifier learns this with the soundness guarantee of the constraint system — a cheating prover who does not actually know such a substrate cannot produce a convincing proof except with negligible probability.

The verifier does not learn the nonce. The verifier does not learn the timestamp (beyond the fact that it falls within the freshness window). The verifier does not learn the full content hash (beyond the fact that it matches the claimed computation). The verifier does not learn anything about the computation output itself, only that a substrate binding to it exists and satisfies the six constraints.

The property the full proof targets is zero-knowledge in the precise technical sense: the verifier's view of the proof can be simulated without access to the witness. A simulator that knows only the public input and the statement being proved can produce a transcript that is computationally indistinguishable from a real proof transcript. The witness, the 58-byte canonical commitment, is not needed for simulation. This is what distinguishes a ZK proof from a signed commitment. A signed commitment reveals the commitment. A ZK proof over a signed commitment proves the commitment exists without revealing it.

Practical use: private proof-of-reserves

The most immediate application is private proof-of-reserves for cryptocurrency exchanges. The problem is well-known: an exchange that wants to prove it controls N BTC needs to demonstrate control of specific UTXOs, but revealing the UTXO set exposes the exchange's wallet structure, its transaction patterns, and in many cases its cold storage architecture. This is not hypothetical. When exchanges have published proof-of-reserves data, chain analysis firms have immediately mapped the wallet structures, and competitors have used the information for business intelligence. The privacy cost of transparency is real.

The substrate ZK construction resolves this tension. The exchange runs a proof-of-reserves computation internally — summing the values of all UTXOs it controls, verifying control by signing with the corresponding private keys — and substrates the result. The substrate binds the proof-of-reserves computation output (the total balance, the UTXO count, whatever the auditor needs) to a 32-byte on-chain anchor via a three-family post-quantum signature. The ZK proof then proves that the on-chain anchor corresponds to a valid BitcoinUtxo substrate minted within the last hour, binding a computation result that matches the claimed reserves figure.
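As an illustration of how a reserves figure could be bound into the content hash, the sketch below hashes a canonical encoding of a result record. The record fields, the figures, and the use of sorted-key JSON as the canonical encoding are all assumptions for the example; the post does not specify the substrate's actual canonical encoding.

```python
import hashlib
import json


def canonical_encode(result: dict) -> bytes:
    # Assumption: sorted-key, whitespace-free JSON stands in for the
    # substrate's real canonical encoding.
    return json.dumps(result, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


# Made-up figures for illustration only.
reserves = {"total_sats": 1_250_000_000_000, "utxo_count": 4821}
content_hash = hashlib.sha3_256(canonical_encode(reserves)).digest()

# Key order does not change the hash, because the encoding is canonical.
reordered = {"utxo_count": 4821, "total_sats": 1_250_000_000_000}
same = hashlib.sha3_256(canonical_encode(reordered)).digest() == content_hash
print(len(content_hash), same)  # 32 True
```

Only the aggregate figures are hashed here. The UTXO set itself never enters the record, which is what keeps the wallet structure out of the auditor's view.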

The auditor sees the 32-byte on-chain anchor. The auditor receives the ZK proof. The auditor verifies the six constraints. The auditor learns: the exchange produced a valid substrate attesting a Bitcoin UTXO computation that matches the claimed reserves, and the substrate was minted recently with a fresh nonce. The auditor does not learn which UTXOs the exchange controls. The auditor does not learn the exchange's wallet structure. The auditor does not learn anything about the exchange's cold storage architecture. The auditor learns the reserves claim is backed by a valid, recent, post-quantum-signed attestation. That is exactly the information the auditor needs and nothing more.

The post-quantum property matters here because proof-of-reserves is a long-duration commitment. An exchange that publishes a proof-of-reserves today expects it to be verifiable for years. If the proof is signed under ECDSA alone, a future quantum computer that breaks ECDSA can forge a historical proof-of-reserves, creating a falsified audit trail. The substrate's three-family signature structure (ML-DSA-65, FALCON-512, SLH-DSA-SHA2-128f) ensures the proof remains unforgeable even against a quantum adversary that breaks one or two of the three underlying hardness assumptions.

Practical use: anonymous credential verification

The second application is anonymous credential verification, specifically for KYC. The problem: a user who has completed KYC verification wants to prove they have been verified without revealing their identity. This is a fundamental tension in compliance. The regulator wants to know that KYC was performed. The user wants to minimize data exposure. The platform wants to satisfy both without becoming a data broker.

The substrate ZK construction addresses this directly. When a user completes KYC, the verification result is substrated with computation type 0x07 (KycVerification in the substrate type registry). The 32-byte content hash in the substrate commits to the full KYC verification output — the user's identity, the verification method, the result, the timestamp. This content hash is the field that would identify the user if revealed.

The ZK proof proves "I have a valid KycVerification substrate" without revealing the content hash. Specifically, the proof establishes: (a) the prover knows a 58-byte witness whose SHA3-256 hash matches a public signing message, (b) the witness has version 0x01, (c) the computation type byte is 0x07, (d) the content hash field is well-formed and non-zero, (e) the timestamp is within the credential's validity window (which might be 90 days instead of one hour, depending on the application), and (f) the nonce is non-zero. The verifier learns that the prover holds a valid, recent KYC substrate. The verifier does not learn the content hash, which means the verifier cannot determine which specific KYC verification the substrate attests, which means the verifier cannot identify the user.
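One way to picture the difference between the proof-of-reserves and KYC constraint sets is as two parameter profiles over the same six constraints. The `ConstraintProfile` type and its field names below are hypothetical; only the type bytes (0x06, 0x07) and the one-hour and 90-day windows come from this post.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


@dataclass(frozen=True)
class ConstraintProfile:
    """Hypothetical per-application parameters for the six constraints."""
    expected_version: int
    expected_type: int
    freshness_window_ms: int
    content_rule: str  # "matches_claimed_output" or "nonzero"


PROOF_OF_RESERVES = ConstraintProfile(0x01, 0x06, HOUR_MS,
                                      "matches_claimed_output")
KYC = ConstraintProfile(0x01, 0x07, 90 * DAY_MS, "nonzero")


def timestamp_ok(profile: ConstraintProfile, timestamp_ms: int,
                 now_ms: int) -> bool:
    """Constraint (e) under the profile's freshness window."""
    return 0 <= now_ms - timestamp_ms <= profile.freshness_window_ms


now_ms = 1_776_000_000_000
minted = now_ms - 30 * DAY_MS  # a credential minted 30 days ago
print(timestamp_ok(KYC, minted, now_ms),
      timestamp_ok(PROOF_OF_RESERVES, minted, now_ms))  # True False
```

A 30-day-old substrate is still inside the KYC validity window but far outside the one-hour proof-of-reserves window.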

This is stronger than the "I have a credential" proofs in existing self-sovereign identity systems because the substrate's three-family post-quantum signature makes the underlying credential unforgeable even against a quantum adversary. A conventional W3C Verifiable Credential signed under ECDSA can be forged by a quantum computer that breaks the elliptic curve discrete logarithm problem. A substrate-backed credential cannot be forged without simultaneously breaking module lattices, NTRU lattices, and hash pre-image resistance. The ZK layer adds privacy on top of the post-quantum durability.

The composition is clean. The substrate provides the attestation (signed commitment to a computation result). The ZK proof provides the privacy (selective disclosure of properties without revealing the witness). The three-family signature provides the post-quantum durability (unforgeable against any adversary that cannot break all three families simultaneously). These three properties — attestation, privacy, and post-quantum durability — compose without interfering with each other because they operate at different layers of the protocol.

Practical use: supply chain privacy

The third application is supply chain attestation with inspection privacy. The problem: a manufacturer wants to prove that a product passed quality control at a specific point in the supply chain without revealing the inspection details, the inspector's identity, or the specific test results. The downstream buyer wants assurance that inspection occurred. The manufacturer wants to protect its quality control processes as trade secrets.

The substrate construction fits this naturally. At the time of inspection, the quality control system substrates the inspection result. The computation type might be a general-purpose attestation type or a domain-specific supply chain type in the registry. The content hash commits to the full inspection report — the test results, the inspector's credentials, the pass/fail determination, the product serial number, the facility identifier, the calibration records of the testing equipment. All of this is hashed into the 32-byte content hash field of the substrate. The substrate is signed under the three-family post-quantum bundle. The 32-byte signing message is anchored on-chain (or stored in a supply chain ledger, or transmitted to the buyer directly).

The ZK proof then proves: a valid substrate exists, it was minted at the time of inspection (timestamp constraint), it binds to a specific inspection computation (content hash constraint), it was produced by a live system (nonce constraint), and it has the correct version and type. The buyer verifies the proof. The buyer learns that a quality control inspection occurred at the claimed time and produced a result that was substrated. The buyer does not learn the test results. The buyer does not learn the inspector's identity. The buyer does not learn the facility or the equipment calibration records. The substrate proves the attestation occurred. The ZK proof proves the substrate is valid. The combination provides supply chain assurance without supply chain transparency.

This matters in industries where inspection data is genuinely sensitive. Pharmaceutical manufacturers do not want to reveal the exact parameters of their quality control tests because those parameters encode proprietary manufacturing knowledge. Defense contractors do not want to reveal inspection details because those details can reveal capabilities. Food producers do not want to reveal facility-specific test results because a single out-of-range measurement (that still passes) can be taken out of context and used to damage a brand. In all of these cases, the ZK-over-substrate construction provides the assurance that the downstream party needs without the exposure that the upstream party fears.

Why STARK and not SNARK

The substrate uses SHA3-256 as its hash function. The content hash is SHA3-256 of the computation output. The signing message is SHA3-256 of the canonical encoding. The compact receipt's cryptographic commitment is SHA3-256 of the concatenated signature bundle. SHA3-256 is everywhere in the substrate protocol because it offers a comfortable post-quantum security margin: its 256-bit output provides 128-bit preimage security against Grover's algorithm, in line with the roughly 128-bit target of the NIST Level 1 families in the substrate's bundle (FALCON-512 and SLH-DSA-SHA2-128f; ML-DSA-65 targets the higher Level 3).
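The security arithmetic for an n-bit hash can be written down in a few lines. These are the generic textbook bounds for an ideal hash function, not a statement about any specific attack on SHA3-256; the function name is made up for the example.

```python
def generic_security_bits(output_bits: int) -> dict:
    """Generic attack costs, in bits, against an ideal n-bit hash."""
    return {
        "preimage_classical": output_bits,        # brute force: about 2^n
        "preimage_grover": output_bits // 2,      # Grover: about 2^(n/2)
        "collision_classical": output_bits // 2,  # birthday: about 2^(n/2)
    }


print(generic_security_bits(256))
# {'preimage_classical': 256, 'preimage_grover': 128, 'collision_classical': 128}
```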

This choice of hash function has a direct consequence for the choice of proof system. STARK proofs are hash-friendly. The STARK proving system is built on the FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity) protocol, which uses hash functions as its only cryptographic primitive. When the constraint system involves SHA3-256 evaluations, as it does in constraint (a), the hash binding constraint, a STARK prover can arithmetize the SHA3-256 computation directly, encoding the Keccak permutation as algebraic constraints over a finite field, and prove the hash evaluation as part of the same proof that proves the other five constraints. There is no mismatch between the hash function used in the substrate protocol and the hash function used in the proof system. The STARK proof is hash-native.

SNARK proofs are not hash-friendly in this way. The two most widely deployed SNARK systems — Groth16 and PLONK — are pairing-based. They rely on bilinear pairings over elliptic curves, specifically BN254 or BLS12-381 in most production deployments. These pairings provide the succinctness property that makes SNARK proofs small (128 bytes for Groth16, roughly 500 bytes for PLONK), but they introduce two problems for the substrate use case.

First, hash evaluations inside a pairing-based circuit are expensive. SHA3-256 is not an arithmetization-friendly hash: the Keccak permutation is built from bitwise operations (XOR, rotation, AND, NOT) that translate poorly into arithmetic constraints over a large prime field. A SHA3-256 evaluation inside a Groth16 circuit costs at least tens of thousands of R1CS constraints. In a STARK, the same evaluation is expressed as state transitions of the Keccak sponge in the AIR (Algebraic Intermediate Representation), where the round structure repeats across rows of the execution trace. The proving cost difference is substantial.
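To see why bitwise operations are costly in an arithmetic circuit, it helps to write the bit gates as field arithmetic. The sketch below uses an arbitrary prime modulus chosen for illustration; the gate identities are standard, and the final count covers only the AND gates in Keccak's chi step, so it is a lower bound on multiplications rather than a full constraint count.

```python
from itertools import product

P = 2**61 - 1  # assumed prime modulus, illustration only


def is_bit(a: int) -> bool:
    """Booleanity constraint: a * (a - 1) = 0 holds only for 0 and 1."""
    return (a * (a - 1)) % P == 0


def xor_gate(a: int, b: int) -> int:
    return (a + b - 2 * a * b) % P  # one field multiplication


def and_gate(a: int, b: int) -> int:
    return (a * b) % P  # one field multiplication


# The field formulas agree with the bitwise operators on every input pair.
for a, b in product((0, 1), repeat=2):
    assert is_bit(a) and is_bit(b)
    assert xor_gate(a, b) == a ^ b
    assert and_gate(a, b) == a & b

# Keccak-f[1600]: the chi step uses one AND per state bit per round.
STATE_BITS, ROUNDS = 1600, 24
chi_ands = STATE_BITS * ROUNDS
print(chi_ands)  # 38400
```

Every bit of the state has to be carried as its own field element, and every nonlinear bit operation becomes a multiplication, which is where the constraint count comes from.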

Second, and more fundamentally, pairing-based SNARKs are vulnerable to quantum computers. The security of Groth16 rests on knowledge-type assumptions over pairing-friendly elliptic curves, which fall to Shor's algorithm. The security of PLONK, as commonly instantiated with KZG polynomial commitments, rests on discrete-logarithm-type assumptions over the same curves, which also fall to Shor's algorithm. A SNARK proof over a substrate would create a quantum vulnerability in the proof layer even though the substrate itself is post-quantum. This is an architectural contradiction. The entire point of the substrate's three-family signature structure is to provide post-quantum durability. Bolting a quantum-vulnerable proof system on top of a post-quantum attestation primitive would undermine the security argument at the proof layer.

A STARK proof has no such vulnerability. STARK security rests on the collision resistance of the hash function used in FRI, in our case SHA3-256. SHA3-256 is not broken by Shor's algorithm. Its collision resistance is 128 bits classically by the birthday bound, and Grover's algorithm reduces its preimage resistance from 256 bits to 128 bits; both remain at or above the security target. A STARK proof over a substrate is post-quantum end-to-end: the substrate's three-family signatures are post-quantum, the substrate's hash function is post-quantum, and the proof system is post-quantum. There is no weak link.

This is why H33 uses ZK-STARK exclusively and does not use Groth16. It is not a preference. It is a consequence of the substrate's security requirements. Any proof system that relies on elliptic curve pairings introduces a quantum vulnerability that the substrate was specifically designed to avoid. Among widely deployed proof systems, hash-based STARKs avoid this vulnerability entirely.

The constraint structure as a foundation

Test 4 establishes the six constraints and the Fiat-Shamir transcript. It does not yet produce a full STARK proof — that requires arithmetizing the constraints into an AIR, committing to the execution trace, running FRI, and producing the final proof bytes. The test establishes the pre-arithmetization layer: the specific algebraic relations that the STARK will prove, verified here by direct evaluation and a Fiat-Shamir challenge rather than by a full FRI proof.

This layering is deliberate. The constraint structure is the semantics of the ZK-over-substrate protocol. It defines what the verifier learns. The STARK machinery is the mechanism that enforces those semantics with cryptographic soundness. By validating the constraint structure independently of the STARK machinery, we can iterate on the semantics (adding new constraints, adjusting freshness windows, supporting new computation types) without re-deriving the arithmetization each time, and we can validate the arithmetization by checking that it enforces the same constraints that the pre-arithmetization test validates.

The full STARK proof is the next step. When it ships, a verifier will receive a STARK proof (a few hundred kilobytes, reducible via recursive composition) alongside the 32-byte on-chain anchor, and will be able to verify the six constraints — or whatever constraint set the application requires — with the full soundness guarantee of the FRI protocol, post-quantum end-to-end, without learning anything about the 58-byte witness beyond what the constraints reveal.

What this means for the substrate as a primitive

The first post in this series described the substrate as a primitive: a small, well-typed cryptographic object with a fixed interface and a security argument, from which many different constructions can be built. Test 4 demonstrates a specific kind of composability that we claimed but had not yet proved: the substrate composes with zero-knowledge proof systems.

The substrate's fixed layout — 1 + 1 + 32 + 8 + 16 bytes, always in the same order, always with the same field semantics — is what makes this composition natural. The ZK constraint system can address specific byte ranges of the witness by position because the positions are fixed. The version byte is always at offset 0. The type byte is always at offset 1. The content hash is always at offsets 2 through 33. The timestamp is always at offsets 34 through 41. The nonce is always at offsets 42 through 57. A variable-length or self-describing format would make constraint authoring harder and constraint verification more expensive. The substrate's fixed layout is a design choice that pays off in ZK composability.
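Because the offsets are fixed, field access reduces to slicing. The sketch below parses a 58-byte witness using the offsets listed above; the `LAYOUT` table name, the `parse_commitment` helper, and the big-endian timestamp decoding are assumptions for illustration.

```python
import struct

# Half-open (start, end) byte ranges, taken from the offsets in this post.
LAYOUT = {
    "version": (0, 1),
    "type": (1, 2),
    "content_hash": (2, 34),
    "timestamp": (34, 42),
    "nonce": (42, 58),
}


def parse_commitment(witness: bytes) -> dict:
    """Split a 58-byte witness into its five fixed-position fields."""
    if len(witness) != 58:
        raise ValueError("witness must be 58 bytes")
    raw = {name: witness[start:end] for name, (start, end) in LAYOUT.items()}
    return {
        "version": raw["version"][0],
        "type": raw["type"][0],
        "content_hash": raw["content_hash"],
        # Assumption: big-endian unsigned 64-bit millisecond timestamp.
        "timestamp_ms": struct.unpack(">Q", raw["timestamp"])[0],
        "nonce": raw["nonce"],
    }


witness = (bytes([0x01, 0x06]) + bytes(32)
           + (1_700_000_000_000).to_bytes(8, "big") + bytes(range(16)))
fields = parse_commitment(witness)
print(fields["version"], fields["type"], fields["timestamp_ms"])
# 1 6 1700000000000
```

No length prefixes or tags need to be parsed before a field can be located, which is the property that keeps position-based constraints simple.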

The six constraints in Test 4 are the minimal useful set. They prove validity, type, binding, freshness, and liveness. Application-specific constraints can be added on top: range proofs over the timestamp (proving the substrate was minted within a specific calendar quarter without revealing the exact time), set membership proofs over the computation type (proving the substrate attests one of a specific set of computation types without revealing which one), or content hash relations (proving that two substrates bind to the same computation output without revealing the output). Each of these extensions composes with the base six constraints because they all operate over the same fixed-layout witness.
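The three extensions mentioned above can be sketched as additional predicates over the same witness layout. As with the base constraints, these are direct evaluations for illustration, with hypothetical function names and an assumed big-endian timestamp; in a real proof they would be enforced inside the constraint system without exposing the fields.

```python
import struct
from datetime import datetime, timezone


def _ms(dt: datetime) -> int:
    return int(dt.timestamp() * 1000)


def minted_in_quarter(witness: bytes, year: int, quarter: int) -> bool:
    """Range check: timestamp falls inside the given UTC calendar quarter."""
    (ts_ms,) = struct.unpack(">Q", witness[34:42])  # assumed big-endian
    first_month = 3 * (quarter - 1) + 1
    start = datetime(year, first_month, 1, tzinfo=timezone.utc)
    if quarter == 4:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, first_month + 3, 1, tzinfo=timezone.utc)
    return _ms(start) <= ts_ms < _ms(end)


def type_in_set(witness: bytes, allowed: frozenset) -> bool:
    """Set membership over the computation type byte."""
    return witness[1] in allowed


def same_content(w1: bytes, w2: bytes) -> bool:
    """Relation between two substrates: equal content hash fields."""
    return w1[2:34] == w2[2:34]


def make_witness(ctype: int, content_hash: bytes, ts_ms: int) -> bytes:
    return (bytes([0x01, ctype]) + content_hash
            + struct.pack(">Q", ts_ms) + bytes(range(1, 17)))


ts = _ms(datetime(2026, 4, 13, tzinfo=timezone.utc))
w1 = make_witness(0x06, b"\xaa" * 32, ts)
w2 = make_witness(0x07, b"\xaa" * 32, ts + 1000)
print(minted_in_quarter(w1, 2026, 2),
      type_in_set(w1, frozenset({0x06, 0x07})),
      same_content(w1, w2))  # True True True
```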

This is the architectural payoff of building a primitive rather than a protocol construction. A protocol construction solves one problem. A primitive with ZK composability lets anyone build a protocol construction that solves their specific problem, using the substrate as the attestation layer and ZK-STARK as the privacy layer, post-quantum end-to-end, without having to design the cryptographic plumbing themselves.

The next post in the series will cover Test 5: substrate receipt chain verification, where the tamper-evident chain of receipts is verified end-to-end and the implications for long-lived audit trails are explored. Patent pending — H33 substrate Claims 124-125.

Build with the H33 Substrate

The substrate crate is available for integration. Every H33 API call now returns a substrate attestation.
