Every computation type deserves a byte
This post is about a specific piece of protocol design that I think is underrated: the single-byte append-only registry for computation type domain separators. The substrate uses one of these, and the decision to include it in the substrate's wire format turned out to be one of the most consequential design choices we made. This post is about why.
The short version: a one-byte field in a cryptographic wire format that identifies which kind of computation the attestation is about, governed by an append-only public registry, creates a stable global namespace for attestable computation domains, and the stable namespace is what turns a specific cryptographic construction into a protocol. Without the namespace, the construction is a one-off tool. With the namespace, the construction is a vocabulary that many systems can share.
The longer version is below, and it covers: why a one-byte field rather than a variable-length field, why append-only rather than mutable, why public rather than private, what it costs to be disciplined about the registry, what the historical precedents are, and why this specific design pattern has been undervalued in recent cryptographic protocol work.
Why a byte
The substrate's computation type field is exactly one byte. 256 possible values. Most of the values are currently unused. The choice to use one byte rather than two, or four, or a variable-length field, is deliberate, and the reasoning is worth walking through because it is a specific kind of engineering judgment that gets made frequently in protocol design.
One byte is enough. The substrate's target is a few tens of computation types, maybe reaching a hundred over the next several years. There are not actually thousands of distinct attestation domains that need independent namespaces. Biometric authentication, fraud scoring, financial transaction attestation, AI inference provenance, media capture authenticity, legal evidence chains, document archival, regulatory filing, identity verification, compliance audit events — these are the kinds of domains we expect to allocate computation types for, and they are dozens of items, not thousands. A one-byte field accommodates 256 distinct values, which is more than enough headroom for the foreseeable addition rate.
Two bytes would be overkill. A two-byte field gives 65,536 possible values, roughly three orders of magnitude more than the few tens to a hundred types we expect to need. The extra byte per substrate adds up at scale: a billion substrates per year is a billion extra bytes (about 1 GB) of on-chain or off-chain persistent state per year, spent on the ability to identify computation types that do not currently exist and will likely never exist. The cost-benefit analysis comes out in favor of the smaller field.
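The storage arithmetic is easy to check. The following is a small sketch that takes the one-billion-per-year volume from the paragraph above and treats everything else as plain unit conversion; the function and variable names are illustrative only.

```python
# Back-of-envelope cost of widening the computation type field.
# The volume (one billion substrates per year) comes from the text;
# the names below are illustrative, not part of any real API.
SUBSTRATES_PER_YEAR = 1_000_000_000


def extra_bytes_per_year(field_width_bytes: int, baseline_width_bytes: int = 1) -> int:
    """Extra persistent bytes per year compared with the one-byte baseline."""
    return (field_width_bytes - baseline_width_bytes) * SUBSTRATES_PER_YEAR


two_byte_overhead = extra_bytes_per_year(2)   # one extra byte per substrate
four_byte_overhead = extra_bytes_per_year(4)  # three extra bytes per substrate
```

At this volume a two-byte field costs about 1 GB of additional state per year and a four-byte field about 3 GB, all of it spent on values that are never used.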
Variable-length encoding would complicate the wire format. A variable-length field (say, a length-prefixed byte sequence or a variable-length integer) would allow for arbitrarily many computation types, at the cost of making the wire format slightly more complex to parse and analyze. For a cryptographic primitive where the wire format is read and written many times per second on every deployment, the parsing simplicity of a fixed-width field is worth more than the headroom of a variable-length field. Every additional complexity in the wire format is a potential place where implementations disagree and security bugs hide.
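To make the parsing argument concrete, here is a minimal sketch that contrasts a fixed-width read with a LEB128-style variable-length read. The position of the type byte inside the 58-byte substrate is not specified in this post, so the offset below is an assumption, and the varint decoder is a generic illustration rather than anything the substrate uses.

```python
from typing import Tuple

SUBSTRATE_LEN = 58
TYPE_OFFSET = 0  # assumption: the real offset is not given in this post


def read_type_fixed(substrate: bytes) -> int:
    """Fixed-width read: one length check, one index."""
    if len(substrate) != SUBSTRATE_LEN:
        raise ValueError("substrate must be exactly 58 bytes")
    return substrate[TYPE_OFFSET]


def read_type_varint(buf: bytes, max_bytes: int = 5) -> Tuple[int, int]:
    """LEB128-style varint read, shown only for contrast.

    Returns (value, bytes_consumed). It has failure modes the fixed-width
    read does not: truncated input, over-long input, and non-canonical
    encodings such as 0x86 0x00, which decodes to the same value as 0x06.
    """
    value = 0
    for i, b in enumerate(buf[:max_bytes]):
        value |= (b & 0x7F) << (7 * i)
        if b & 0x80 == 0:  # high bit clear marks the last byte
            return value, i + 1
    raise ValueError("truncated or over-long varint")


example_substrate = bytes([0x06]) + bytes(SUBSTRATE_LEN - 1)
fixed_value = read_type_fixed(example_substrate)
canonical = read_type_varint(bytes([0x06]))
non_canonical = read_type_varint(bytes([0x86, 0x00]))
```

The two varint inputs decode to the same value from different byte strings, which is exactly the kind of disagreement between implementations that a fixed-width field rules out by construction.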
One byte is readable by humans. A single hexadecimal pair (e.g., 0x06) is easy to write down, easy to remember, and easy to discuss. A two-byte value (e.g., 0x0006) is slightly harder, and a variable-length value is hard to discuss at all. When we are documenting the substrate registry, explaining which values correspond to which domains, and having conversations about the protocol, the compactness of the single-byte representation is an ergonomics win that compounds across every conversation.
One byte is the right size for cache alignment. The substrate's overall wire format is 58 bytes, chosen to fit into a cache line along with metadata. Adding a second byte to the computation type field would shift other fields, and the shift would have consequences for cache behavior that are not obvious but are real at high concurrency. Keeping the field at one byte keeps the overall layout clean.
None of these reasons individually is decisive. The cumulative case for one byte is that each of them pulls in the same direction, and the combined pull is enough to make one byte the clear answer. The process of landing on a specific wire-format width is often like this — many small considerations that each have some weight, rather than a single overriding argument.
Why append-only
The registry that governs which computation type bytes have been assigned is append-only. Once a byte is assigned a meaning, the meaning is frozen — the byte cannot be reassigned, the computation type cannot be renamed, and the semantics of substrates minted under that type cannot change retroactively.
The reason for append-only is the same reason network port numbers are append-only, language version numbers are append-only, and file format magic numbers are append-only: substrates minted in the past must remain valid in the future, and validity depends on the meaning of the computation type byte remaining stable over time.
Here is the specific scenario that append-only prevents. Imagine a substrate minted in 2026 with computation type 0x06 BitcoinUtxo. The substrate's semantics depend on what 0x06 meant at the time of minting — specifically, that the substrate attests a Bitcoin UTXO operation. A verifier in 2036, checking the substrate, needs to understand 0x06 in the same way the minter did. If the registry allowed 0x06 to be reassigned (say, to FedNowPayment in 2030), the verifier in 2036 would see a substrate that apparently attests a FedNow payment but was originally produced as a Bitcoin UTXO attestation. The semantic mismatch breaks the attestation's value.
Append-only prevents this by guaranteeing that 0x06 BitcoinUtxo means the same thing forever. A verifier in 2036, or 2046, or 2056, can look up 0x06 in the registry and find BitcoinUtxo, the same meaning it had in 2026. The substrate's semantics are stable across the registry's history.
This is a strong commitment. It means we cannot clean up the registry later if we decide some computation types were badly named or badly scoped. If 0x01 was assigned to "BiometricAuth" in 2025 and we later decide the name is too generic, we cannot rename it — we have to live with the original name forever. New, more specific biometric authentication types can be added under different bytes (say, 0x20 FacialBiometric, 0x21 FingerprintBiometric), but 0x01 BiometricAuth stays as it was. The registry becomes a historical record of decisions, not a malleable list we maintain.
The cost of append-only is the loss of flexibility to revise mistakes. The benefit is that downstream systems can rely on registry values being stable across time, and the reliance is what enables long-term substrate verification. Without the stability, the substrate's persistence property is much weaker: "your substrates will remain verifiable, but the meaning of the computation type byte might change, so you cannot actually rely on what the substrates say." With the stability, the persistence is complete.
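The append-only rule can be sketched as a data structure that accepts new assignments and refuses everything else. This is a minimal in-memory illustration: the class and method names are hypothetical, and the actual registry is a published list with a review process, not this code.

```python
from typing import Dict


class AppendOnlyRegistry:
    """Hypothetical in-memory model of an append-only byte registry."""

    def __init__(self) -> None:
        self._names: Dict[int, str] = {}

    def assign(self, value: int, name: str) -> None:
        if not 0 <= value <= 0xFF:
            raise ValueError("computation type must fit in one byte")
        if value in self._names:
            # No reassignment and no renaming, even to fix a poor name.
            raise ValueError(
                f"0x{value:02X} is already assigned to {self._names[value]}"
            )
        self._names[value] = name

    def lookup(self, value: int) -> str:
        return self._names[value]


registry = AppendOnlyRegistry()
registry.assign(0x06, "BitcoinUtxo")

try:
    registry.assign(0x06, "FedNowPayment")  # the reassignment scenario above
    reassignment_rejected = False
except ValueError:
    reassignment_rejected = True

registry.assign(0x03, "FedNowPayment")  # a new meaning gets its own byte
```

There is deliberately no `rename` or `remove` method. The only way to express a new or refined meaning is to spend another byte.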
This is the same trade-off that IANA makes for IP protocol numbers, and for the same reasons. IANA's protocol number registry has hundreds of entries, some of them for protocols that have been obsolete for decades (like 0x19 LEAF-1, a 1980s link-layer protocol that no one uses anymore). IANA does not remove these entries because an obsolete entry is still a useful historical record, and removing it would break any documentation or historical data that referenced the obsolete protocol. The registry is a cumulative list of everything that has ever been assigned, not a filtered list of what is currently relevant. The substrate's registry works the same way and for the same reason.
Why public
The registry is public. Anyone can view the current list of assigned values, view the proposed additions, view the rationale for each assignment, and audit the registry's administrative process. There is no private or undocumented portion of the registry, and there are no "enterprise-only" values that are not visible to everyone.
The reason for public is that the registry is a shared namespace, and shared namespaces only work if everyone agrees on what the values mean. A registry that was partially private — where some values were visible only to specific customers — would create ambiguity about what the values mean for other participants. A substrate with a private computation type would be interpretable only by the private customer, which defeats the purpose of the registry as a stable global namespace.
This is the same reason DNS root zones are public. ICANN publishes the list of TLDs, the list of root name servers, and the DNSSEC keys for the root zone. Anyone can verify that ".com" points to the correct registry, or that ".uk" is delegated to Nominet. A DNS root zone that was partially private would fragment the DNS into incompatible subregions — a domain name that resolved in one region might not resolve in another, or might resolve to different values, and the value of DNS as a global naming system would collapse.
The substrate's registry is smaller and lower-stakes than DNS, but the principle is the same. Publicity is what turns a list into a namespace. Private registries become per-customer dialects that do not interoperate; public registries become shared vocabularies that do.
The administrative process for adding values to the registry is also public. A new computation type is proposed with a rationale, the proposal is reviewed (currently by us, eventually by a broader community as the substrate ecosystem matures), and if accepted the value is assigned and the rationale is published alongside it. This public process is what lets the community check that the registry is being administered fairly and that new values are being added for good reasons. If we started assigning values arbitrarily, or for reasons that did not make sense, the public process would expose the problem and the community could push back.
Public review is not decorative; it is the mechanism that keeps the registry honest. A registry whose administration was opaque would accumulate decisions that do not stand up to scrutiny, and the accumulation would eventually undermine the registry's credibility. A public registry invites scrutiny on every decision, which creates ongoing pressure to make defensible decisions. The pressure is what makes the registry a credible shared namespace rather than a private list we keep.
The discipline the registry demands
Running an append-only public registry is more disciplined than it might appear. Every decision to add a new value is a permanent decision, and the cumulative quality of the registry depends on the quality of each individual decision. Careless assignments accumulate as historical debt that cannot be cleaned up, and the debt compounds as new values are added on top of the old ones.
Here is the specific discipline we try to maintain:
New values have a clear, specific meaning. A computation type like "Generic" or "Other" is tempting because it provides a bucket for anything that does not fit elsewhere, but generic buckets tend to accumulate substrates that should have had more specific types, and the lack of specificity makes the substrates less useful. We have one such value (0xFF GenericFhe) for truly unclassifiable FHE computations, but we try to avoid adding similar catch-alls and to push for specific types whenever possible.
New values are justified by real use cases. We do not add computation types speculatively. A value is added when we have a specific customer or a specific application that needs it, not when we imagine that someone might need it in the future. Speculative additions clutter the registry and create the impression that the registry is being administered sloppily.
New values are documented with enough detail for third-party implementers. The registry entry for each computation type includes a description of what the type attests, what the canonical data format looks like, and any specific constraints on how the substrate should be used. A third party reading the registry should be able to understand the type well enough to implement against it without needing to ask us questions. If the documentation is sparse, the type is not really stable — it is just a placeholder for a future spec.
Values are assigned without vendor-specific naming. We try to avoid naming types after specific vendors or specific products, because vendor-specific names encode today's commercial landscape into a historical record that should be vendor-neutral. A computation type called "AcmeCorpFraudScore" would be less useful than a type called "FraudScore" that Acme Corp (and others) can use. The former ties the type to a specific commercial relationship; the latter makes the type a shared primitive.
The reserved range is preserved. Specific byte values are reserved for specific purposes — the 0xFF catch-all, the assigned values in the 0x01-0x0F range, and the numeric ranges we plan to use for future expansion. Reserving ranges is a soft commitment, but it is a commitment we keep because violating it would break future expansion plans.
Obsolescence is marked, not removed. If a computation type's original use case goes away (say, a specific product is discontinued), the registry entry is marked "obsolete but retained" rather than deleted. The substrates minted under the obsolete type remain interpretable, even if no new substrates are being produced. This is the append-only property in practice.
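Some of the checklist above can be expressed mechanically. The sketch below is an assumption about how such a check might look, not a description of our actual tooling: the entry fields, the status values, and the list of discouraged names are all hypothetical. Vendor neutrality and the quality of a rationale still need human review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Status(Enum):
    ACTIVE = "active"
    OBSOLETE_RETAINED = "obsolete but retained"


@dataclass
class RegistryEntry:
    value: int
    name: str
    description: str  # what the type attests, data format, constraints
    rationale: str    # the concrete use case behind the proposal
    status: Status = Status.ACTIVE


# Illustrative list of catch-all names; not exhaustive.
CATCH_ALL_NAMES = {"generic", "other", "misc"}


def review_proposal(entry: RegistryEntry, assigned: Dict[int, RegistryEntry]) -> List[str]:
    """Return objections to a proposal; an empty list means none were found."""
    problems = []
    if entry.value in assigned:
        problems.append("value already assigned")
    if entry.name.lower() in CATCH_ALL_NAMES:
        problems.append("catch-all name")
    if not entry.description.strip():
        problems.append("missing description for third-party implementers")
    if not entry.rationale.strip():
        problems.append("missing use-case rationale")
    return problems


def mark_obsolete(entry: RegistryEntry) -> None:
    """Obsolescence changes the status only; value and name stay as they were."""
    entry.status = Status.OBSOLETE_RETAINED


existing = RegistryEntry(0x02, "FraudScore", "FHE fraud scoring output", "customer request")
assigned = {existing.value: existing}

# Hypothetical proposals, used only to exercise the checks.
good = RegistryEntry(0x10, "ExampleType", "Attests an example event", "named pilot customer")
bad = RegistryEntry(0x02, "Other", "", "")

good_problems = review_proposal(good, assigned)
bad_problems = review_proposal(bad, assigned)
mark_obsolete(existing)
```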
Maintaining this discipline takes real effort, and I want to be honest that we do not always get it right the first time. The registry is small enough that we can still discuss every addition in detail before committing to it, but as the registry grows the discipline will become harder to maintain, and we will have to invest in tools and process to keep it from drifting. This is a known cost of running an append-only public registry, and it is worth paying because the alternative — a registry that is not append-only, or not public — loses the properties that make it valuable.
Historical precedents
The append-only public byte registry pattern is not new. It has been successful in several domains, and I want to name a few of the successful examples so that the pattern is situated in context.
IANA protocol numbers. The Internet Assigned Numbers Authority (IANA) maintains a registry of 8-bit protocol numbers that appear in the IP header. Protocol 0x01 is ICMP, 0x06 is TCP, 0x11 is UDP, and so on. The registry has been maintained continuously since the 1970s, is append-only (obsolete protocols are not removed), is public, and is the reason the internet's networking stack has a stable vocabulary for "what protocol is this packet carrying." The design choice of a single byte was made in the context of the IPv4 header's 8-bit protocol field, and it has proven adequate for five decades of internet evolution. The substrate's one-byte computation type registry is directly inspired by IANA's pattern.
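As a concrete picture of the pattern being borrowed, here is a minimal sketch that reads the protocol byte out of an IPv4 header. The byte offset (9) and the three protocol numbers are standard; the hand-built header and the function name are for illustration only.

```python
import struct

# A few well-known entries from the IANA protocol numbers registry.
IP_PROTOCOLS = {0x01: "ICMP", 0x06: "TCP", 0x11: "UDP"}


def ipv4_protocol(header: bytes) -> int:
    """Return the 8-bit protocol number from an IPv4 header (byte offset 9)."""
    if len(header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    return header[9]


# Minimal 20-byte header built by hand; the checksum is left at zero.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                   # version 4, header length of 5 words
    0,                      # DSCP/ECN
    20,                     # total length
    0,                      # identification
    0,                      # flags and fragment offset
    64,                     # TTL
    0x06,                   # protocol: TCP
    0,                      # header checksum (not computed here)
    bytes([192, 0, 2, 1]),  # source address from a documentation range
    bytes([192, 0, 2, 2]),  # destination address
)

protocol_name = IP_PROTOCOLS[ipv4_protocol(header)]
```

The whole dispatch is a single indexed read followed by a table lookup, and the table is the registry.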
DNS resource record types. DNS has an append-only public registry of resource record types — A, AAAA, MX, TXT, CNAME, NS, and so on — maintained by IANA. Each type has a stable meaning and a documented format. The registry is public (anyone can look up the current list), append-only (obsolete types like MD are marked but not removed), and has been expanded many times over the years to add new types (DNSSEC record types, HTTPS/SVCB records, and so on). The registry pattern has supported DNS's growth from a simple host-name system to a general-purpose key-value store with specific type semantics.
Unicode code points. The Unicode Consortium maintains a public append-only registry of code points that map integers to characters. Once a code point is assigned to a character, the assignment is frozen — the code point cannot be reassigned, and the character cannot be changed. The registry has grown to over 150,000 code points covering every script and symbol system in common use, and it is administered through a public process that involves community input on new proposals. The success of Unicode as a universal text encoding rests directly on the append-only registry pattern.
HTTP status codes. RFC 9110 (which obsoleted RFC 7231) and related RFCs specify the set of HTTP status codes (200, 301, 404, 500, etc.), and IANA maintains a registry of assigned codes. The codes are three-digit, not one-byte, but the pattern is the same: each code has a specific meaning, the meanings are stable over time, new codes are added through a public process, and obsolete codes are retained for historical compatibility. HTTP's interoperability across implementations depends on the status code registry being stable and public.
File format magic numbers. Many file formats identify themselves with a "magic number," a specific byte sequence at the beginning of the file. PNG files start with 89 50 4E 47 0D 0A 1A 0A, PDFs start with %PDF-, ZIP files start with PK, and so on. There is no single formal registry here; the catalogue is de facto and spread across public lists (the List of File Signatures on Wikipedia is one widely used reference, and there are several others). The convention is the same, though: each signature has a stable, publicly documented meaning, and a file with an unrecognized magic number is unidentified. The tooling ecosystem relies on those signatures staying stable.
SSH algorithm names. IANA maintains a registry of SSH protocol parameter names, defined by the IETF's SSH specifications: key exchange methods, public key algorithms, encryption algorithms, MAC algorithms, compression algorithms. Each name is an ASCII string, not a byte, but the registry pattern is the same. Once a name is registered, it is frozen, and new names can be proposed through a public process. The success of SSH as an interoperable protocol depends on the registry being stable.
In each of these examples, the public append-only registry is a load-bearing piece of protocol infrastructure, and the protocol's interoperability across implementations depends on the registry being maintained well. The substrate's one-byte computation type registry is the same pattern at a smaller scale. It is not novel, and it does not need to be novel. It just needs to work, and the historical precedents tell us that the pattern works when the registry is administered with discipline.
Why this pattern is underrated in recent protocol design
Here is my observation: the append-only public registry pattern has been underused in recent cryptographic protocol design, and I think the reason is that recent protocols have optimized for flexibility rather than stability.
Many recent cryptographic protocols have favored flexible, self-describing wire formats (JSON-LD, CBOR with extensible tags, protobuf with optional fields) over fixed-width byte-assigned formats. Self-describing formats are easier to extend in the short term because you can add new fields without coordinating with a central registry, and they feel modern compared to the older byte-based approach. But self-describing formats come with their own costs: more complex parsing, larger wire footprint, ambiguity about which fields are required versus optional, and (most importantly) no stable global namespace for the values being described.
The lack of a stable namespace is the specific cost I want to flag. In a self-describing format where any party can add new fields, there is no stable vocabulary that all parties share. Two systems that want to interoperate have to either coordinate on a specific schema (which recreates the registry pattern, but as ad-hoc coordination rather than a formal registry) or tolerate ambiguity about what each field means. The coordination cost is significant, and the ambiguity cost is even higher.
Compare this to a fixed-width byte-assigned format with an append-only public registry. Two systems that want to interoperate just have to read the registry and agree on what each byte means. The coordination cost is zero (beyond publishing the registry), and there is no ambiguity because the registry is the source of truth. A system that encounters a substrate with an unknown computation type byte knows exactly what to do: look up the byte in the public registry, find the documentation, and implement against it.
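A verifier's handling of the type byte can be sketched in a few lines. This is a minimal illustration that assumes the verifier holds a local copy of part of the public registry; the function names and the returned strings are hypothetical, and the three entries are taken from the snapshot in this post.

```python
from typing import Optional

# Partial local copy of the public registry (names from this post's snapshot).
KNOWN_TYPES = {0x01: "BiometricAuth", 0x06: "BitcoinUtxo", 0xFF: "GenericFhe"}


def describe_type(type_byte: int) -> Optional[str]:
    """Return the registered name, or None if the local copy lacks the byte."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("computation type must be a single byte")
    return KNOWN_TYPES.get(type_byte)


def handle(type_byte: int) -> str:
    name = describe_type(type_byte)
    if name is None:
        # Unknown is not the same as malformed: the byte may have been
        # assigned after this verifier's registry copy was taken.
        return f"0x{type_byte:02X}: not in local copy, look it up in the public registry"
    return f"0x{type_byte:02X}: {name}"


known_result = handle(0x06)
unknown_result = handle(0x42)
```

Because the field is fixed-width, an unknown byte has exactly one interpretation: a registry entry this verifier has not loaded yet.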
This is a specific trade-off, and for the substrate's use case, the fixed-width registered-byte approach is the right trade-off. Interoperability matters more than flexibility, because the substrate is meant to be a primitive that many systems implement, not a format that one system owns. A primitive that every system can implement to the same specification is more valuable than a format that is easy to extend but hard to agree on.
I am not claiming that every cryptographic protocol should use byte-registered fields. For protocols where the set of possible values is genuinely unbounded or evolves rapidly, self-describing formats are a better choice. What I am claiming is that for the specific pattern the substrate targets — a small, slow-growing set of well-defined domains where cross-implementation interoperability is essential — the byte-registered approach is underused relative to its merits. Protocol designers in 2026 should at least consider it as a design choice, and the consideration should not be dismissed on the grounds that byte registries are "old-fashioned."
The specific substrate registry, in summary
Here is the current state of the substrate's computation type registry, for context. The full list is published and maintained on our website, and this is a representative snapshot.
0x01 BiometricAuth: FHE biometric match decisions
0x02 FraudScore: FHE fraud scoring output
0x03 FedNowPayment: FedNow ISO 20022 payment attestation
0x04 SolanaAttestation: Solana transaction attestation
0x05 HatsGovernance: HATS AI trust standard governance proof
0x06 BitcoinUtxo: Bitcoin UTXO attestation (the flagship Bitcoin-native type)
0x07 KycVerification: Zero-knowledge KYC verification
0x08 ShareComputation: Cross-institution MPC computation
0x09 ArchiveSign: Document archival signing
0x0A MedVaultPhi: Medical PHI operation
0x0B VaultKeyOp: Secret management operation
0x0C ApiResponse: HTTP API response attestation
0x0D AiInference: Large-language-model inference provenance
0x0E MediaCapture: Capture-time media substrate
0x0F LegalEvidence: Legal chain-of-custody event
0xFF GenericFhe: Generic FHE computation catch-all
This is 16 assigned values out of 256. The assignments cover the domains we have customer or community interest in today. The unassigned values in the 0x10-0xFE range are available for future assignment.
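For implementers, the snapshot above maps naturally onto an enumeration. The following is a sketch, not the published crate's API: the member names are copied from the snapshot, while the class name and the helper variables are my own. The status of 0x00 is not stated in this post, so the code only reports it and makes no claim about it.

```python
from enum import IntEnum


class ComputationType(IntEnum):
    BiometricAuth = 0x01
    FraudScore = 0x02
    FedNowPayment = 0x03
    SolanaAttestation = 0x04
    HatsGovernance = 0x05
    BitcoinUtxo = 0x06
    KycVerification = 0x07
    ShareComputation = 0x08
    ArchiveSign = 0x09
    MedVaultPhi = 0x0A
    VaultKeyOp = 0x0B
    ApiResponse = 0x0C
    AiInference = 0x0D
    MediaCapture = 0x0E
    LegalEvidence = 0x0F
    GenericFhe = 0xFF


assigned = {int(t) for t in ComputationType}
unassigned = sorted(set(range(256)) - assigned)

# Values in the range the post describes as open for future assignment.
open_range = [v for v in unassigned if 0x10 <= v <= 0xFE]

# Anything unassigned outside that range; in this snapshot that is only 0x00.
outside_open_range = [v for v in unassigned if v not in open_range]
```

Sixteen values are assigned and 239 values sit in the open range, which leaves more than an order of magnitude of headroom over what is allocated today.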
The registry has several planned future additions: Lightning Network channel attestation, Taproot script path attestation, Ordinals inscription provenance, BitVM execution traces, RGB client-side validation state, and a few others that we are working through with specific customers. Each of these will be assigned a specific byte when the proposal is finalized, and the assignment will be published with the rationale.
Closing
The single-byte append-only public computation type registry is a small piece of the substrate's wire format, but it does a surprising amount of work. It creates a stable global namespace for attestable computation domains. It supports long-term interoperability between independent implementations. It keeps the wire format compact and cache-friendly. It inherits from a well-understood design pattern with five decades of successful track record in adjacent protocols. And it requires ongoing administrative discipline to maintain, which is a cost we pay in exchange for the namespace's stability.
For protocol designers who are considering similar patterns in their own work, the substrate's registry is one example of how to structure it. The constraints (append-only, public, byte-wide, disciplined administration) are not unique to the substrate, and the pattern is adaptable to other protocols that need a similar kind of stable vocabulary. Not every protocol needs this, but for the ones that do, the pattern is worth the cost.
For readers who came to this post looking for something more technically dramatic than "we reserved a byte for each domain," I hope the specific defense of the pattern is interesting in its own right. Protocol design has more in common with administrative discipline than cryptographers sometimes admit, and the registry pattern is a small example of how administrative discipline can compound into long-term protocol health.
The next post in this series is the regression-transparency post, about the gap between the throughput number we published two weeks ago and the throughput number we are reporting today, and what we are doing about the gap. See you there.