Portable Artifact Architecture
The architectural specification for evidence that survives software, cloud, chain, and vendor change.
This page documents the architectural decisions underlying H33 portable artifacts. It is intended for engineers building integrations against the artifact format, architects evaluating the pattern for institutional adoption, and standards reviewers assessing the artifact's suitability for regulatory or industry codification.
Motivation
Institutional evidence is increasingly required to survive longer than the systems that produced it. Federal retention windows run from 25 years to permanent; healthcare retention extends beyond patient discharge; insurance retention extends beyond policy terms. Across these windows, the technology stack changes repeatedly. Evidence formats that depend on stable software, stable schemas, or stable cryptographic primitives do not survive them. The portable artifact is designed for the worst-case retention window. It assumes the originating system has been retired, the originating vendor has ceased to exist, the cloud provider hosting it has been replaced, and the cryptographic primitives have evolved. Under these assumptions, the artifact must still be verifiable.
Design principles
Self-containment. The artifact carries everything required for verification.
Canonicalization. The artifact has a single canonical form. Two parties producing artifacts from the same inputs produce byte-identical output.
Algorithmic redundancy. Cryptographic primitives can fail. The artifact carries multiple independent signatures from independent algorithm families.
Schema evolution. The artifact's structure will evolve as use cases evolve. The artifact carries a schema version.
Selective disclosure. Some artifact content is sensitive. The artifact supports redaction-to-digest while preserving structural verifiability.
Vendor independence. Verification does not depend on the originating vendor's continued existence.
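The selective disclosure principle can be illustrated with a small sketch. It assumes SHA-256 as the digest and a placeholder object keyed redacted_sha256; neither is specified on this page, and the helper names redact and matches_redaction are hypothetical.

```python
import hashlib
import json


def _digest(value) -> str:
    # Assumption: the digest is SHA-256 over a compact, key-sorted JSON encoding.
    encoded = json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def redact(obj: dict, field: str) -> dict:
    """Return a copy of obj with obj[field] replaced by a digest placeholder."""
    redacted = dict(obj)
    redacted[field] = {"redacted_sha256": _digest(obj[field])}
    return redacted


def matches_redaction(placeholder: dict, disclosed_value) -> bool:
    """Check that a later-disclosed value is the one that was redacted."""
    return placeholder.get("redacted_sha256") == _digest(disclosed_value)


record = {"query": "contract dispute", "user_id": "u-1842"}
public_view = redact(record, "user_id")
print(sorted(public_view))                                  # same keys as the original
print(matches_redaction(public_view["user_id"], "u-1842"))  # True
```

The placeholder keeps the field's position in the structure, so the document's shape can still be checked, and a party holding the original value can show that it matches the digest.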
Schema
The portable artifact is a JSON document with a defined schema. The top-level object, SearchExportBundle, contains search_result, optional evidence control (EC) objects (authority_bind, policy_bind, pipeline_dag, corpus_bind, model_fingerprint, evidence_attestation, result_citation_bind, calibrated_abstention), sidecars, an optional anchor, signatures, and ves_version. Each EC object carries an ec_kind discriminator, content-specific fields, and an optional sidecar_uri. The sidecars field carries the content that EC objects reference by digest. The signatures field carries three independent signatures.
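The top-level shape described above can be sketched as a structural check. This is an illustration only: it assumes each EC object sits under a top-level key equal to its ec_kind and that signatures is a list, neither of which this page states. The function name check_top_level is hypothetical.

```python
from typing import Any

REQUIRED_KEYS = {"search_result", "sidecars", "signatures", "ves_version"}
EC_KINDS = {
    "authority_bind", "policy_bind", "pipeline_dag", "corpus_bind",
    "model_fingerprint", "evidence_attestation", "result_citation_bind",
    "calibrated_abstention",
}
OPTIONAL_KEYS = EC_KINDS | {"anchor"}


def check_top_level(bundle: dict[str, Any]) -> list[str]:
    """Return structural problems; an empty list means the shape is plausible."""
    present = set(bundle)
    problems = []
    for key in sorted(REQUIRED_KEYS - present):
        problems.append(f"missing required field: {key}")
    for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unknown field: {key}")
    for kind in sorted(EC_KINDS & present):
        ec = bundle[kind]
        # Assumption: the discriminator repeats the key the object is stored under.
        if not isinstance(ec, dict) or ec.get("ec_kind") != kind:
            problems.append(f"{kind}: ec_kind discriminator missing or mismatched")
    signatures = bundle.get("signatures")
    if signatures is not None and len(signatures) != 3:
        problems.append("signatures: expected three entries")
    return problems
```

A production integration would validate against the published schema rather than a hand-written check like this one.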
Canonical serialization
Canonical JSON serialization is the foundation of verifiability. The same artifact must serialize to byte-identical output across implementations. The canonical form follows these rules: UTF-8 encoding throughout; sorted object keys (lexical sort by code point); no whitespace except newline; numbers in shortest unambiguous representation; strings without escape sequences beyond JSON's requirements; NaN, Infinity, and -Infinity disallowed; Unicode normalization to NFC. Two implementations following these rules produce identical output. Conformance test suites validate implementations.
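A minimal sketch of these rules using only the Python standard library follows. It is not the conformance-tested implementation. Two simplifications are assumptions of the sketch: it emits no whitespace at all, because the newline rule is not detailed enough here to implement, and it relies on Python's own float formatting, which may differ from the specification's number rules. The name canonical_bytes is hypothetical.

```python
import json
import math
import unicodedata


def _normalize(value):
    """Recursively apply NFC to strings and reject non-finite numbers."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError("NaN and Infinity are not allowed")
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            nkey = unicodedata.normalize("NFC", key)
            if nkey in out:
                raise ValueError(f"duplicate key after NFC normalization: {nkey!r}")
            out[nkey] = _normalize(item)
        return out
    if isinstance(value, list):
        return [_normalize(item) for item in value]
    return value


def canonical_bytes(obj) -> bytes:
    text = json.dumps(
        _normalize(obj),
        sort_keys=True,         # Python compares str by code point
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,     # escape only what JSON requires
        allow_nan=False,
    )
    return text.encode("utf-8")


print(canonical_bytes({"b": 1, "a": [True, None]}))  # b'{"a":[true,null],"b":1}'
```

Key order in the input does not affect the output, which is the property that lets two independent generators produce byte-identical artifacts.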
Three-family signature design
The artifact's integrity is anchored by three independent signatures, each from a different post-quantum algorithm family.
ML-DSA-65 (FIPS 204) rests on Module Learning With Errors. Public key ~2 KB, signature ~3.3 KB.
FALCON-512 (anticipated FIPS 206) rests on NTRU lattices, whose structure differs from the module lattices underlying ML-DSA. Public key ~900 bytes, signature ~660 bytes.
SLH-DSA-128f (FIPS 205) rests only on hash function properties. Public key ~32 bytes, signature ~17 KB.
The combined signature footprint is approximately 21 KB. The verification policy is configurable per deployment.
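The combined footprint can be checked with simple arithmetic. The sketch below uses the rounded figures quoted above and treats 1 KB as 1,000 bytes, which is an assumption; exact sizes depend on the parameter sets and encodings in the respective standards.

```python
# Approximate sizes in bytes, taken from the rounded figures quoted above.
SIZES = {
    "ML-DSA-65": {"public_key": 2000, "signature": 3300},
    "FALCON-512": {"public_key": 900, "signature": 660},
    "SLH-DSA-128f": {"public_key": 32, "signature": 17000},
}

signature_total = sum(entry["signature"] for entry in SIZES.values())
public_key_total = sum(entry["public_key"] for entry in SIZES.values())

print(f"signatures:  {signature_total} bytes")   # signatures:  20960 bytes
print(f"public keys: {public_key_total} bytes")  # public keys: 2932 bytes
```

The hash-based signature accounts for most of the total, which is the cost of having one signature that does not depend on lattice assumptions.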
Verification protocol
The verifier accepts an artifact and produces a verdict.
Stage 1: Parse. The artifact must parse as canonical JSON.
Stage 2: Schema validation. The artifact's structure must satisfy the schema version's requirements.
Stage 3: Signature verification. Each signature is verified against the artifact's canonical content.
Stage 4: EC object verification. Each present EC object is structurally validated.
Stage 5: Sidecar verification. Each sidecar's digest is computed and compared.
Stage 6: Anchor verification (if present). The anchor's chain transaction is queried.
Stage 7: Aggregate verdict. The verifier returns PASS, PASS_WITH_WARNING, or FAIL.
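The staged flow can be sketched as a small pipeline that stops at the first failure and aggregates warnings into the verdict. Only two stages are shown. The sidecar layout (a mapping from hex SHA-256 digest to text), the rule that a missing anchor produces a warning, and all function names are assumptions made for illustration, not the published protocol.

```python
import hashlib
from typing import Callable, NamedTuple


class StageResult(NamedTuple):
    ok: bool
    warning: str = ""


Stage = Callable[[dict], StageResult]


def check_sidecars(artifact: dict) -> StageResult:
    # Assumption: sidecars maps a hex SHA-256 digest to UTF-8 text content.
    for digest, content in artifact.get("sidecars", {}).items():
        if hashlib.sha256(content.encode("utf-8")).hexdigest() != digest:
            return StageResult(ok=False)
    return StageResult(ok=True)


def check_anchor(artifact: dict) -> StageResult:
    # Assumption: a missing anchor downgrades the verdict to a warning.
    if "anchor" not in artifact:
        return StageResult(ok=True, warning="no anchor present")
    return StageResult(ok=True)  # a real verifier would query the chain here


def run_stages(artifact: dict, stages: list[tuple[str, Stage]]) -> tuple[str, list[str]]:
    """Run stages in order, stopping at the first failure."""
    notes: list[str] = []
    for name, stage in stages:
        result = stage(artifact)
        if not result.ok:
            return "FAIL", notes + [f"{name}: failed"]
        if result.warning:
            notes.append(f"{name}: {result.warning}")
    return ("PASS_WITH_WARNING" if notes else "PASS"), notes


STAGES = [("sidecars", check_sidecars), ("anchor", check_anchor)]
content = "full search transcript"
artifact = {"sidecars": {hashlib.sha256(content.encode("utf-8")).hexdigest(): content}}
print(run_stages(artifact, STAGES)[0])  # PASS_WITH_WARNING
```

Keeping each stage as an independent function makes it possible to test stages in isolation and to report which stage produced a failure.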
Common questions
Can I implement my own artifact generator?
Yes. The schema and canonical serialization specification are published. Conformance test suites validate implementations.
What if a post-quantum algorithm fails?
The three-family design means a single-algorithm break does not invalidate the artifact. The verification policy can accept a quorum of signatures.
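A quorum policy of this kind might look like the following sketch. The 2-of-3 default, the distrusted set, and the function name quorum_met are assumptions chosen for illustration; the actual policy options are deployment-specific.

```python
def quorum_met(results: dict[str, bool], required: int = 2,
               distrusted: frozenset[str] = frozenset()) -> bool:
    """Count valid signatures from algorithms that are still trusted."""
    valid = [alg for alg, ok in results.items() if ok and alg not in distrusted]
    return len(valid) >= required


results = {"ML-DSA-65": True, "FALCON-512": True, "SLH-DSA-128f": True}

# If one family were broken, a deployment could stop counting it
# and still accept the artifact on the remaining two signatures.
print(quorum_met(results, required=2, distrusted=frozenset({"FALCON-512"})))  # True
print(quorum_met(results, required=3, distrusted=frozenset({"FALCON-512"})))  # False
```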
How does schema versioning work?
Each artifact declares its schema version. The verifier's behavior for each version is frozen at release. Older artifacts remain verifiable under newer verifiers.
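One way to keep per-version behavior frozen is a dispatch table keyed by the declared version, where new versions add entries and existing entries are never edited. The sketch assumes the version is read from ves_version; the version strings and the rules inside each handler are hypothetical.

```python
from typing import Callable


def verify_v1_0(artifact: dict) -> bool:
    # Hypothetical rule set for version "1.0": signatures must be present.
    return "signatures" in artifact


def verify_v1_1(artifact: dict) -> bool:
    # Hypothetical rule set for version "1.1": sidecars are also required.
    return "signatures" in artifact and "sidecars" in artifact


# New versions add entries; existing entries are never modified.
VERIFIERS: dict[str, Callable[[dict], bool]] = {
    "1.0": verify_v1_0,
    "1.1": verify_v1_1,
}


def verify(artifact: dict) -> bool:
    version = artifact.get("ves_version")
    if version not in VERIFIERS:
        raise ValueError(f"unsupported schema version: {version!r}")
    return VERIFIERS[version](artifact)


old_artifact = {"ves_version": "1.0", "signatures": []}
print(verify(old_artifact))  # True
```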
What about replay verification?
Replay verification regenerates the artifact from the underlying inputs and confirms byte-identical match against the stored artifact.
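A minimal sketch of the comparison follows. The generator here is a hypothetical stand-in, and canonical_bytes is a simplified version of the canonical serialization rules. The sketch compares content only and leaves signatures out, which is an assumption about scope: a signature scheme that uses fresh randomness would not reproduce identical signature bytes on regeneration.

```python
import json


def canonical_bytes(obj) -> bytes:
    # Simplified stand-in for the canonical serialization rules.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"),
        ensure_ascii=False, allow_nan=False,
    ).encode("utf-8")


def generate_artifact(inputs: dict) -> dict:
    # Hypothetical deterministic generator: no timestamps, no randomness.
    return {"search_result": sorted(inputs["hits"]), "ves_version": inputs["version"]}


def replay_matches(stored: bytes, inputs: dict) -> bool:
    """Regenerate from inputs and require a byte-identical match."""
    return canonical_bytes(generate_artifact(inputs)) == stored


inputs = {"hits": ["doc-7", "doc-2"], "version": "1.0"}
stored = canonical_bytes(generate_artifact(inputs))
print(replay_matches(stored, inputs))                         # True
print(replay_matches(stored, {**inputs, "hits": ["doc-2"]}))  # False
```

Replay only works if generation is deterministic, so any source of nondeterminism in the generator has to be captured as an explicit input.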
Is the artifact format standardized?
Not yet formally. The schema is published and offered as a de facto standard. Submission to standards bodies (IETF, NIST, ISO) is a future objective.