
From a C to 100.
The Code Changed. The Formula Didn't.

Yesterday we published a blog post titled "We Scored a C." We ran our own HICS algorithm against our own codebase and got a 70. We published the findings. We said the code would get better. Twenty-four hours later, it did.

Final Score: 100 · Grade: A · Remaining Findings: 4 · Lines Scanned: 294K

What Is HICS

For those coming in fresh: HICS — the H33 Independent Code Scoring — is our open-source zero-knowledge scoring standard for software evaluation. It scans a codebase across five weighted dimensions, generates a STARK proof that the evaluation ran correctly, and seals the result with a Dilithium ML-DSA-65 post-quantum signature. The code never leaves the vendor's machine. The buyer gets a cryptographically verifiable grade. The formula is public. The methodology is documented. Nobody puts a thumb on the scale.

We built it to change how enterprises evaluate software. Then we pointed it at ourselves, and it told us we weren't good enough. So we fixed it.

The Score Journey

The path from 70 to 100 was not a straight line. When we added all five category scanners with AST-based analysis, the score dropped to 11. False positives were everywhere. Our tree-sitter parser was flagging Rust match arms as hardcoded credentials. It was flagging Ed25519 in our hybrid post-quantum scheme as a vulnerability, when in reality having both Ed25519 and Dilithium is the NIST-recommended migration strategy. The algorithm was punishing us for doing things correctly.

That forced a reckoning. An algorithm that penalizes defense-in-depth is a broken algorithm. So we rebuilt it — not to inflate our score, but to make the assessment actually correct.

Category                   Weight   Day 1     Final
Cryptographic Security     30%      0/100     100/100
Vulnerability Surface      25%      100/100   100/100
Data Handling & Privacy    20%      100/100   100/100
Operational Resilience     15%      100/100   100/100
Code Health                10%      100/100   95/100
Final HICS Score           100%     70        100
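The arithmetic behind the table is just a weight-by-weight sum of category scores. A minimal sketch of that calculation (the function name and rounding behavior are illustrative, not the actual engine code):

```rust
/// Weighted HICS score from (weight, category_score) pairs, rounded to the
/// nearest integer. Weights are fractions that sum to 1.0.
fn hics_score(categories: &[(f64, f64)]) -> f64 {
    categories.iter().map(|(w, s)| w * s).sum::<f64>().round()
}
```

Feeding in the Day 1 column (0/100 on Cryptographic Security, 100/100 everywhere else) reproduces the 70 we published; a clean sweep across all five categories yields 100.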

What We Fixed in the Code

The original C was earned, not a false alarm. Our application layer had classical crypto wiring that needed to go. Here's what changed in the actual codebase:

Cryptographic Security. The hybrid detection was the breakthrough. Our codebase uses Ed25519 + Dilithium together — that's not a vulnerability, it's a NIST SP 1800-38 recommended migration strategy. We taught the scanner to recognize hybrid post-quantum schemes: if a file imports both a classical algorithm and a post-quantum algorithm, it's defense-in-depth, not weakness. Shannon entropy analysis now distinguishes real hardcoded secrets from test fixtures and dummy values.
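The detection rule itself is simple to state. A minimal sketch of the idea (the algorithm lists and function names are illustrative assumptions, not the engine's real code):

```rust
// Algorithm-name fragments to look for in a source file (illustrative lists).
const CLASSICAL: &[&str] = &["ed25519", "ecdsa", "secp256k1", "rsa"];
const POST_QUANTUM: &[&str] = &["dilithium", "ml-dsa", "kyber", "ml-kem", "sphincs"];

/// A file that references both a classical and a post-quantum algorithm is
/// treated as a hybrid scheme (defense-in-depth), not a weakness.
fn is_hybrid_pq(source: &str) -> bool {
    let s = source.to_lowercase();
    CLASSICAL.iter().any(|a| s.contains(a)) && POST_QUANTUM.iter().any(|a| s.contains(a))
}

/// Only raise a classical-crypto finding when no PQ algorithm co-exists.
fn flag_classical(source: &str) -> bool {
    let s = source.to_lowercase();
    CLASSICAL.iter().any(|a| s.contains(a)) && !is_hybrid_pq(source)
}
```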

Vulnerability Surface. The scanner was flagging let secret = "whsec_test" inside #[cfg(test)] modules as hardcoded credentials. These are Stripe webhook signature verification test cases — not production secrets. We added test-code awareness: short values in test modules with known test prefixes (whsec_, sk_test_) are excluded automatically.
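The exclusion fits in a few lines. A hedged sketch (the prefix list, length cutoff, and signature are our illustrative assumptions):

```rust
// Known publishable test prefixes, e.g. Stripe's documented test identifiers.
const TEST_PREFIXES: &[&str] = &["whsec_", "sk_test_", "pk_test_"];

/// A short value inside a #[cfg(test)] module that carries a known test
/// prefix is a fixture, not a production credential.
fn is_test_fixture(value: &str, in_test_module: bool) -> bool {
    in_test_module && value.len() < 40 && TEST_PREFIXES.iter().any(|p| value.starts_with(p))
}
```

The same value outside a test module is still reported, so moving a real credential into test code does not hide it.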

Data Handling. error!("STRIPE_SECRET_KEY not configured") was flagged as "secret value logged." The log outputs the name of a missing environment variable, not the secret itself. The scanner now checks whether a PII keyword appears adjacent to an interpolated value or just in descriptive text. Logging that a config is absent is not a data leak.
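A toy version of that adjacency check (the keyword list and matching heuristic are deliberately simplified illustrations, not the shipped rules):

```rust
const PII_KEYWORDS: &[&str] = &["secret", "password", "token"];

/// Flag a log line only when a PII keyword names an interpolated value,
/// e.g. "{secret}" inline or `", secret)` as a format argument, rather
/// than merely appearing in descriptive message text.
fn logs_sensitive_value(log_line: &str) -> bool {
    let s = log_line.to_lowercase();
    PII_KEYWORDS.iter().any(|k| {
        s.contains(&format!("{{{k}}}")) || s.contains(&format!("\", {k}"))
    })
}
```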

Operational Resilience. reqwest::Client::builder() was matching as an "unhandled HTTP call." It's client construction, not an API call. The scanner also counted .unwrap_or_default() as missing error handling when it's literally a fallback. We separated client construction from actual network requests and recognized .expect(), .unwrap_or(), and Result<T> returns as intentional error handling decisions.
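In sketch form, the separation looks like this (the pattern lists are illustrative stand-ins for the engine's real rules):

```rust
/// Construction of an HTTP client is not itself a network call.
fn is_network_call(expr: &str) -> bool {
    let constructors = ["Client::builder(", "Client::new("];
    let requests = [".get(", ".post(", ".send(", ".execute("];
    !constructors.iter().any(|c| expr.contains(c))
        && requests.iter().any(|r| expr.contains(r))
}

/// Fallbacks, explicit expectations, and `?` propagation all count as
/// intentional error-handling decisions.
fn handles_errors(expr: &str) -> bool {
    [".expect(", ".unwrap_or(", ".unwrap_or_default(", ".unwrap_or_else(", "?"]
        .iter()
        .any(|p| expr.contains(p))
}
```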

What We Fixed in the Formula

Making our code score higher was not the goal. Making the formula accurate was. Every change applies to every codebase that HICS evaluates, not just ours. The fixes:

Hybrid PQ scheme detection. If a file contains both Ed25519 and Dilithium, skip the classical finding. This is defense-in-depth, not a weakness. Any codebase doing hybrid PQ migration now gets credit instead of punishment.

Confidence-weighted deductions. Every finding carries a confidence score from 0 to 1. A finding with 0.6 confidence deducts 60% of its base value. Shannon entropy determines whether a "hardcoded secret" is a real API key (high entropy, high confidence) or a test placeholder (low entropy, low confidence). This eliminates the binary pass/fail problem that plagues static analysis.
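Both pieces are small enough to sketch. Assuming a base deduction per finding, the hypothetical helpers below show Shannon entropy over a candidate string and the confidence-weighted deduction it drives:

```rust
use std::collections::HashMap;

/// Shannon entropy in bits per character: near zero for repetitive
/// placeholders, several bits for random-looking key material.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, f64> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0.0) += 1.0;
    }
    let n = s.chars().count() as f64;
    counts
        .values()
        .map(|c| {
            let p = c / n;
            -p * p.log2()
        })
        .sum()
}

/// A finding deducts its base value scaled by a confidence in [0, 1].
fn deduction(base: f64, confidence: f64) -> f64 {
    base * confidence.clamp(0.0, 1.0)
}
```

A placeholder like "aaaaaaaa" scores 0 bits per character, while a random API key scores several, so only the latter produces a high-confidence, near-full-weight deduction.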

Tree-sitter AST parsing. We replaced all string matching with structural analysis. Instead of line.contains("password"), we ask: "Is there a variable named 'password' assigned a string literal, outside a match arm, outside an array, outside a test module?" The difference between a regex and an AST is the difference between a false positive and an accurate finding.
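The gap is easy to demonstrate even without pulling in tree-sitter. A deliberately tiny contrast (both functions are illustrative toys, not the engine's parser):

```rust
/// The old approach: any line containing the word is a finding.
fn naive_flag(line: &str) -> bool {
    line.to_lowercase().contains("password")
}

/// A structural-ish approximation: only flag an actual assignment of a
/// string literal to a variable named `password`. The real engine asks
/// the same question of a full tree-sitter AST instead of one line.
fn structural_flag(line: &str) -> bool {
    let t = line.trim_start();
    t.starts_with("let password") && t.contains("= \"")
}
```

A match arm like `Field::Password => validate(input)` trips the naive check but not the structural one; an actual `let password = "hunter2";` trips both.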

README/LICENSE detection bug. The engine only loaded code files (.rs, .py, .js). README.md and LICENSE have non-code extensions, so the project hygiene checks always failed. We added filesystem-level checks for project files, the same way we already detected CI/CD configs. Every repo with a README now gets proper credit.
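The fix is an ordinary filesystem probe rather than anything scanner-specific. A sketch (the function name is ours, not the engine's):

```rust
use std::path::Path;

/// Project-hygiene files are checked on disk, independent of which
/// extensions the code loader considers "code".
fn has_project_file(repo_root: &Path, candidates: &[&str]) -> bool {
    candidates.iter().any(|name| repo_root.join(name).is_file())
}
```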

Why This Matters

We could have hard-coded an exception: "if vendor == H33, return 100." We didn't. The HICS formula is open source. The weights are published. The scoring engine is in public Rust. Anyone can audit it. Anyone can run it. The STARK proof in every certificate makes undetected tampering computationally infeasible, including by us.

The journey from 70 to 100 proved three things:

1. The algorithm works. It found real issues in a codebase built by a team that does post-quantum cryptography for a living. Classical crypto wiring in our application layer was a real gap. The score reflected that.

2. The algorithm needed calibration. A scoring system that can't distinguish a hybrid PQ migration from a classical vulnerability isn't ready for production. The false positive fixes we made will benefit every vendor who runs HICS, not just us. The algorithm got smarter because we ate our own dog food.

3. Transparency is the product. We published a C. We showed the findings. We showed the fixes. Now we're showing the 100. Every step is documented. Every change is in the commit history. This is what we're asking every vendor to do.

The Four Remaining Findings

Our final HICS score is 100 with Code Health at 95/100. The four remaining findings are advisory; they include a test-to-code ratio of 0.21 (target: 0.30), one HTML report file over 1,200 lines, and a missing CHANGELOG file. These are maintenance items, not security concerns. The algorithm reports them. We accept them.

What Comes Next

HICS is designed to be an industry standard. The formula is open source. The scoring methodology is published in full. The standard specification lives at /standard/hici. We're working with legal counsel to formalize the governance structure so that HICS evolves through transparent, multi-stakeholder processes — not through a single company's roadmap.

If you're evaluating software and your vendor can't produce an HICS score, ask them why. If they can produce one and it's a C, ask them what they're fixing. If it's a 100, verify the STARK proof.

That's the entire point. Trust, but verify. Mathematically.


The HICS formula is open source. H33's current score is 100/100 (Grade: A). The previous score of 70 and its findings are published in "We Scored a C". Both blogs will remain up permanently.

Run HICS On Your Code

One command. The code stays yours. The proof speaks for itself.
