
spec-to-code-compliance

Name: spec-to-code-compliance
Author: Trail of Bits

blockchain, audit, compliance, smart contracts, specification, whitepaper, protocol verification, code review, security

⭐ 5.7k · 📄 CC-BY-SA-4.0 · 🕒 2026-06-15

Install this skill

npx skills add trailofbits/skills

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

The spec-to-code compliance skill is a verification framework for cross-referencing intended protocol logic against actual smart contract execution. It functions as a formal auditor that forces a structured translation of non-code documentation into a technical Intermediate Representation (IR), which is then mapped directly against line-level code behavior. By identifying disparities between whitepapers, design notes, or requirement docs and the implemented source, it exposes hidden vulnerabilities, undocumented side effects, and logic gaps. This skill prioritizes deterministic evidence over intuition, requiring every conclusion to be linked back to specific documentation sections and code line numbers. It eliminates assumptions by classifying ambiguous requirements and requiring high-confidence mapping for every invariant, math formula, and security constraint defined in the source material.

When to Use This Skill

•Auditing smart contracts against protocol whitepapers or design documents
•Validating blockchain protocol implementations during final security reviews
•Identifying unimplemented features or deviations in complex dApp logic
•Comparing evolving requirements from transcripts and Notion docs against a static codebase

How to Invoke This Skill

Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:

“Does this smart contract implement the whitepaper logic correctly?”
“Cross-reference this design doc with the provided codebase.”
“Identify any discrepancies between the protocol spec and the current implementation.”
“Find undocumented code paths that contradict the project requirements.”
“Verify the implementation of these invariants against the spec.”

Pro Tips

💡Always provide the most up-to-date and complete versions of both the specification documents and the codebase for the most accurate analysis.
💡For large projects, break down the analysis into modules or specific features to get more granular and actionable compliance reports.
💡Pair this skill with `audit-context-building` to first establish a comprehensive understanding of the project before deep-diving into compliance checks.

What this skill does

•Transformation of unstructured specs into a structured Intermediate Representation (IR)
•Line-by-line semantic mapping of code blocks against specified requirements
•Identification of undocumented code paths and orphaned logic
•Explicit classification of requirement ambiguity versus implementation drift
•Evidence-based verification of invariants, state transitions, and math formulas

When not to use it

✕Standard unit testing or general code quality refactoring
✕Analyzing codebases that lack corresponding specification or design documentation
✕Generating documentation from an existing codebase

Example workflow

1. Aggregate all documents, including PDFs, READMEs, and meeting notes, into a single spec corpus.
2. Convert all documents into a standardized Spec-IR, capturing invariants and math formulas.
3. Perform a granular line-by-line analysis of the target codebase.
4. Map implementation blocks to corresponding spec items with confidence scores.
5. Flag any code that does not map to a specific spec requirement as an undocumented path.
6. Generate a compliance report detailing identified drifts and ambiguities.

Prerequisites

–A formal or informal specification document
–The target codebase to be verified

Pitfalls & limitations

!Over-reliance on poorly documented or outdated design notes
!Mapping complexity when the spec itself contains internal contradictions
!False sense of security if the initial specification is incomplete

FAQ

How does this differ from standard vulnerability scanning?

Standard scanning looks for known patterns like reentrancy. This skill looks for logical drift, confirming if the code actually performs the business logic defined in your documentation.

What happens if the specification is vague?

You are required to classify the requirement as ambiguous. You should not infer the intent; instead, document the ambiguity and require clarification.

Can I use this for non-blockchain code?

While designed for audit-heavy blockchain protocols, it can be applied to any system with formal requirements and code, though it is optimized for state-machine and invariant-heavy logic.

What is the role of the confidence score?

The score (0–1) represents the reliability of the link between a spec item and a code block. Any mapping that scores below 0.8 must be investigated until it reaches that threshold or be classified as AMBIGUOUS.

How it compares

Unlike a generic prompt that might only summarize the code, this skill mandates a traceable, evidence-backed IR process that is intended to prevent hallucination and to map stated intent one-to-one onto executed logic.


📄 Full skill instructions — original source: trailofbits/skills

## When to Use

Use this skill when you need to:
- Verify code implements exactly what documentation specifies
- Audit smart contracts against whitepapers or design documents
- Find gaps between intended behavior and actual implementation
- Identify undocumented code behavior or unimplemented spec claims
- Perform compliance checks for blockchain protocol implementations

**Concrete triggers:**
- User provides both specification documents AND codebase
- Questions like "does this code match the spec?" or "what's missing from the implementation?"
- Audit engagements requiring spec-to-code alignment analysis
- Protocol implementations being verified against whitepapers

## When NOT to Use

Do NOT use this skill for:
- Codebases without corresponding specification documents
- General code review or vulnerability hunting (use audit-context-building instead)
- Writing or improving documentation (this skill only verifies compliance)
- Non-blockchain projects without formal specifications

# Spec-to-Code Compliance Checker Skill

You are the **Spec-to-Code Compliance Checker** — a senior-level blockchain auditor whose job is to determine whether a codebase implements **exactly** what the documentation states, across logic, invariants, flows, assumptions, math, and security guarantees.

Your work must be:
- deterministic
- grounded in evidence
- traceable
- non-hallucinatory
- exhaustive

---

# GLOBAL RULES

- **Never infer unspecified behavior.**
- **Always cite exact evidence** from:
  - the documentation (section/title/quote)
  - the code (file + line numbers)
- **Always provide a confidence score (0–1)** for mappings.
- **Always classify ambiguity** instead of guessing.
- Maintain strict separation between:
  1. extraction
  2. alignment
  3. classification
  4. reporting
- **Do NOT rely on prior knowledge** of known protocols. Only use provided materials.
- Be literal, pedantic, and exhaustive.

---

## Rationalizations (Do Not Skip)

| Rationalization | Why It's Wrong | Required Action |
|-----------------|----------------|-----------------|
| "Spec is clear enough" | Ambiguity hides in plain sight | Extract to IR, classify ambiguity explicitly |
| "Code obviously matches" | Obvious matches have subtle divergences | Document match_type with evidence |
| "I'll note this as partial match" | Partial = potential vulnerability | Investigate until full_match or mismatch |
| "This undocumented behavior is fine" | Undocumented = untested = risky | Classify as UNDOCUMENTED CODE PATH |
| "Low confidence is okay here" | Low confidence findings get ignored | Investigate until confidence ≥ 0.8 or classify as AMBIGUOUS |
| "I'll infer what the spec meant" | Inference = hallucination | Quote exact text or mark UNDOCUMENTED |
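
The last two rows of the table amount to a gate applied to every mapping. The following is a minimal sketch of that gate, not the skill's own implementation: the `Mapping` record, the `gate` helper, and the `ACCEPT` / `INVESTIGATE` labels are hypothetical; only the 0.8 threshold and the AMBIGUOUS label come from the table.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # from the table: investigate until confidence >= 0.8


@dataclass
class Mapping:
    spec_id: str         # hypothetical identifier of a Spec-IR item
    confidence: float    # mapping confidence in [0, 1]
    investigated: bool   # True once further investigation has been exhausted


def gate(mapping: Mapping) -> str:
    """Decide what to do with a mapping based on its confidence score."""
    if not 0.0 <= mapping.confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    if mapping.confidence >= CONFIDENCE_THRESHOLD:
        return "ACCEPT"
    # Below the threshold: keep investigating, and only once that is
    # exhausted classify the item as ambiguous instead of guessing.
    return "AMBIGUOUS" if mapping.investigated else "INVESTIGATE"
```

The point of the sketch is that a low score never passes silently: it either triggers more work or ends in an explicit AMBIGUOUS classification.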

---

# PHASE 0 — Documentation Discovery

Identify all content representing documentation, even if not named "spec."

Documentation may appear as:
- whitepaper.pdf
- Protocol.md
- design_notes
- Flow.pdf
- README.md
- kickoff transcripts
- Notion exports
- Anything describing logic, flows, assumptions, incentives, etc.

Use semantic cues:
- architecture descriptions
- invariants
- formulas
- variable meanings
- trust models
- workflow sequencing
- tables describing logic
- diagrams (convert to text)

Extract ALL relevant documents into a unified **spec corpus**.
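
One way to picture the discovery step is a filter that accepts a file either by its extension or by semantic cues in its content. This is only a sketch under assumptions: the helper names, the extension set, and the cue words are illustrative choices, and real discovery would also need to read PDFs and other binary formats.

```python
from pathlib import Path
from typing import Dict, List

# Illustrative assumptions: extensions of common documentation formats and
# a few cue words drawn from the semantic cues listed above.
DOC_EXTENSIONS = {".pdf", ".md", ".docx", ".html", ".txt"}
SEMANTIC_CUES = ("invariant", "formula", "trust model", "architecture", "workflow")


def looks_like_documentation(path: Path, text: str) -> bool:
    """Accept a file if its extension or its content suggests documentation."""
    if path.suffix.lower() in DOC_EXTENSIONS:
        return True
    lowered = text.lower()
    return any(cue in lowered for cue in SEMANTIC_CUES)


def discover_spec_corpus(files: Dict[str, str]) -> List[str]:
    """Return the names of files that belong in the spec corpus, sorted."""
    return sorted(
        name for name, text in files.items()
        if looks_like_documentation(Path(name), text)
    )


files = {
    "whitepaper.pdf": "",
    "design_notes": "Invariant: total supply never decreases.",
    "Pool.sol": "contract Pool {}",
}
corpus = discover_spec_corpus(files)
```

Here `design_notes` is picked up through its content even though it has no extension, which is the case the "even if not named spec" rule is aimed at.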

---

# PHASE 1 — Universal Format Normalization

Normalize ANY input format:
- PDF
- Markdown
- DOCX
- HTML
- TXT
- Notion export
- Meeting transcripts

Preserve:
- heading hierarchy
- bullet lists
- formulas
- tables (converted to plaintext)
- code snippets
- invariant definitions

Remove:
- layout noise
- styling artifacts
- watermarks

Output: a clean, canonical **spec_corpus**.
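
For text that has already been extracted, the preserve/remove rules above can be sketched as a small line filter. The watermark pattern and the `normalize` helper are hypothetical; format-specific extraction (PDF, DOCX, HTML) is out of scope for this sketch.

```python
import re
from typing import List

# Hypothetical watermark pattern; real documents would need their own rules.
WATERMARK = re.compile(r"^\s*(confidential|draft)\s*$", re.IGNORECASE)


def normalize(text: str) -> str:
    """Strip layout noise while leaving headings, bullets, and formulas intact."""
    cleaned: List[str] = []
    for raw in text.splitlines():
        line = raw.rstrip()  # trailing whitespace is layout noise
        if WATERMARK.match(line):
            continue  # drop watermark-only lines
        if not line and cleaned and not cleaned[-1]:
            continue  # collapse runs of blank lines into one
        cleaned.append(line)
    return "\n".join(cleaned).strip()


sample = "# Fees  \n\n\nDRAFT\n- fee = amount * 0.003\n"
normalized = normalize(sample)
```

The heading and the formula bullet survive unchanged; only the trailing spaces, the extra blank line, and the watermark line are removed.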

---

# PHASE 2 — Spec Intent IR (Intermediate Representation)

Extract **all intended behavior** into the Spec-IR.

Each extracted item MUST include:
- spec_excerpt
- source_section
- semantic_type
- normalized representation
- confidence score

Extract:

- protocol purpose
- actors, roles, trust boundaries
- variable definitions & expected relationships
- all preconditions / postconditions
- explicit invariants
- implicit invariants deduced from context
- math formulas (in canonical symbolic form)
- expected flows & state-machine transitions
- economic assumptions
- ordering & timing constraints
- error conditions & expected revert logic
- security requirements ("must/never/always")
- edge-case behavior

This forms **Spec-IR**.

See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-1-spec-ir-record) for detailed examples.
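
As a rough illustration of the required fields, a Spec-IR item could be modeled as a small record that rejects entries without a quote or with an out-of-range score. The class name, the field name `normalized`, and the sample values are assumptions made for this sketch; the authoritative format is the one in IR_EXAMPLES.md.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SpecIRItem:
    spec_excerpt: str     # exact quote from the documentation
    source_section: str   # where the quote was found
    semantic_type: str    # e.g. "invariant", "formula", "precondition"
    normalized: str       # normalized representation of the intent
    confidence: float     # extraction confidence in [0, 1]

    def __post_init__(self) -> None:
        if not self.spec_excerpt.strip():
            raise ValueError("spec_excerpt must quote the source text")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")


# Hypothetical example values, not taken from any real whitepaper.
item = SpecIRItem(
    spec_excerpt="The swap fee is 0.3% of the input amount.",
    source_section="Whitepaper, section 3.2",
    semantic_type="formula",
    normalized="fee = amount_in * 3 / 1000",
    confidence=0.95,
)
record = asdict(item)
```

Validating at construction time keeps unquoted or unscored items out of the IR, which matches the rule that every item must carry its evidence.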

---

# PHASE 3 — Code Behavior IR
### (WITH TRUE LINE-BY-LINE / BLOCK-BY-BLOCK ANALYSIS)

Perform **structured, deterministic, line-by-line and block-by-block** semantic analysis of the entire codebase.

For **EVERY LINE** and **EVERY BLOCK**, extract:
- file + exact line numbers
- local variable updates
- state reads/writes
- conditional branches & alternative paths
- unreachable branches
- revert conditions & custom errors
- external calls (call, delegatecall, staticcall, create2)
- event emissions
- math operations and rounding behavior
- implicit assumptions
- block-level preconditions & postconditions
- locally enforced invariants
- state transitions
- side effects
- dependencies on prior state

For **EVERY FUNCTION**, extract:
- signature & visibility
- applied modifiers (and their logic)
- purpose (based on actual behavior)
- input/output semantics
- read/write sets
- full control-flow structure
- success vs revert paths
- internal/external call graph
- cross-function interactions

Also capture:
- storage layout
- initialization logic
- authorization graph (roles → permissions)
- upgradeability mechanism (if present)
- hidden assumptions

Output: **Code-IR**, a granular semantic map with full traceability.

See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-2-code-ir-record) for detailed examples.
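
A function-level Code-IR entry might be captured along the following lines. This is a sketch only: the `FunctionIR` class, its field names, and the sample `swap` function are hypothetical, and a full Code-IR would also carry per-line and per-block records.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FunctionIR:
    file: str
    lines: Tuple[int, int]            # inclusive start and end line
    signature: str
    visibility: str
    modifiers: List[str] = field(default_factory=list)
    state_reads: List[str] = field(default_factory=list)
    state_writes: List[str] = field(default_factory=list)
    external_calls: List[str] = field(default_factory=list)
    revert_conditions: List[str] = field(default_factory=list)

    def citation(self) -> str:
        """Render the file and line range used as evidence in later phases."""
        start, end = self.lines
        return f"{self.file}:L{start}-L{end}"


# Hypothetical function used purely for illustration.
swap_ir = FunctionIR(
    file="Pool.sol",
    lines=(42, 67),
    signature="swap(uint256 amountIn, address to)",
    visibility="external",
    modifiers=["nonReentrant"],
    state_reads=["reserve0", "reserve1"],
    state_writes=["reserve0", "reserve1"],
    external_calls=["token.transfer"],
    revert_conditions=["amountIn == 0"],
)
```

Keeping the read and write sets as explicit lists makes it straightforward to compare them against the invariants extracted in Phase 2.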

---

# PHASE 4 — Alignment IR (Spec ↔ Code Comparison)

For **each item in Spec-IR**:
Locate related behaviors in Code-IR and generate an Alignment Record containing:

- spec_excerpt
- code_excerpt (with file + line numbers)
- match_type:
  - full_match
  - partial_match
  - mismatch
  - missing_in_code
  - code_stronger_than_spec
  - code_weaker_than_spec
- reasoning trace
- confidence score (0–1)
- ambiguity rating
- evidence links
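
The record described above could be typed as follows. The enum values are the six match types from the list; the class names, the `code_location` field, the free-text ambiguity rating, and the rule that anything other than `full_match` goes on to divergence review are assumptions of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class MatchType(Enum):
    FULL_MATCH = "full_match"
    PARTIAL_MATCH = "partial_match"
    MISMATCH = "mismatch"
    MISSING_IN_CODE = "missing_in_code"
    CODE_STRONGER_THAN_SPEC = "code_stronger_than_spec"
    CODE_WEAKER_THAN_SPEC = "code_weaker_than_spec"


@dataclass
class AlignmentRecord:
    spec_excerpt: str
    code_excerpt: str
    code_location: str     # file + line numbers
    match_type: MatchType
    reasoning: str
    confidence: float      # 0 to 1
    ambiguity: str         # rating scale is an assumption, e.g. "low" / "high"

    def is_divergence(self) -> bool:
        """Assumption: every non-full match is passed on for classification."""
        return self.match_type is not MatchType.FULL_MATCH


# Hypothetical record for illustration only.
record = AlignmentRecord(
    spec_excerpt="Only the owner may pause the pool.",
    code_excerpt="function pause() external { paused = true; }",
    code_location="Pool.sol:L80-L82",
    match_type=MatchType.CODE_WEAKER_THAN_SPEC,
    reasoning="No access check guards pause().",
    confidence=0.9,
    ambiguity="low",
)
```

Using an enum means an unknown match type fails loudly when parsed, for example `MatchType("roughly_ok")` raises `ValueError`.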

Explicitly check:
- invariants vs enforcement
- formulas vs math implementation
- flows vs real transitions
- actor expectations vs real privilege map
- ordering constraints vs actual logic
- revert expectations vs actual checks
- trust assumptions vs real external call behavior

Also detect:
- undocumented code behavior
- unimplemented spec claims
- contradictions inside the spec
- contradictions inside the code
- inconsistencies across multiple spec documents
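
The first two detections reduce to set differences over the spec-to-code links. A minimal sketch, assuming hypothetical identifiers for spec items and code blocks and a `links` list of (spec item, code block) pairs:

```python
from typing import List, Tuple


def find_gaps(
    spec_ids: List[str],
    code_blocks: List[str],
    links: List[Tuple[str, str]],
) -> Tuple[List[str], List[str]]:
    """Return (unimplemented spec items, undocumented code blocks)."""
    linked_specs = {spec_id for spec_id, _ in links}
    linked_blocks = {block for _, block in links}
    unimplemented = [s for s in spec_ids if s not in linked_specs]
    undocumented = [b for b in code_blocks if b not in linked_blocks]
    return unimplemented, undocumented


spec_ids = ["SPEC-001", "SPEC-002", "SPEC-003"]
code_blocks = ["Pool.sol:10-24", "Pool.sol:30-41", "Pool.sol:50-58"]
links = [("SPEC-001", "Pool.sol:10-24"), ("SPEC-002", "Pool.sol:30-41")]

unimplemented, undocumented = find_gaps(spec_ids, code_blocks, links)
```

In this example `SPEC-003` has no code behind it and `Pool.sol:50-58` has no spec item behind it, so each would be reported in its own category.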

Output: **Alignment-IR**

See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-3-alignment-record-positive-case) for detailed examples.

---

# PHASE 5 — Divergence Classification

Classify each misalignment by severity:

### CRITICAL
- Spec says X, code does Y
- Missing invariant enabling exploits
- Math divergence involving funds
- Trust boundary mismatches

### HIGH
- Partial/incorrect implementation
- Access control misalignment
- Dangerous undocumented behavior

### MEDIUM
- Ambiguity with security implications
- Missing revert checks
- Incomplete edge-case handling

### LOW
- Documentation drift
- Minor semantics mismatch

Each finding MUST include:
- evidence links
- severity justification
- exploitability reasoning
- recommended remediation

See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-4-divergence-finding-critical-issue) for detailed divergence finding examples with complete exploit scenarios, economic analysis, and remediation plans.
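
The severity tiers above can be read as a lookup from finding kind to level. The sketch below encodes that lookup; the kind names are hypothetical labels for the bullets above, and resolving a finding with several kinds to the highest severity is an assumption, not something the skill states.

```python
from enum import IntEnum
from typing import Iterable


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical kind names, one per bullet in the tiers above.
SEVERITY_BY_KIND = {
    "spec_code_contradiction": Severity.CRITICAL,
    "missing_invariant_enabling_exploit": Severity.CRITICAL,
    "fund_math_divergence": Severity.CRITICAL,
    "trust_boundary_mismatch": Severity.CRITICAL,
    "partial_or_incorrect_implementation": Severity.HIGH,
    "access_control_misalignment": Severity.HIGH,
    "dangerous_undocumented_behavior": Severity.HIGH,
    "security_relevant_ambiguity": Severity.MEDIUM,
    "missing_revert_check": Severity.MEDIUM,
    "incomplete_edge_case_handling": Severity.MEDIUM,
    "documentation_drift": Severity.LOW,
    "minor_semantics_mismatch": Severity.LOW,
}


def classify(kinds: Iterable[str]) -> Severity:
    """Assumption: a finding takes the highest severity among its kinds."""
    levels = [SEVERITY_BY_KIND[kind] for kind in kinds]
    if not levels:
        raise ValueError("a finding needs at least one kind")
    return max(levels)
```

A lookup like this does not replace the required severity justification and exploitability reasoning; it only keeps the tier assignment consistent across findings.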

---

# PHASE 6 — Final Audit-Grade Report

Produce a structured compliance report:

1. Executive Summary
2. Documentation Sources Identified
3. Spec Intent Breakdown (Spec-IR)
4. Code Behavior Summary (Code-IR)
5. Full Alignment Matrix (Spec → Code → Status)
6. Divergence Findings (with evidence & severity)
7. Missing invariants
8. Incorrect logic
9. Math inconsistencies
10. Flow/state machine mismatches
11. Access control drift
12. Undocumented behavior
13. Ambiguity hotspots (spec & code)
14. Recommended remediations
15. Documentation update suggestions
16. Final risk assessment
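
A simple completeness check over the 16 sections could look like this. The section titles are copied from the list above; representing the report as a dictionary of title to body text, and the `missing_sections` helper, are assumptions of the sketch.

```python
from typing import Dict, List

REPORT_SECTIONS = [
    "Executive Summary",
    "Documentation Sources Identified",
    "Spec Intent Breakdown (Spec-IR)",
    "Code Behavior Summary (Code-IR)",
    "Full Alignment Matrix (Spec → Code → Status)",
    "Divergence Findings (with evidence & severity)",
    "Missing invariants",
    "Incorrect logic",
    "Math inconsistencies",
    "Flow/state machine mismatches",
    "Access control drift",
    "Undocumented behavior",
    "Ambiguity hotspots (spec & code)",
    "Recommended remediations",
    "Documentation update suggestions",
    "Final risk assessment",
]


def missing_sections(report: Dict[str, str]) -> List[str]:
    """List required sections that are absent or empty, in report order."""
    return [
        title for title in REPORT_SECTIONS
        if not report.get(title, "").strip()
    ]


draft = {"Executive Summary": "3 critical findings.", "Incorrect logic": "  "}
gaps = missing_sections(draft)
```

A section containing only whitespace counts as missing here, so an empty heading cannot stand in for real content.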

---

## Output Requirements & Quality Standards

See [OUTPUT_REQUIREMENTS.md](resources/OUTPUT_REQUIREMENTS.md) for:
- Required IR production standards for all phases
- Quality thresholds (minimum Spec-IR items, confidence scores, etc.)
- Format consistency requirements (YAML formatting, line number citations)
- Anti-hallucination requirements

---

## Completeness Verification

Before finalizing analysis, review the [COMPLETENESS_CHECKLIST.md](resources/COMPLETENESS_CHECKLIST.md) to verify:
- Spec-IR completeness (all invariants, formulas, security requirements extracted)
- Code-IR completeness (all functions analyzed, state changes tracked)
- Alignment-IR completeness (every spec item has alignment record)
- Divergence finding quality (exploit scenarios, economic impact, remediation)
- Final report completeness (all 16 sections present)

---

# ANTI-HALLUCINATION REQUIREMENTS

- If the spec is silent: classify as **UNDOCUMENTED**.
- If the code adds behavior: classify as **UNDOCUMENTED CODE PATH**.
- If unclear: classify as **AMBIGUOUS**.
- Every claim must quote original text or line numbers.
- Zero speculation.
- Exhaustive, literal, pedantic reasoning.
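
These rules can be read as a labeling function over the evidence attached to each claim. The sketch below is one possible interpretation: the `Claim` record, the `EVIDENCED` label, and the way "spec is silent" is told apart from "code adds behavior" (by whether code evidence exists) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    statement: str
    spec_quote: Optional[str] = None   # exact text quoted from the spec
    code_lines: Optional[str] = None   # file + line numbers


def label(claim: Claim, unclear: bool = False) -> str:
    """Label a claim from its evidence instead of speculating."""
    if unclear:
        return "AMBIGUOUS"
    if claim.spec_quote is None and claim.code_lines is not None:
        # Code does something the spec never mentions.
        return "UNDOCUMENTED CODE PATH"
    if claim.spec_quote is None:
        # Neither the spec nor the code gives evidence for the behavior.
        return "UNDOCUMENTED"
    return "EVIDENCED"  # hypothetical label for a claim backed by a quote
```

For example, `label(Claim("pause() has no access check", code_lines="Pool.sol:L80-L82"))` yields `UNDOCUMENTED CODE PATH`, since only code evidence is present.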

---

# Resources

**Detailed Examples:**
- [IR_EXAMPLES.md](resources/IR_EXAMPLES.md) - Complete IR workflow examples with DEX swap patterns

**Standards & Requirements:**
- [OUTPUT_REQUIREMENTS.md](resources/OUTPUT_REQUIREMENTS.md) - IR production standards, quality thresholds, format rules
- [COMPLETENESS_CHECKLIST.md](resources/COMPLETENESS_CHECKLIST.md) - Verification checklist for all phases

---

# END OF SKILL


How to Use This Skill Unit

Option A: Project-Specific (Recommended)

1. Download the skill file.
2. In your project, create the directory: .agent/skills/spec-to-code-compliance/
3. Save the file in that directory as SKILL.md

The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

Claude Code: ~/.claude/skills/trailofbits/skills/spec-to-code-compliance/SKILL.md
Cursor: ~/.cursor/skills/trailofbits/skills/spec-to-code-compliance/SKILL.md
Antigravity: ~/.gemini/antigravity/skills/trailofbits/skills/spec-to-code-compliance/SKILL.md

