Grounding & Formal Methods

Five tools covering the stages most teams skip under pressure: discover what to build, require what it must do, specify its states and invariants, restructure what exists safely, and review the reasoning that produced the code. AI removes the time penalty from each stage. These tools put that hypothesis into practice — and are built to help engineers learn the underlying disciplines, not just apply them.

PR/FAQ

v1.6.0

Amazon's Working Backwards process, inside the terminal.

Generate, review, and stress-test Amazon Working Backwards documents with specialized agents, simulated review meetings, and compiled PDF output. See an example — the PR/FAQ for this tool, written by the tool itself.

Uses Quarry

Complete PR/FAQ document generation — press release, FAQs, four-risks assessment, feature appendix
Simulated review meeting with four agentic personas who debate weak spots
Researcher agent integrates with Quarry for semantic search
Cascading feedback engine traces effects across all sections
Compiles to print-ready PDF via LaTeX

$ curl -fsSL https://raw.githubusercontent.com/punt-labs/prfaq/ffc7dfd/install.sh | sh

Coming Soon

Use Cases

v0.0.0 alpha

Actors. Goals. Scenarios. Extensions. The spec that comes before the code.

Use Cases will seek to bring Jacobson and Cockburn's structured requirements methodology into Claude Code. The goal: identify the actors in your system, define what each is trying to accomplish, describe the step-by-step scenario that achieves their goal, and systematically work through what can go wrong at each step. The output is a specification document — not code.

Guided walkthrough: actors, goals, scenarios, extensions
Cockburn's goal-level hierarchy — summary, user, subfunction
Systematic extension handling at every scenario step
Specification document output, not generated code

Z Spec

v0.2.0 beta

Formal Z specifications for stateful systems.

Z is a specification language grounded in set theory and first-order predicate logic, developed at Oxford and standardized as ISO 13568. A Z spec describes a system as states, invariants that constrain them, and operations that transition between them — catching entire classes of bugs mathematically, not just the inputs you happened to test. Z Spec makes two powerful tools approachable inside Claude Code: Spivey's fuzz type-checker and the ProB animator/model-checker from HHU Düsseldorf. Extract specs from existing code, generate code from specs, type-check, and model-check — without learning the toolchain.

Bidirectional: code2model extracts specs, model2code generates implementations
Type-checks with fuzz — Spivey's Z type-checker from Oxford
Animates and model-checks with ProB — state-space exploration from HHU Düsseldorf
Derives test cases from specs using TTF testing tactics
Lean 4 proof obligations, runtime contracts, and property-based oracle testing
B-Method support: create, type-check, animate, and refine B machines

$ curl -fsSL https://raw.githubusercontent.com/punt-labs/z-spec/02d6125/install.sh | sh

Coming Soon

Refactory

v0.0.0 alpha

refactoring Behavior-preserving program transformation, not just "clean up code."

Refactory will seek to bring Opdyke's original definition of refactoring — behavior-preserving program transformations with formally defined preconditions — into an AI-assisted workflow. The AI decides what needs restructuring; Refactory would execute on a parsed program model where each transformation is proven safe before it runs.

Built on Opdyke's formal definition — each refactoring has preconditions that must hold
Operates on a parsed program model, not text substitution
AI selects the transformation; Refactory guarantees safety
Composable — chain small, proven-safe transformations into large restructurings

Coming Soon

ReasonTrace

v0.0.0 alpha

provenance Session recordings as first-class engineering assets.

ReasonTrace will seek to turn every AI coding session into a searchable, reviewable engineering asset. The idea: capture the full terminal recording and a structured event log, extract reasoning decisions, and run automated review agents that check the process against project standards and formal specifications. Existing tools capture what changed (Git) or what was decided (Entire.io). ReasonTrace would capture how the agent actually reasoned — and whether that reasoning was sound.

Reviews against Z Spec

Dual-stream capture — terminal recording for humans, structured event log for agents
Reasoning extraction — tool calls, decision points, error-recovery cycles, phase transitions
Automated review agents — check reasoning against standards, policies, and formal specs
Semantic search across sessions — find decisions by concept, not just keyword