backprop-trace verifies one training step.
Hand it a receipt naming every factor that contributed to one gradient update. The reconciler re-derives every claim from those factors and rejects on disagreement. In the Csmith/CompCert lineage: the oracle must not consult the artifact it judges. v1.0.0 covers the deterministic-CPU corner end to end — SGD/Adam/AdamW/SGD-momentum plus SGD coupled-L2 weight decay, live PyTorch and JAX helpers, a recognizable hero classifier fixture, rule-coverage observability, and a worked EU AI Act compliance bundle.
Verify
npx bp verify mazur
# exit 0 — schema + reconcile + engine-reproduce + byte-equal-vs-golden
Reject
npx bp reconcile receipt fixtures/bad/mazur.bad-gradient.jsonl
# exit 1 — Rule 4: update.gradient mismatch on w5
Auditable PASS
npx bp verify general fixtures/hero-classifier.golden.jsonl --json
# exit 0 — 9-pixel -> 16-ReLU -> 4-class softmax glyph classifier
# --json carries rules_evaluated / gated_off: WHICH rules the PASS exercised
What it does
26 rules. Per-step structural consistency. Adversarial corpora prove the verifier — every rule ships with a paired bad fixture. v1.0.0 covers the deterministic-CPU corner end to end.
26-rule reconciler
Re-derives gradients, error signals, parameter updates, Adam/AdamW moment state, and PyTorch-style SGD momentum buffer (classical + Nesterov + dampening) from named factors. Within hybrid tolerance (atol + rtol) — clamped to a verifier-owned ceiling before any rule runs. A receipt can tighten its pass band, never loosen it.
SGD coupled-L2 weight decay (v1.0.0)
Rule 7’s third branch covers plain SGD and SGD-momentum with weight_decay > 0. Coupled, not decoupled — the decay folds into the gradient and enters the momentum buffer, the deliberate opposite of AdamW. Cross-family-verified math, validated against real PyTorch. New additive schemas receipt.v0.8.0 + framework-trace.v0.8.0.
Live PyTorch and JAX helpers (observer-only)
Single auditable Python files. scripts/extract/pytorch.py covers SGD/Adam/AdamW/sgd_momentum (with the momentum_buffer ascent→descent sign-flip). scripts/extract/jax.py adds a stronger trust boundary — it folds a jax.make_jaxpr(jax.grad(loss)) digest into the forensic block (the inspectable gradient graph PyTorch eager lacks) and refuses to run without jax_enable_x64 + CPU. Rule 14 (engine-recompute) is the authority on every imported sidecar.
Recognizable hero fixture (v1.0.0)
fixtures/hero-classifier.golden.jsonl — a 9-pixel → 16-ReLU → 4-class softmax glyph classifier, single and multi-step, byte-reproducible on CPU. A receipt a reader can picture, not just a 2-2-2 toy. The v1.0 gate fixture, now shipped.
Rule-coverage observability (v1.0.0)
bp verify --json / --verbose now reports which of the 26 rules a PASS actually exercised (rules_evaluated) and which were applicable-but-gated-off because a feature block was absent (gated_off). A green PASS becomes auditable — you can see the 9 substantive rules that ran, not just trust that 26 might have.
Compliance audit bundle (v1.0.0)
A worked mapping of a receipt to EU AI Act Annex IV §2(g)/§2(b) test-log records and Article 15 robustness controls — honestly NOT Article 10 (data governance is out of scope). A receipt is the numerical-consistency leaf of a compliance tree; it composes below SLSA-for-ML / Sigstore, it does not replace them. See docs/compliance.md.
Sidecar ingestion
bp import pytorch | jax | tensorflow — single-step, multi-step (Rules 9/10), batched (Rules 18/19), Adam moments (22-24), SGD momentum (20/21/25/26). Per-step engine-recompute differential (Rule 14) is mandatory and marker-gated; multi-step self-declared skips are NON-PASS. float32 observer receipts pass within the observer tolerance band.
Canonical JSONL + distribution integrity
Decimal strings, schema-defined key order, 9-sig-fig byte-equal on Node 22.x, in-toto v1 attestation seam. pack-install smoke runs on every push across ubuntu + macos + windows — tarball contents, cold install, CLI behavior, stdin pipe semantics, all CI-gated. A dedicated jax-e2e job validates the JAX helper against real JAX.
v1.0.0 — the sober promotion
A comprehensive dogfood swarm took backprop-trace from honest mid-v0 to a v1.0.0 that meets the product’s own gate criteria — without overclaiming. Tests 792 → 940. CPU-only, deterministic. Conv / GPU stay out of the deterministic corner by design.
| v1.0.0 surface | What ships | Honest scope |
|---|---|---|
| Optimizers | SGD, Adam, AdamW, SGD-momentum (classical / Nesterov / dampening), and SGD coupled-L2 weight decay (Rule 7 third branch). | Coupled L2 folds decay into the gradient + momentum buffer — the deliberate opposite of AdamW decoupled decay. An unrecognized optimizer name is a Rule 0 structural reject, never a silent skip. |
| Live helpers | PyTorch (scripts/extract/pytorch.py) and JAX (scripts/extract/jax.py) — single auditable files, copy-and-read, no pip package. | Observer-only. Rule 14 (engine-recompute differential) is the authority on every sidecar. The JAX helper enforces jax_enable_x64 + CPU and records a make_jaxpr(grad) digest as a forensic — not credential — trust signal. |
| Hero fixture | fixtures/hero-classifier.golden.jsonl — a 9-pixel → 16-ReLU → 4-class softmax glyph classifier, single + multi-step. | A recognizable dense ReLU→softmax classifier, byte-reproducible on CPU. The v1.0 gate fixture. Conv / multi-hidden-layer topologies remain out — they fight bit-determinism. |
| Auditable PASS | bp verify --json / --verbose reports rules_evaluated and gated_off. | A green PASS records which substantive rules actually fired vs which were gated off because a feature block was absent — so an auditor sees the coverage, not just the verdict. |
| Compliance bundle | docs/compliance.md — a receipt mapped to EU AI Act Annex IV §2(g)/§2(b) + Article 15, wrapped in an in-toto v1 statement. | A receipt attests math, not data governance. It is the numerical-consistency leaf below model-signing — honestly NOT Article 10. Compose, don’t conflate. |
| Determinism boundary | Byte-equal on Node 22.x across ubuntu + macos + windows. A Math.exp(-0.5) canary fires on every CI cell. | GPU / fused-kernel bit-determinism is permanently out of scope (FP non-associativity, arXiv:2408.05148). The product is the deterministic CPU corner. |
Quick start
Install
pnpm add @mcptoolshop/backprop-trace
# or: npm install @mcptoolshop/backprop-trace Accept a good receipt
npx bp verify mazur
# exit 0 — Mazur 2-2-2 backprop walkthrough (Matt Mazur 2015)
# every number derivable by hand Reject a broken one
npx bp reconcile receipt \
node_modules/@mcptoolshop/backprop-trace/fixtures/bad/mazur.bad-gradient.jsonl
# exit 1 — Rule 4: update.gradient mismatch on w5
# (rejected BEFORE the verifier reads fixture_status) Verify the hero classifier (auditable PASS)
npx bp verify general \
node_modules/@mcptoolshop/backprop-trace/fixtures/hero-classifier.golden.jsonl --json
# exit 0 — 9-pixel -> 16-ReLU -> 4-class softmax glyph classifier
# --json carries rules_evaluated / gated_off:
# {"overall":"PASS","rules_evaluated":[1,2,3,4,5,6,7,8,11,12,13],"gated_off":[...]} Verify your own PyTorch training step
npx bp examples pytorch --print > pytorch_trace_helper.py
# in your training loop:
from pytorch_trace_helper import TraceDumper
dumper = TraceDumper(model, optimizer, loss_fn, out="trace.jsonl")
for x, y in loader:
with dumper.step(inputs={...}, targets={...}):
optimizer.zero_grad()
loss_fn(model(x), y).backward()
optimizer.step()
# verify:
npx bp import pytorch trace.jsonl | npx bp verify multi - Verify a JAX training step
# copy the single auditable file, read it, run it:
cp node_modules/@mcptoolshop/backprop-trace/scripts/extract/jax.py jax_trace_helper.py
# requires jax_enable_x64 + CPU (the helper refuses otherwise)
# from jax_trace_helper import TraceDumper
# with dumper.step(inputs=x, targets=y) as ctx: params = ctx.run(params)
npx bp import jax trace.jsonl | npx bp verify multi -