Architecture
Mental model
AI-UI answers one question: “Can users find the features your docs promise?”
It does this in three phases:
- Extract — Parse your docs into a feature catalog (atlas)
- Discover — Crawl your UI and record every interactive trigger (probe)
- Match — Fuzzy-match features to triggers and compute coverage (diff)
Everything after that — graph, design-map, goals, replays — is analysis on top of these three inputs.
Core concepts
- Feature — A capability your docs claim the product has. Extracted from markdown headings and content.
- Trigger — An interactive UI element. Buttons, links, inputs, checkboxes, dialogs.
- Coverage — The percentage of documented features that have at least one discoverable trigger.
- Burial index — How deep a feature is in the navigation hierarchy. Higher = harder to find.
- Goal — An observable effect that proves a user completed a meaningful action (storage write, DOM mutation, fetch call).
- Task flow — An inferred sequence of navigation steps a user might take.
- Design map — The full diagnostic output: surface inventory, feature map, task flows, IA proposal.
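The coverage and burial-index definitions above can be made concrete with a small sketch. The record shapes (`name`, `triggers`, `depth`) and the function names are assumptions for illustration, not the actual structures in `cli/src/types.mjs`. Burial is approximated here as the shallowest navigation depth among a feature's triggers, which may differ from how AI-UI computes it.

```javascript
// Hypothetical shapes: each feature lists the triggers matched to it, and each
// trigger carries the navigation depth at which it was found.
const features = [
  { name: 'Export to CSV', triggers: [{ label: 'Export', depth: 1 }] },
  { name: 'Dark mode', triggers: [{ label: 'Theme', depth: 3 }, { label: 'Dark', depth: 4 }] },
  { name: 'Keyboard shortcuts', triggers: [] },
  { name: 'Bulk delete', triggers: [] },
];

// Coverage: percentage of documented features with at least one trigger.
function coverage(featureList) {
  if (featureList.length === 0) return 0;
  const covered = featureList.filter((f) => f.triggers.length > 0).length;
  return (covered / featureList.length) * 100;
}

// Burial (assumed definition): the shallowest depth at which any trigger for
// the feature appears; null when the feature has no trigger at all.
function burialIndex(feature) {
  if (feature.triggers.length === 0) return null;
  return Math.min(...feature.triggers.map((t) => t.depth));
}
```

With this sample data, two of the four features have a trigger, so coverage is 50%, and "Dark mode" is buried three levels deep.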
Inputs, processing, outputs
| Inputs | Processing | Outputs |
|---|---|---|
| README.md, docs/*.md | atlas (markdown-it) | atlas.json |
| Running dev server | probe (Playwright) | probe.jsonl |
| WebSketch captures | surfaces | surfaces.json |
| atlas + probe | diff (fuzzy match) | diff.json, diff.md |
| diff + probe + surfaces | graph (DAG builder) | trigger-graph.json |
| graph + config | design-map | design-map.md |
| graph + browser | runtime-effects | effects data |
| two replay packs | replay-diff | delta report |

Key files
| File | Purpose |
|---|---|
| `cli/bin/ai-ui.mjs` | CLI entry point — argument parsing, command dispatch |
| `cli/src/config.mjs` | Config loading, defaults, validation, `fail()` error helper |
| `cli/src/atlas.mjs` | Doc parser — markdown-it → feature catalog |
| `cli/src/probe.mjs` | Browser crawler — Playwright → trigger records |
| `cli/src/diff.mjs` | Feature-trigger matcher — fuzzy matching + burial index |
| `cli/src/trigger-graph.mjs` | Trigger graph builder — DAG from probe + surfaces + diff |
| `cli/src/design-map.mjs` | Design diagnostic engine — surface inventory, task flows, IA, goal rules |
| `cli/src/verify.mjs` | CI verification — coverage gates, baseline comparison |
| `cli/src/runtime-effects.mjs` | Runtime capture — click triggers, observe side effects |
| `cli/src/types.mjs` | JSDoc typedefs — the type system for the entire pipeline |
Design decisions
Why no LLM?
Determinism. The same docs + the same UI = the same report, every time. LLMs introduce variance that makes CI gates unreliable. If your coverage drops from 64% to 62%, you need to know it’s because a button was removed — not because the model had a different day.
AI-UI uses fuzzy string matching, not semantic similarity. It’s less clever but perfectly reproducible.
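To illustrate why fuzzy string matching is reproducible, here is a minimal sketch using Sørensen-Dice similarity over character bigrams. This is a swapped-in technique chosen for illustration: the source does not say which algorithm `cli/src/diff.mjs` uses, and the function names and the 0.5 threshold are assumptions.

```javascript
// Normalize: lowercase, replace punctuation with spaces, trim the ends.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Character bigrams of the normalized string, counted as a multiset.
function bigrams(text) {
  const s = normalize(text);
  const counts = new Map();
  for (let i = 0; i < s.length - 1; i++) {
    const pair = s.slice(i, i + 2);
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }
  return counts;
}

// Dice similarity in [0, 1]: 2 * |A ∩ B| / (|A| + |B|).
function similarity(a, b) {
  const A = bigrams(a);
  const B = bigrams(b);
  let sizeA = 0;
  let sizeB = 0;
  let overlap = 0;
  for (const n of A.values()) sizeA += n;
  for (const n of B.values()) sizeB += n;
  if (sizeA + sizeB === 0) return 0;
  for (const [pair, n] of A) overlap += Math.min(n, B.get(pair) || 0);
  return (2 * overlap) / (sizeA + sizeB);
}

// Pick the best-scoring trigger label for a feature. Ties resolve to the
// earliest label, so the result depends only on the inputs and their order.
function bestMatch(featureName, triggerLabels, threshold = 0.5) {
  let best = null;
  for (const label of triggerLabels) {
    const score = similarity(featureName, label);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { label, score };
    }
  }
  return best;
}
```

Because every step is pure arithmetic on the input strings, calling `bestMatch('Export to CSV', ['Settings', 'Export CSV', 'Help'])` picks `'Export CSV'` on every run, which is the property a CI gate depends on.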
Why Playwright?
Probe needs to see what a real user sees. Inspecting static HTML misses dynamic content, lazy-loaded components, and client-side routing. Playwright gives us a real Chromium browser that renders JavaScript, handles SPA routing, and interacts with the page.
Why JSONL for probe output?
Probe can generate thousands of triggers. JSONL (one JSON object per line) makes it easy to stream, filter, and process without loading everything into memory. It also makes diffs readable in version control.
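The streaming property can be sketched with a generator that parses one line at a time from arbitrary text chunks. The record fields (`kind`, `label`) are hypothetical; the real `probe.jsonl` schema may differ. In practice the chunks would come from a file stream rather than an in-memory array.

```javascript
// Yield one parsed record per JSONL line from an iterable of text chunks.
// A partial line at the end of a chunk is carried over to the next one, so
// only the current line is buffered rather than the whole file.
function* parseJsonl(chunks) {
  let buffer = '';
  for (const chunk of chunks) {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) yield JSON.parse(line);
    }
  }
  const last = buffer.trim();
  if (last) yield JSON.parse(last); // final line without a trailing newline
}

// Hypothetical trigger records, deliberately split mid-line across chunks.
const chunks = [
  '{"kind":"button","label":"Save"}\n{"kind":"li',
  'nk","label":"Docs"}\n',
  '{"kind":"button","label":"Export"}',
];

// Filter while streaming: keep only the labels of button triggers.
const buttons = [];
for (const record of parseJsonl(chunks)) {
  if (record.kind === 'button') buttons.push(record.label);
}
```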
Why three-tier goal detection?
We don’t want to break existing behavior when adding new features:
- Route goals (always active) — traditional URL-change detection
- Rule goals (when `goalRules` is configured) — effect-based detection for SPAs
- Legacy effect goals (fallback when no rules) — binary “did anything happen?” check
If you configure goal rules, the legacy fallback is disabled. If you don’t, everything works exactly as before.
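The tier selection described above can be sketched as a single function. Everything about the data shapes is an assumption: the `observation` object, the `effects` entries, and the rule fields (`id`, `kind`, `match`) are hypothetical, and an empty `goalRules` array is treated here as "not configured".

```javascript
// Hypothetical shapes: `observation` records what was seen after a trigger
// fired; `config.goalRules` is an optional list of effect-matching rules.
function detectGoals(observation, config = {}) {
  const goals = [];

  // Tier 1: route goals are always active.
  if (observation.urlBefore !== observation.urlAfter) {
    goals.push({ tier: 'route', detail: observation.urlAfter });
  }

  const rules = config.goalRules || [];
  if (rules.length > 0) {
    // Tier 2: once rules are configured, only rule matches count as effect
    // goals, and the legacy fallback below is skipped.
    for (const rule of rules) {
      const hit = observation.effects.some(
        (e) => e.kind === rule.kind && e.target.includes(rule.match)
      );
      if (hit) goals.push({ tier: 'rule', detail: rule.id });
    }
  } else if (observation.effects.length > 0) {
    // Tier 3: legacy binary check, used only when no rules are configured.
    goals.push({ tier: 'legacy-effect', detail: 'effect observed' });
  }

  return goals;
}
```

Keeping the legacy branch behind the `else` is what preserves the old behavior for projects that never add `goalRules`.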
Why split atlas/probe/diff instead of one command?
Each stage is independently useful and independently testable. You might want to:
- Run atlas without a dev server (just parse docs)
- Run probe against a staging environment
- Re-run diff after changing feature aliases
Splitting also makes CI caching possible — if your docs haven’t changed, skip atlas.
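One way a CI job could decide whether to skip atlas is to fingerprint the docs and compare against a cached value. This is a sketch of the idea only: AI-UI is not stated to ship such a helper, the function names are hypothetical, and the FNV-1a-style hash stands in for whatever your CI cache actually keys on (a SHA-256 digest would be the more usual choice).

```javascript
// Deterministic fingerprint over a list of { path, content } doc files.
// Files are sorted by path so that listing order does not affect the result.
function fingerprint(files) {
  const sorted = [...files].sort((a, b) => {
    if (a.path < b.path) return -1;
    if (a.path > b.path) return 1;
    return 0;
  });
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (const { path, content } of sorted) {
    const text = path + '\u0000' + content + '\u0000';
    for (const ch of text) {
      hash ^= ch.codePointAt(0);
      hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep 32 bits
    }
  }
  return hash.toString(16).padStart(8, '0');
}

// Re-run atlas only when the docs fingerprint differs from the cached one.
function shouldRunAtlas(files, cachedFingerprint) {
  return fingerprint(files) !== cachedFingerprint;
}
```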
Extensibility
Adding a new command
- Create `cli/src/my-command.mjs` with a default export function
- Add it to the command dispatch in `cli/bin/ai-ui.mjs`
- Add tests in `cli/test/`
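The steps above might look roughly like the following. The actual dispatch code in `cli/bin/ai-ui.mjs` is not shown in this document, so the handler table, the `(config, args)` signature, and the `baseUrl` field are all assumptions. Both files are collapsed into one snippet to keep the sketch self-contained.

```javascript
// Stand-in for cli/src/my-command.mjs. In the real module this function would
// be the default export; the (config, args) signature is an assumption.
function myCommand(config, args) {
  const target = args[0] || config.baseUrl;
  return { command: 'my-command', target };
}

// Stand-in for the dispatch in cli/bin/ai-ui.mjs: a name-to-handler table.
const commands = new Map([['my-command', myCommand]]);

function dispatch(argv, config) {
  const [name, ...args] = argv;
  const handler = commands.get(name);
  if (!handler) {
    throw new Error(`Unknown command: ${name}`);
  }
  return handler(config, args);
}

// A test in cli/test/ could exercise the command through the dispatcher.
const result = dispatch(['my-command'], { baseUrl: 'http://localhost:5173' });
```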
Adding a new goal rule kind
- Add the kind to `GoalRuleKind` in `cli/src/types.mjs`
- Add a `matchMyRule()` function in `cli/src/design-map.mjs`
- Wire it into `matchSingleRule()`
- Add tests
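A rough sketch of those steps follows. Only the names `GoalRuleKind`, `matchMyRule()`, and `matchSingleRule()` come from this document. The existing kind names, the `(rule, effect)` signatures, and the rule and effect fields are assumptions made so the example runs on its own.

```javascript
/**
 * Step 1 (cli/src/types.mjs): extend the union with the new kind.
 * The existing kind name listed here is a placeholder, not the real one.
 * @typedef {'storageWrite' | 'myRule'} GoalRuleKind
 */

// Step 2 (cli/src/design-map.mjs): matcher for the new kind. The `expect`
// and `detail` fields are hypothetical.
function matchMyRule(rule, effect) {
  return effect.kind === 'myRule' && effect.detail === rule.expect;
}

// Step 3: wire the matcher into the per-rule switch.
function matchSingleRule(rule, effect) {
  switch (rule.kind) {
    case 'storageWrite':
      return effect.kind === 'storageWrite' && effect.key === rule.key;
    case 'myRule':
      return matchMyRule(rule, effect);
    default:
      return false; // unknown kinds never match
  }
}
```

Step 4 is then a matter of asserting both a matching and a non-matching effect for the new kind.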
Custom probe behavior
Probe is designed around the config file. To change which elements it records, modify the trigger detection logic in cli/src/probe.mjs. The probe doesn’t depend on any other pipeline stage — it’s a standalone crawler.
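As an illustration of what "trigger detection logic" could mean, here is a hypothetical predicate over plain element descriptors. It is not the code in `cli/src/probe.mjs`, which works on live Playwright elements. The descriptor shape (`tag`, `role`, `type`) and the tag and role sets are assumptions derived from the trigger types listed under Core concepts.

```javascript
// Hypothetical element descriptor: { tag, role, type }. Plain objects keep
// this sketch runnable without a browser.
const TRIGGER_TAGS = new Set(['button', 'a', 'input', 'dialog']);
const TRIGGER_ROLES = new Set(['button', 'link', 'checkbox', 'dialog']);

function isTrigger(el) {
  // An explicit interactive ARIA role wins, whatever the tag is.
  if (el.role && TRIGGER_ROLES.has(el.role)) return true;
  // Hidden inputs are not something a user can find or operate.
  if (el.tag === 'input' && el.type === 'hidden') return false;
  return TRIGGER_TAGS.has(el.tag);
}
```

Under this sketch, recording `<select>` elements as well would be a one-line change: `TRIGGER_TAGS.add('select')`.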