Architecture
Pipeline
Every retrieval follows this fixed sequence:
Task Text + Role Overlay
        ↓
Query Builder         ← overlay vocabulary expansion
        ↓
┌────────────────┐
│ Lexical (BM25) │    ← SQLite FTS5
│ Semantic       │    ← Ollama embeddings (optional)
└────────────────┘
        ↓
Merge + Dedupe        ← union candidate pool
        ↓
Metadata Filter       ← forbidden sources, role exclusion, trust boosts
        ↓
Role Reranker         ← overlay-weighted scoring + source diversity
        ↓
Bundle Assembly       ← governed output with full audit trail

Storage Layer
SQLite + FTS5 via better-sqlite3.
Four tables, the last of which is optional:
- documents: source metadata (trust tier, domain, freshness)
- chunks: content with tags and applicable/excluded roles
- chunks_fts: BM25 full-text index (porter stemming, unicode)
- embeddings: optional vector storage for semantic search
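To make the table layout concrete, here is a minimal sketch of one way these tables could be declared. Only the table names and the porter/unicode tokenizer come from the description above; every column name, the `open_store`, `add_chunk` and `search` helpers, and the choice to mirror `chunks.id` as the FTS rowid are assumptions for illustration. The sketch uses Python's built-in `sqlite3` module rather than better-sqlite3 so that it runs standalone, and it needs an SQLite build with FTS5 enabled.

```python
import sqlite3

# Hypothetical schema: table names follow the text, columns are assumed.
SCHEMA = """
CREATE TABLE documents (
    id         INTEGER PRIMARY KEY,
    source     TEXT NOT NULL,
    trust_tier TEXT,
    domain     TEXT,
    updated_at TEXT
);
CREATE TABLE chunks (
    id               INTEGER PRIMARY KEY,
    document_id      INTEGER NOT NULL REFERENCES documents(id),
    content          TEXT NOT NULL,
    tags             TEXT,
    applicable_roles TEXT,
    excluded_roles   TEXT
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    content,
    tokenize = 'porter unicode61'
);
CREATE TABLE embeddings (
    chunk_id INTEGER PRIMARY KEY REFERENCES chunks(id),
    vector   BLOB NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open a database and create the four tables."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_chunk(db, document_id, content):
    """Insert a chunk and index it under the same rowid in the FTS table."""
    cur = db.execute(
        "INSERT INTO chunks (document_id, content) VALUES (?, ?)",
        (document_id, content),
    )
    db.execute(
        "INSERT INTO chunks_fts (rowid, content) VALUES (?, ?)",
        (cur.lastrowid, content),
    )
    return cur.lastrowid


def search(db, fts_query, limit=5):
    """Return (chunk_id, bm25) pairs; bm25() is lower for better matches."""
    rows = db.execute(
        "SELECT rowid, bm25(chunks_fts) FROM chunks_fts "
        "WHERE chunks_fts MATCH ? ORDER BY bm25(chunks_fts) LIMIT ?",
        (fts_query, limit),
    )
    return rows.fetchall()
```

Because of porter stemming, a query for `rotate` would match a chunk containing "Rotating", which is the behavior the lexical stage relies on.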
Query Builder
The query builder expands task text using the overlay’s vocabulary:
- Extract key terms from task (stop-word removal)
- Inject role-signature phrases (top 3 boost phrases regardless of task overlap)
- Add task-relevant boost phrases
- Expand synonyms from overlay vocabulary
- Build FTS5 query (OR-joined for BM25 ranking)
- Build semantic query with role mission context
The signature phrase injection is what makes roles retrieve domain-relevant material even on generic tasks.
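The six steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's query builder: the `Overlay` fields, the stop-word list, the word-overlap test for "task-relevant", and the `build_queries` name are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Overlay:
    """Hypothetical overlay shape; field names are assumed."""
    mission: str
    boost_phrases: list[str]  # ordered, strongest first
    synonyms: dict[str, list[str]] = field(default_factory=dict)


# Assumed stop-word list, deliberately tiny.
STOP_WORDS = {"a", "an", "the", "to", "for", "of", "and", "in", "on", "our"}


def build_queries(task: str, overlay: Overlay) -> tuple[str, str]:
    # 1. Key terms: lowercase words minus stop words, order preserved.
    words = re.findall(r"[a-z0-9]+", task.lower())
    terms = [w for w in words if w not in STOP_WORDS]

    # 2. Role-signature phrases: top 3 boost phrases, regardless of overlap.
    signature = overlay.boost_phrases[:3]

    # 3. Other boost phrases only when they share a word with the task.
    relevant = [
        p for p in overlay.boost_phrases[3:]
        if set(p.lower().split()) & set(terms)
    ]

    # 4. Synonyms for any key term the overlay knows about.
    expanded = [s for t in terms for s in overlay.synonyms.get(t, [])]

    # 5. FTS5 query: each part quoted, de-duplicated in order, joined by OR.
    parts = list(dict.fromkeys(terms + signature + relevant + expanded))
    fts = " OR ".join('"' + p.replace('"', '""') + '"' for p in parts)

    # 6. Semantic query: task text with the role mission as context.
    semantic = f"{overlay.mission}\n\nTask: {task}"
    return fts, semantic
```

With a security-flavored overlay whose first three boost phrases are "threat model", "least privilege" and "audit log", the task "Review the login flow" would still carry those three phrases into the FTS5 query even though none of them appears in the task.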
Metadata Filter
This is where overlay governance becomes retrieval behavior:
| Check | Effect |
|---|---|
| Forbidden source | Hard reject with forbidden_source reason |
| Excluded role | Hard reject with role_mismatch reason |
| Applicable role match | 1.3x boost |
| Role mismatch | Penalty (configurable via role_mismatch_penalty) |
| Stale content | Penalty (configurable via stale_penalty) |
| Trust tier | Multiplicative boost from overlay config |
| Document type | Multiplicative boost from overlay config |
| Preferred source | Multiplicative boost with stated reason |
Every rejected candidate gets a recorded reason — no silent filtering.
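A minimal sketch of how the checks in the table might compose. The reason codes, the 1.3x applicable-role boost, and the `role_mismatch_penalty` and `stale_penalty` names come from the table; the `Candidate` shape, the `POLICY` values, and the assumption that penalties are multiplicative factors below 1 are hypothetical. Document-type and preferred-source boosts are omitted here and would presumably follow the same pattern as the trust-tier boost.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical candidate shape for illustration."""
    chunk_id: str
    source: str
    trust_tier: str
    applicable_roles: list[str] = field(default_factory=list)
    excluded_roles: list[str] = field(default_factory=list)
    stale: bool = False


# Assumed values; a real overlay would supply these.
POLICY = {
    "forbidden_sources": {"scratchpad"},
    "role_mismatch_penalty": 0.7,
    "stale_penalty": 0.8,
    "trust_boosts": {"canonical": 1.2},
}


def apply_metadata_filter(candidates, role, policy=POLICY):
    """Return (kept, rejected); every rejection carries a coded reason."""
    kept, rejected = [], []
    for c in candidates:
        # Hard rejects come first and are always recorded.
        if c.source in policy["forbidden_sources"]:
            rejected.append((c.chunk_id, "forbidden_source"))
            continue
        if role in c.excluded_roles:
            rejected.append((c.chunk_id, "role_mismatch"))
            continue

        # Soft adjustments multiply into a single metadata score.
        score, reasons = 1.0, []
        if role in c.applicable_roles:
            score *= 1.3
            reasons.append("applicable_role")
        elif c.applicable_roles:
            score *= policy["role_mismatch_penalty"]
            reasons.append("role_mismatch_penalty")
        if c.stale:
            score *= policy["stale_penalty"]
            reasons.append("stale_penalty")
        boost = policy["trust_boosts"].get(c.trust_tier, 1.0)
        if boost != 1.0:
            score *= boost
            reasons.append("trust_tier:" + c.trust_tier)
        kept.append((c.chunk_id, score, reasons))
    return kept, rejected
```

Returning the rejected list alongside the kept list, instead of simply dropping candidates, is what lets the later bundle report every exclusion with its reason.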
Reranker
Transparent weighted model:
final = lexical * w1 + semantic * w2 + normalize(metadata) * w3 + vocab_hits * w4

Weights come from the overlay’s retrieval_policy. Default split: 0.3 / 0.3 / 0.2 / 0.2.
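The weighted model can be written out as a small function. This sketch assumes the default split maps to w1 through w4 in order, that the lexical, semantic and vocab_hits inputs are already scaled to [0, 1], and that `normalize` is a min-max scaling over an assumed metadata range; the source does not define `normalize`, so that part in particular is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Default split from the text: 0.3 / 0.3 / 0.2 / 0.2."""
    lexical: float = 0.3
    semantic: float = 0.3
    metadata: float = 0.2
    vocab: float = 0.2


def normalize(value, lo, hi):
    """Min-max scale into [0, 1] (assumed definition)."""
    if hi <= lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def final_score(lexical, semantic, metadata, vocab_hits,
                w=Weights(), metadata_range=(0.0, 2.0)):
    """Combine the four signals; metadata_range is a hypothetical bound."""
    return (
        lexical * w.lexical
        + semantic * w.semantic
        + normalize(metadata, *metadata_range) * w.metadata
        + vocab_hits * w.vocab
    )
```

When semantic search is disabled, passing `semantic=0.0` simply removes that term, so under the default weights the highest reachable score would be 0.7 rather than 1.0.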
Source diversity pressure: when require_source_diversity is true, additional chunks from the same source are penalized after max_chunks_per_source is reached. This prevents a single source from filling the whole bundle.
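One way the diversity pressure described above could work is sketched below. The `require_source_diversity` and `max_chunks_per_source` names come from the text; the multiplicative `penalty` parameter, its 0.5 default, and the `apply_source_diversity` name are assumptions.

```python
def apply_source_diversity(ranked, max_chunks_per_source=2, penalty=0.5,
                           require_source_diversity=True):
    """Penalize chunks beyond the per-source cap, then re-sort.

    ranked: list of (chunk_id, source, score), sorted best-first.
    """
    if not require_source_diversity:
        return list(ranked)

    seen = {}       # source -> chunks already counted
    adjusted = []
    for chunk_id, source, score in ranked:
        count = seen.get(source, 0)
        if count >= max_chunks_per_source:
            score *= penalty  # over the cap: demote, but do not drop
        seen[source] = count + 1
        adjusted.append((chunk_id, source, score))

    # sorted() is stable, so ties keep their original relative order.
    return sorted(adjusted, key=lambda r: r[2], reverse=True)
```

Penalizing instead of dropping keeps a third chunk from a dominant source eligible when nothing better exists, while letting a weaker chunk from another source overtake it.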
Bundle Contract
The RetrievalBundle is the governed output. It answers:
- What was searched — full query trace with overlay rules applied
- What was selected — scored chunks with reasons and overlay hits
- What was rejected — every excluded candidate with a coded reason
- Is the evidence trustworthy — provenance posture (strong/mixed/weak)
- Is the evidence fresh — freshness posture (fresh/mixed/stale)
- What went wrong — warning codes for degraded scenarios
This is what downstream consumers (Role OS dispatch, prompt builder) use. They never reach back into the corpus.
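The six questions map naturally onto a record type. The sketch below is a guess at the shape, not the actual RetrievalBundle definition: the field names, the `posture` rule (all flags true, none true, otherwise mixed), the `assemble_bundle` helper, and the `no_results` warning code are all hypothetical. Only the posture labels come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class SelectedChunk:
    chunk_id: str
    score: float
    reasons: list[str]
    overlay_hits: list[str]


@dataclass
class RejectedChunk:
    chunk_id: str
    reason: str  # coded reason, e.g. "forbidden_source"


@dataclass
class RetrievalBundle:
    query_trace: dict                 # what was searched
    selected: list[SelectedChunk]     # what was selected
    rejected: list[RejectedChunk]     # what was rejected
    provenance_posture: str           # strong / mixed / weak
    freshness_posture: str            # fresh / mixed / stale
    warnings: list[str] = field(default_factory=list)


def posture(flags, all_label, none_label):
    """Assumed rule: every flag true, no flag true, otherwise mixed."""
    if flags and all(flags):
        return all_label
    if not any(flags):
        return none_label
    return "mixed"


def assemble_bundle(query_trace, selected, rejected, trusted_flags, fresh_flags):
    """Build the bundle; an empty selection raises a hypothetical warning."""
    warnings = []
    if not selected:
        warnings.append("no_results")
    return RetrievalBundle(
        query_trace=query_trace,
        selected=selected,
        rejected=rejected,
        provenance_posture=posture(trusted_flags, "strong", "weak"),
        freshness_posture=posture(fresh_flags, "fresh", "stale"),
        warnings=warnings,
    )
```

Because the bundle carries the trace, the rejections, and both postures, a consumer such as a prompt builder can decide how much to trust the evidence without querying the corpus again.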