Architecture

db-cluster is a federated database cluster. Not a single database with plugins. Not a vector store with metadata. Not an AI wrapper.

An AI system should not query one flattened database. It should operate over a cluster of specialized truth stores, where each store preserves its native truth shape and the cluster exposes one coherent retrieval, provenance, and mutation surface.

| Store | Owns | Shape | Derivative? |
|---|---|---|---|
| Canonical | Entities: stable IDs, structured state | {id, kind, name, attributes, owner: 'canonical'} | No: owner truth |
| Artifact | Raw files: documents, source text, uploads | {id, filename, contentHash, mimeType, owner: 'artifact'} | No: owner truth |
| Index | Discoverability: full-text, metadata search | {id, sourceId, sourceStore, text, metadata, owner: 'index'} | Yes: rebuildable from canonical + artifact |
| Ledger | History: provenance events, mutation receipts | {id, action, actorId, subjectId, timestamp, owner: 'ledger'} | No: owner truth |
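
The four shapes in the table can be written down as a tagged union. This is only a sketch: the field names come from the table, but the field types, the interface names, and the `isDerivative` helper are assumptions made for illustration, not db-cluster's published types.

```typescript
// Field names follow the table; every field TYPE below is an assumption.
interface CanonicalEntity {
  id: string;
  kind: string;
  name: string;
  attributes: { [key: string]: unknown };
  owner: "canonical";
}

interface ArtifactRecord {
  id: string;
  filename: string;
  contentHash: string;
  mimeType: string;
  owner: "artifact";
}

interface IndexRecord {
  id: string;
  sourceId: string;
  sourceStore: "canonical" | "artifact"; // assumed: index rows point at owner stores
  text: string;
  metadata: { [key: string]: unknown };
  owner: "index";
}

interface LedgerEvent {
  id: string;
  action: string;
  actorId: string;
  subjectId: string;
  timestamp: string; // assumed to be an ISO 8601 string
  owner: "ledger";
}

type ClusterRecord = CanonicalEntity | ArtifactRecord | IndexRecord | LedgerEvent;

// Hypothetical helper: only index records are derivative, so the `owner`
// tag alone answers the "Derivative?" column.
function isDerivative(record: ClusterRecord): boolean {
  return record.owner === "index";
}
```

Because every shape carries a literal `owner` tag, code that receives a `ClusterRecord` can narrow on that tag instead of guessing the store from the fields present.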

The index store can be destroyed and rebuilt from canonical + artifact truth without losing any cluster state. It is the only derivative store. db-cluster rebuild index produces an identical index from owned stores; db-cluster verify confirms the rebuild is loss-free.

This is the load-bearing law. Indexes can lie: they may go stale, they may be wrong about a name, they may miss a record. But canonical + artifact + ledger can rebuild any index from scratch.
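
The rebuild guarantee amounts to saying that the index is a pure function of owner truth. The sketch below illustrates that idea under simplifying assumptions: the record shapes are trimmed down, `rebuildIndex` is a hypothetical name, and the way text is derived from an entity or artifact is invented for the example. It does not show how `db-cluster rebuild index` is implemented.

```typescript
// Hypothetical, trimmed-down owner records (field types are assumptions).
interface EntityRecord {
  id: string;
  name: string;
  attributes: { [key: string]: string };
}
interface ArtifactRecord {
  id: string;
  filename: string;
  mimeType: string;
}
interface IndexEntry {
  id: string;
  sourceId: string;
  sourceStore: "canonical" | "artifact";
  text: string;
}

// Derive index entries from owner truth only. Attribute keys and the final
// list are sorted, so the result does not depend on input order.
function rebuildIndex(entities: EntityRecord[], artifacts: ArtifactRecord[]): IndexEntry[] {
  const fromEntities = entities.map((e): IndexEntry => {
    const values = Object.keys(e.attributes)
      .sort()
      .map((key) => String(e.attributes[key]));
    return {
      id: `canonical:${e.id}`,
      sourceId: e.id,
      sourceStore: "canonical",
      text: [e.name].concat(values).join(" "),
    };
  });
  const fromArtifacts = artifacts.map((a): IndexEntry => ({
    id: `artifact:${a.id}`,
    sourceId: a.id,
    sourceStore: "artifact",
    text: `${a.filename} ${a.mimeType}`,
  }));
  return fromEntities
    .concat(fromArtifacts)
    .sort((x, y) => (x.id < y.id ? -1 : x.id > y.id ? 1 : 0));
}
```

Since nothing in the function reads a previous index, throwing the index away and calling it again yields the same entries, which is the property a verify step would check.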

The kernel routes operations to the correct store. It never holds truth itself.

┌────────────────────────────────────────────────────┐
│ ClusterKernel                                      │
│                                                    │
│ find     → index → resolve → canonical/artifact    │
│ retrieve → index + canonical + artifact            │
│ trace    → ledger                                  │
│ mutate   → command lifecycle → canonical/artifact  │
│ receipt  → ledger                                  │
└────────────────────────────────────────────────────┘

The kernel enforces:

  • Retrieval always resolves to owner truth — index records are never returned as final answers.
  • Mutations always cross a command boundary — no direct store writes.
  • Provenance is always emitted — every write produces a ledger event.
  • Receipts prove operations — every committed mutation gets a receipt.
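
The invariants above can be illustrated with a toy kernel. Everything here is hypothetical: `SketchKernel`, `RenameCommand`, the id formats, and the in-memory stores are invented for the example, and the real ClusterKernel is not part of the public surface. The point is only the shape of the rules: `find` uses the index to discover candidates but returns owner records, and `mutate` accepts a typed command, writes a ledger event, and hands back a receipt.

```typescript
interface OwnerRecord { id: string; name: string }
interface IndexHit { sourceId: string; text: string }
interface LedgerEntry { id: string; action: string; subjectId: string }
interface Receipt { id: string; eventId: string }
interface RenameCommand { type: "rename"; subjectId: string; name: string }

class SketchKernel {
  private owners = new Map<string, OwnerRecord>();
  private index: IndexHit[] = [];
  readonly ledger: LedgerEntry[] = [];

  // Test helper: load one owner record plus the text the index holds for it.
  seed(record: OwnerRecord, indexedText: string): void {
    this.owners.set(record.id, record);
    this.index.push({ sourceId: record.id, text: indexedText });
  }

  // Discover through the index, then resolve to owner truth.
  // Index hits are never returned; hits with no owner are dropped.
  find(term: string): OwnerRecord[] {
    const results: OwnerRecord[] = [];
    for (const hit of this.index) {
      if (hit.text.indexOf(term) === -1) continue;
      const owner = this.owners.get(hit.sourceId);
      if (owner !== undefined) results.push(owner);
    }
    return results;
  }

  // The only write path: a typed command in, a ledger event and receipt out.
  mutate(command: RenameCommand): Receipt {
    const current = this.owners.get(command.subjectId);
    if (current === undefined) throw new Error(`unknown subject ${command.subjectId}`);
    this.owners.set(command.subjectId, { id: current.id, name: command.name });
    const event: LedgerEntry = {
      id: `evt-${this.ledger.length + 1}`,
      action: command.type,
      subjectId: command.subjectId,
    };
    this.ledger.push(event);
    return { id: `rcpt-${this.ledger.length}`, eventId: event.id };
  }
}
```

Note that `mutate` deliberately leaves the index untouched in this sketch. A later `find` on the old text still returns the renamed owner record, which shows why a stale index cannot leak into a final answer.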

Every object in the cluster has a URI:

cluster://canonical/<entity-id>
cluster://artifact/<artifact-id>
cluster://index/<record-id>
cluster://ledger/<event-id>
cluster://receipt/<receipt-id>

URIs identify the owner store. Resolving a URI always returns owner truth — never an index projection.
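
A small parser makes the URI scheme concrete. This is a sketch based only on the five patterns listed above; `parseClusterUri`, `ClusterUri`, and the error messages are hypothetical names, and the rule that an id may itself contain slashes is an assumption.

```typescript
type UriKind = "canonical" | "artifact" | "index" | "ledger" | "receipt";

interface ClusterUri {
  kind: UriKind;
  id: string;
}

const URI_PREFIX = "cluster://";
const URI_KINDS: UriKind[] = ["canonical", "artifact", "index", "ledger", "receipt"];

// Split cluster://<kind>/<id> into its parts, rejecting anything else.
function parseClusterUri(uri: string): ClusterUri {
  if (uri.indexOf(URI_PREFIX) !== 0) {
    throw new Error(`not a cluster URI: ${uri}`);
  }
  const rest = uri.slice(URI_PREFIX.length);
  const slash = rest.indexOf("/");
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error(`cluster URI needs both a store and an id: ${uri}`);
  }
  const rawKind = rest.slice(0, slash);
  const id = rest.slice(slash + 1); // assumption: the id is everything after the first slash
  const kind = URI_KINDS.filter((candidate) => candidate === rawKind)[0];
  if (kind === undefined) {
    throw new Error(`unknown store in cluster URI: ${rawKind}`);
  }
  return { kind, id };
}
```

For example, `parseClusterUri("cluster://canonical/ent-42")` yields `{ kind: "canonical", id: "ent-42" }`, while a URI naming an unlisted store throws.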

| Misreading | Reality |
|---|---|
| RAG pipeline | Retrieval resolves to owner truth, not vector similarity. |
| AI memory layer | Entities have structured state, not conversation history. |
| SQL assistant | Mutations require typed commands, not natural language. |
| Vector database | Index is derivative; the cluster owns structured truth. |
| Governance middleware | Policy and provenance are native, not bolted on. |

Stores have logical contracts and physical backends:

  • The canonical store contract is the same whether backed by local JSON or Postgres.
  • Physical backends implement store law — they do not become the product center.
  • Backend choice is invisible to the kernel, SDK, MCP, and CLI.
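
The split between logical contract and physical backend can be sketched with one interface and two interchangeable implementations. All names here (`CanonicalStoreContract`, `MapBackend`, `JsonTextBackend`, `renameEntity`) are hypothetical, and the contract is kept synchronous for brevity; a network-backed store would presumably return promises.

```typescript
interface EntityRow { id: string; kind: string; name: string }

// Hypothetical logical contract: callers see only these two operations.
interface CanonicalStoreContract {
  put(entity: EntityRow): void;
  get(id: string): EntityRow | undefined;
}

// Backend 1: rows held in a Map.
class MapBackend implements CanonicalStoreContract {
  private rows = new Map<string, EntityRow>();
  put(entity: EntityRow): void {
    this.rows.set(entity.id, { ...entity });
  }
  get(id: string): EntityRow | undefined {
    const row = this.rows.get(id);
    return row === undefined ? undefined : { ...row };
  }
}

// Backend 2: rows serialized into a JSON string, standing in for a file.
class JsonTextBackend implements CanonicalStoreContract {
  private text = "{}";
  put(entity: EntityRow): void {
    const all = JSON.parse(this.text) as { [id: string]: EntityRow };
    all[entity.id] = entity;
    this.text = JSON.stringify(all);
  }
  get(id: string): EntityRow | undefined {
    const all = JSON.parse(this.text) as { [id: string]: EntityRow };
    return all[id];
  }
}

// Caller code depends on the contract only, never on a concrete backend.
function renameEntity(store: CanonicalStoreContract, id: string, name: string): EntityRow {
  const current = store.get(id);
  if (current === undefined) throw new Error(`no entity ${id}`);
  const next: EntityRow = { id: current.id, kind: current.kind, name };
  store.put(next);
  return next;
}
```

`renameEntity` behaves identically against either backend, which is the sense in which backend choice stays invisible to the layers above the store.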

Currently supported:

  • Local (JSON files) — all four stores.
  • Postgres — canonical store only (artifact / index / ledger remain local).

The Postgres adapter attaches a pool.on('error', …) handler, so an idle-client RST does not crash the process, and uses INSERT … ON CONFLICT to close the TOCTOU window on concurrent imports. Like the local store, it is append-a-version: each entity id owns one immutable row per version. db-cluster does not configure SSL/TLS in v1.0.0. The connection is plaintext unless your connection string enforces TLS (e.g. sslmode=require, which the pg driver honours); otherwise, put the connection behind a TLS proxy or keep it on a private network. Driver-managed ssl config is planned for a future release.
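
The append-a-version rule and the conflict handling can be illustrated without a database. The sketch below is an in-memory stand-in, not the adapter's code: `VersionedTable` and its methods are hypothetical, and treating a duplicate (id, version) as a no-op mirrors an `ON CONFLICT … DO NOTHING` clause, which is an assumption since the text above does not say which conflict action the adapter uses.

```typescript
interface VersionRow { id: string; version: number; name: string }

class VersionedTable {
  private byId = new Map<string, Readonly<VersionRow>[]>();

  // Insert a row unless (id, version) already exists. Check and write happen
  // in one call, so a second import of the same version cannot overwrite
  // the first. Returns true when the row was stored.
  insertIfAbsent(row: VersionRow): boolean {
    const versions = this.byId.get(row.id) ?? [];
    if (versions.some((existing) => existing.version === row.version)) {
      return false;
    }
    versions.push(Object.freeze({ ...row })); // stored rows are immutable
    this.byId.set(row.id, versions);
    return true;
  }

  // The current state of an entity is its highest stored version.
  latest(id: string): Readonly<VersionRow> | undefined {
    const versions = this.byId.get(id) ?? [];
    return versions.reduce<Readonly<VersionRow> | undefined>(
      (best, row) => (best === undefined || row.version > best.version ? row : best),
      undefined,
    );
  }

  // Number of versions kept for an entity; earlier rows are never deleted.
  versionCount(id: string): number {
    return (this.byId.get(id) ?? []).length;
  }
}
```

Updating an entity means inserting version n + 1 rather than editing version n, so history stays readable and two importers racing on the same version resolve to whichever row landed first.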

         CLI / SDK / MCP          ← surfaces (operator, developer, AI-agent)
         PolicyEnforcedKernel     ← policy + redaction (the root's createSafeCluster handle)
         ClusterKernel            ← routing, retrieval, mutation lifecycle
    ┌──────────┼─────────┬─────────┐
    │          │         │         │
Canonical   Artifact   Index     Ledger    ← stores
(Postgres   (local)    (local)   (local)
or local)

db-cluster is policy-enforced by default. The package root factory createSafeCluster() returns a handle whose only door to cluster truth is a PolicyEnforcedKernel — there is no policy-bypass code path through the default public surface, and the raw ClusterKernel class is not exported. Raw, unpoliced stores are reachable only via the explicit @mcptoolshop/db-cluster/unsafe escape hatch (operator tooling and tests), which deliberately bypasses policy/receipts/provenance.

  • Mutation lifecycle: propose → validate → approve → commit → (compensate). See SDK reference.
  • Provenance graph: kernel.traceObject(uri) returns a ProvenanceGraph with nodes (entity, artifact, index_record, provenance_event, receipt, command, evidence_bundle) and edges (11 variants).
  • Policy: Principal + Capability + Policy + TrustZone + VisibilityRule. See Policy & Redaction.
  • Redaction: applied at every read path through PolicyEnforcedKernel. Marker types are explicit; the redactor uses an allowlist, not a denylist.
  • Typed errors: ClusterError base + per-class subclasses, each with code / remediationHint / retryable. CLI exit codes map to sysexits.h.
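
The mutation lifecycle listed above reads naturally as a small state machine. The sketch below is one way to model it, with assumptions stated plainly: the state names are derived from the verbs in the lifecycle, `compensated` is treated as reachable only from `committed`, and `CommandState`, `NEXT_STATES`, and `advance` are hypothetical names rather than SDK exports.

```typescript
type CommandState = "proposed" | "validated" | "approved" | "committed" | "compensated";

// Allowed forward transitions. Compensation is optional, so "committed"
// is a valid resting state that may later move to "compensated".
const NEXT_STATES: { [S in CommandState]: CommandState[] } = {
  proposed: ["validated"],
  validated: ["approved"],
  approved: ["committed"],
  committed: ["compensated"],
  compensated: [],
};

// Move a command to the next state, refusing any skipped or backward step.
function advance(from: CommandState, to: CommandState): CommandState {
  if (NEXT_STATES[from].indexOf(to) === -1) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}
```

Encoding the transitions as data keeps the rule "no commit without validation and approval" in one place, so a caller cannot jump from `proposed` straight to `committed`.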