Skip to content

Operations

Tool Compass runs in two transports: stdio (the default — for Claude Desktop, Cursor, Continue.dev, any MCP client that launches it as a subprocess) and streamable-http (set PORT=N — for Fly.io, Kubernetes, or any remote deployment).

HTTP mode adds three HTTP routes alongside the JSON-RPC transport. This page covers how to use them.

Terminal window
curl http://localhost:8080/health
# {"status": "ok"}

Always returns 200 while the process is up. Wire this to your load balancer’s liveness probe — it answers “is the process alive”, nothing more.

Terminal window
curl http://localhost:8080/ready

Returns 200 only when ALL of the following are true:

  • CompassIndex is loaded with a working SQLite connection.
  • Ollama is reachable (or the circuit breaker is closed).
  • At least one configured backend is connected.

If any check fails, returns 503 with a JSON breakdown:

{
"status": "not_ready",
"checks": {
"index": {"ok": true},
"ollama": {"ok": false, "reason": "circuit_breaker_open"},
"backends": {"ok": true, "connected": 3, "configured": 3}
}
}

Wire this to your readiness probe (not liveness) so a pod that can’t serve queries gets taken out of rotation but not restarted. Result is cached for 30 seconds to avoid DoS-ing Ollama from load-balancer polling.

Terminal window
curl http://localhost:8080/metrics

Returns text/plain; version=0.0.4 Prometheus format. Sampled metrics:

MetricTypeNotes
tool_compass_search_totalcounterTotal compass() calls (from analytics, 24h window)
tool_compass_ollama_availablegauge1 if embedder circuit breaker is closed, else 0
tool_compass_backend_up{name}gauge1 per connected backend
tool_compass_backend_call_total{name,status}counterPer-backend success/error counts
tool_compass_embed_latency_p95_msgaugep95 of last 1000 embed calls
tool_compass_embed_failures_totalcounterTotal embed failures
tool_compass_index_age_secondsgaugeSeconds since last compass_sync
tool_compass_orphaned_vectorsgaugeHNSW vectors without a matching SQLite row

No prometheus_client dependency — the text is built by the gateway itself.

Every @mcp.tool() handler generates an 8-char trace_id (uuid4[:8]) on entry. It’s threaded through:

  • Every logger.info/warning/error call in that request.
  • The trace_id field in the success response envelope.
  • The trace_id field in the error response envelope.
  • Analytics rows where the underlying function signature supports it.

When a user reports “my compass() call failed”, ask them for the trace_id and grep your logs. Example error envelope:

{
"error": "Backend unreachable: kubernetes",
"trace_id": "9f3a1c7b",
"hint": "Run tool-compass doctor to check backend config."
}

Tool Compass prefers degraded over down whenever possible.

If embed_query() fails (network, timeout, breaker open), compass() falls back to SQLite LIKE lexical search over tool names + descriptions. Results are marked degraded: true and the envelope includes a warnings[] array:

{
"matches": [
{"tool": "bridge:read_file", "degraded": true, "confidence": 0.0}
],
"warnings": [
"Semantic search unavailable: Ollama at http://localhost:11434 is unreachable. Try: ollama serve. Showing keyword-based results."
],
"trace_id": "a1b2c3d4"
}

The embedder circuit breaker opens after 3 consecutive failures and stays open for 30 seconds — subsequent calls fast-fail into the fallback instead of each eating a 30-second Ollama timeout.

If the SQLite analytics DB errors (locked, disk full, corrupted schema), analytics writes log a warning once and set a _degraded flag — the primary tool-discovery path keeps working. Surface the state:

from analytics import get_analytics
analytics = get_analytics()
health = analytics.get_health() # {"degraded": bool, "reason": str | None}

If the HNSW file on disk was built with a different embedding_dim (e.g. you switched models), load_index() raises a RuntimeError that names the path and advises deletion:

Index file at db/compass.hnsw uses 1024-dim vectors but code expects 768.
The embedding model likely changed. Delete db/compass.hnsw and run
`tool-compass sync` to rebuild.

The published image supports multi-arch:

Terminal window
docker pull ghcr.io/mcp-tool-shop-org/tool-compass:v2.2.0
# linux/amd64 + linux/arm64 from the same tag

Same tag runs on x86_64 servers and Apple Silicon / ARM workstations without emulation.

See the fly.toml at the repo root. Typical deployment:

Terminal window
fly launch --image ghcr.io/mcp-tool-shop-org/tool-compass:v2.2.0
fly secrets set OLLAMA_URL=https://your-ollama-host
fly deploy

Wire /ready to Fly’s health check so pods that can’t actually serve queries get pulled out of the pool without flapping.