Features

Ollama is automatically started if it isn’t running. The TranslateGemma model is automatically pulled if it isn’t installed. Zero manual setup required after initial Ollama installation.
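A minimal sketch of how that startup check could look. The helper names (`isModelInstalled`, `ensureReady`) and the use of `ollama list`/`ollama pull` subprocesses are assumptions for illustration, not the project's actual implementation:

```typescript
import { execFile, spawn } from "node:child_process";
import { promisify } from "node:util";

const execFileP = promisify(execFile);

// Parse `ollama list` output to check whether a model is installed.
// The first line is a header; model names sit in the first column.
export function isModelInstalled(listOutput: string, model: string): boolean {
  return listOutput
    .split("\n")
    .slice(1)
    .some((line) => line.split(/\s+/)[0]?.startsWith(model));
}

// Hypothetical startup helper: start Ollama if it isn't running,
// then pull the model if it isn't installed.
export async function ensureReady(model: string): Promise<void> {
  try {
    await fetch("http://localhost:11434/api/tags"); // is the server up?
  } catch {
    spawn("ollama", ["serve"], { detached: true, stdio: "ignore" }).unref();
  }
  const { stdout } = await execFileP("ollama", ["list"]);
  if (!isModelInstalled(stdout, model)) {
    await execFileP("ollama", ["pull", model]);
  }
}
```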

Transient Ollama failures (network blips, temporary overload) are automatically retried up to 2 times with exponential backoff (1s, 2s). Non-retryable errors (bad model name, invalid input) fail immediately.
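The retry policy above can be sketched like this. `withRetry` and its parameters are illustrative names, not the project's actual API; the delay schedule (1s, 2s) matches the description:

```typescript
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Up to `maxRetries` retries with exponential backoff; only errors the
// caller classifies as retryable are retried, everything else rethrows.
export async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxRetries = 2,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 1s, then 2s
    }
  }
}
```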

Long text is split at natural boundaries — paragraphs, then sentences — so translation context is preserved. Chunk sizes adapt to the model:

| Model   | Chunk size |
| ------- | ---------- |
| 2B / 4B | 2K chars   |
| 12B     | 4K chars   |
| 27B     | 6K chars   |
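A sketch of boundary-aware chunking under those budgets. The splitting heuristics and the `chunkText` name are assumptions for illustration; the size table matches the one above:

```typescript
// Chunk budget per model size (2K/4K/6K chars, per the table above).
export const CHUNK_SIZE: Record<string, number> = {
  "2b": 2000, "4b": 2000, "12b": 4000, "27b": 6000,
};

// Split on paragraph breaks first; fall back to sentence boundaries
// only when a single paragraph exceeds the budget on its own.
export function chunkText(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  const push = () => {
    if (current.trim()) chunks.push(current.trim());
    current = "";
  };
  for (const para of text.split(/\n{2,}/)) {
    if (current.length + para.length + 2 > maxChars) push();
    if (para.length > maxChars) {
      for (const sentence of para.split(/(?<=[.!?])\s+/)) {
        if (current.length + sentence.length + 1 > maxChars) push();
        current += sentence + " ";
      }
    } else {
      current += para + "\n\n";
    }
  }
  push();
  return chunks;
}
```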

Translated segments are cached by content hash (SHA-256 of source text + target language + model). Unchanged segments skip re-translation entirely. Cache lives in .polyglot-cache.json with a 30-day TTL.
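The cache-key scheme can be sketched as below. The entry shape, the field separator, and the helper names are assumptions; the hashed inputs (source text, target language, model) and the 30-day TTL match the description:

```typescript
import { createHash } from "node:crypto";

interface CacheEntry {
  translation: string;
  createdAt: number; // epoch ms
}

const TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// SHA-256 over source text + target language + model, NUL-separated so
// field boundaries can't collide.
export function cacheKey(source: string, targetLang: string, model: string): string {
  return createHash("sha256")
    .update(source).update("\u0000")
    .update(targetLang).update("\u0000")
    .update(model)
    .digest("hex");
}

export function isFresh(entry: CacheEntry, now = Date.now()): boolean {
  return now - entry.createdAt < TTL_MS;
}
```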

A built-in glossary of 12 technical terms (API, CLI, SDK, etc.) ensures consistent translation of software terminology. Custom glossary entries can be passed per-request and are merged with the defaults.
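The merge semantics are simple to sketch: per-request entries win over the built-in defaults. The default list here is abbreviated and illustrative, not the project's full 12-term glossary:

```typescript
// Abbreviated stand-in for the built-in glossary of technical terms.
const DEFAULT_GLOSSARY: Record<string, string> = {
  API: "API",
  CLI: "CLI",
  SDK: "SDK",
};

// Per-request entries override defaults on key collision.
export function mergeGlossary(
  custom: Record<string, string> = {},
): Record<string, string> {
  return { ...DEFAULT_GLOSSARY, ...custom };
}
```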

translateBatch groups multiple segments into a single prompt where possible, reducing round-trips. If the batch separator is mangled in the model's output, it falls back to translating each segment individually.
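The batching idea can be sketched as follows: join segments with an unlikely separator, translate once, then split the output; a segment-count mismatch signals a mangled separator and triggers the per-segment fallback. The separator token and function names are illustrative assumptions:

```typescript
// An unlikely token the model is asked to preserve between segments.
export const SEP = "\n<<<SEG>>>\n";

export function buildBatchPrompt(segments: string[]): string {
  return segments.join(SEP);
}

// Returns null when the separator was mangled (wrong segment count),
// signalling the caller to fall back to individual translation.
export function splitBatchOutput(output: string, expected: number): string[] | null {
  const parts = output.split(SEP).map((s) => s.trim());
  return parts.length === expected ? parts : null;
}
```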

Every translation is automatically validated:

  • Empty output throws (retryable)
  • Source-text echo is flagged
  • Severe truncation and hallucination blowup trigger warnings
  • Garbled encoding and model meta-commentary are detected
  • Warnings appear in the MCP tool response
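The checks above can be sketched as a single validation pass. The thresholds and heuristics below are assumptions chosen for illustration, not the project's actual values; only the empty-output case is a hard (retryable) failure, everything else is a warning:

```typescript
export function validateTranslation(source: string, output: string): string[] {
  if (!output.trim()) {
    throw new Error("empty translation output"); // retryable upstream
  }
  const warnings: string[] = [];
  if (output.trim() === source.trim()) {
    warnings.push("output echoes source text");
  }
  if (output.length < source.length * 0.3) {
    warnings.push("possible severe truncation");
  }
  if (output.length > source.length * 4) {
    warnings.push("possible hallucination blowup");
  }
  if (/\uFFFD/.test(output)) {
    warnings.push("garbled encoding (replacement characters)");
  }
  if (/^(here is|i have translated|as an ai)/i.test(output.trim())) {
    warnings.push("model meta-commentary detected");
  }
  return warnings;
}
```

Returned warnings would then be attached to the MCP tool response rather than failing the request.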

OllamaClient.generateStream() yields tokens via NDJSON as Ollama produces them. The translate() function accepts an onToken callback for real-time progress display.
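Ollama's streaming endpoint emits one JSON object per line, each carrying a `response` token and a `done` flag. A sketch of the NDJSON-parsing core (the generator name and input shape are illustrative; the real `generateStream()` would read these lines from the HTTP response body):

```typescript
export async function* parseNdjson(
  lines: AsyncIterable<string> | Iterable<string>,
): AsyncGenerator<string> {
  for await (const line of lines) {
    if (!line.trim()) continue;
    const obj = JSON.parse(line) as { response?: string; done?: boolean };
    if (obj.response) yield obj.response; // token for the onToken callback
    if (obj.done) return;                 // stream finished
  }
}
```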

All errors use PolyglotError with a machine-readable code (MODEL_NOT_FOUND, OLLAMA_UNAVAILABLE, TRANSLATION_FAILED, etc.), a human-readable message, an optional hint, and a retryable flag.
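The error shape described above, as a sketch. The field names match the text (code, message, hint, retryable); the constructor signature and the abbreviated code union are plausible assumptions:

```typescript
export type PolyglotErrorCode =
  | "MODEL_NOT_FOUND"
  | "OLLAMA_UNAVAILABLE"
  | "TRANSLATION_FAILED"; // abbreviated; the real union has more codes

export class PolyglotError extends Error {
  constructor(
    public readonly code: PolyglotErrorCode,  // machine-readable
    message: string,                          // human-readable
    public readonly retryable: boolean = false,
    public readonly hint?: string,            // optional remediation hint
  ) {
    super(message);
    this.name = "PolyglotError";
  }
}
```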