
Deployment

Vocal Synth Engine ships three deployment targets out of the box. They all wrap the same Express server (src/server/index.prod.ts) and the same bundled cockpit (apps/cockpit/dist/).

| Target | Manifest | Notes |
| --- | --- | --- |
| Docker | Dockerfile | Multi-stage build, non-root vsynth user, persistent volume at /data/renders. |
| Fly.io | fly.toml | Auto-stop machines, edge probe at /api/health, 1 GB volume mount. |
| Render | render.yaml | Docker runtime, starter plan, disk mount at /data/renders. |

The shipped Dockerfile is a multi-stage build on node:20.18-slim:

docker build -t vocal-synth-engine .
docker run -p 4321:4321 \
  -e AUTH_TOKEN="$(openssl rand -hex 32)" \
  -v "$(pwd)/renders:/data/renders" \
  vocal-synth-engine

The container:

  • Runs as a non-root user (vsynth).
  • Listens on :4321 by default (PORT env var to override).
  • Bakes presets from presets/ into the image.
  • Persists render artifacts to /data/renders (mount a volume).
  • Includes a HEALTHCHECK that hits /api/health every 30 s (wget --spider).

To pin to an immutable image digest, replace node:20.18-slim with node:20.18-slim@sha256:<digest>. See SCORECARD.md for the open digest-pin upgrade path.

To deploy to Fly.io, run the following from a clone of the repo:

fly launch --copy-config --no-deploy # if you haven't already created the app
fly secrets set AUTH_TOKEN=$(openssl rand -hex 32)
fly volumes create render_bank --region iad --size 1
fly deploy

Highlights from the shipped fly.toml:

  • app = "vocal-synth-cockpit", primary region iad.
  • internal_port = 4321, force_https = true.
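  • auto_stop_machines = "stop" + min_machines_running = 0 → idle compute cost is zero (the volume is still billed). The first request after a stop pays a cold start; grace_period = "10s" on the health check gives the machine time to boot before checks count against it.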
  • [[http_service.checks]] polls GET /api/health every 15 s (timeout 2 s).
  • [[mounts]] maps the render_bank volume to /data/renders — required for RENDER_STORE_DIR.
  • VM: shared-cpu-1x, 512 MB.

Bump up [[vm]] memory if you raise RENDER_QUEUE_MAX_DEPTH or run long renders.

Drop the included render.yaml into a Render Blueprint:

git push # Render auto-deploys on push to the configured branch

Highlights from the shipped render.yaml:

  • Service type web, Docker runtime, Ohio region, starter plan.
  • healthCheckPath: /api/health for zero-downtime deploys.
  • Disk render-bank mounted at /data/renders (1 GB).
  • AUTH_TOKEN is declared with sync: false — set it manually in the Render dashboard, never commit it.

The environment variables below are read by src/server/. Most have safe defaults; set the ones your deployment needs.

| Variable | Default | Purpose |
| --- | --- | --- |
| NODE_ENV | development | Set to production for the prod server. Affects logging and the auth open-mode warning. |
| PORT | 4321 | TCP port the server listens on. |
| APP_VERSION | from package.json | Surfaced in /api/health and /metrics. |
| GIT_COMMIT | unset | Optional. Surfaced in /api/health/detailed for traceability. |
| LOG_LEVEL | info | Pino log level: trace / debug / info / warn / error / fatal. |
| TRUST_PROXY | unset | Set to 1 (or a CIDR list) when behind Fly / Render / nginx so req.ip is correct. |

| Variable | Default | Purpose |
| --- | --- | --- |
| AUTH_TOKEN | unset | Single bearer token. When unset, all /api/* routes are open (warned at boot in production). |
| AUTH_KEYS | unset | Comma-separated id:token list for per-principal auth. Tokens are hashed once at boot. |
| AUTH_KEYS_FILE | unset | Path to a JSON file with the same id:token mapping. Useful for secret managers that mount as files. |
| ALLOWED_ORIGIN | unset | CORS Origin allowlist (comma-separated). Default is no cross-origin; set this only if a different origin needs API access. |

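To make the AUTH_KEYS format concrete, here is a minimal sketch of parsing an id:token list and matching a presented bearer token against it. The function names are hypothetical, and SHA-256 is an assumption: the documentation only says tokens are hashed once at boot, not which hash is used.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Assumption: SHA-256. The real server's hash function is not documented here.
function hashToken(token: string): Buffer {
  return createHash("sha256").update(token, "utf8").digest();
}

// Parse "alice:tokenA,bob:tokenB" into a map of principal id -> token hash.
function parseAuthKeys(raw: string): Map<string, Buffer> {
  const keys = new Map<string, Buffer>();
  for (const entry of raw.split(",")) {
    const trimmed = entry.trim();
    if (trimmed === "") continue;
    const sep = trimmed.indexOf(":");
    if (sep <= 0 || sep === trimmed.length - 1) {
      throw new Error("AUTH_KEYS entries must look like id:token");
    }
    keys.set(trimmed.slice(0, sep), hashToken(trimmed.slice(sep + 1)));
  }
  return keys;
}

// Return the principal id whose token matches, or null if none does.
function authenticate(keys: Map<string, Buffer>, presented: string): string | null {
  const candidate = hashToken(presented);
  let match: string | null = null;
  for (const [id, stored] of keys) {
    // Both buffers are 32-byte digests, so the length precondition always holds.
    if (timingSafeEqual(stored, candidate)) match = id;
  }
  return match;
}
```

Comparing fixed-length digests with timingSafeEqual avoids leaking how many leading characters of a guess were correct.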
| Variable | Default | Purpose |
| --- | --- | --- |
| RENDER_STORE_DIR | ./renders | Disk path where saved renders live. Must be a persistent volume in production. |
| RENDER_STORE_BUDGET_MB | 512 | Total disk budget for the render bank. New renders are rejected with RENDER_STORE_FULL when over. |
| RENDER_PER_BUDGET_MB | 64 | Per-render size cap. Larger renders are rejected with RENDER_TOO_LARGE. |
| MAX_RENDER_DURATION_SEC | 300 | Hard cap on score duration. |
| RENDER_QUEUE_MAX_DEPTH | 8 | Max concurrent + queued render jobs. Excess is rejected with RENDER_QUEUE_FULL. |
| MAX_NOTES_PER_SCORE | 8192 | Hard cap on score.notes.length. |

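The render limits above interact, so a small admission check may help when sizing them. This is a sketch only: the type names, the `admitRender` function, and the order of the checks are assumptions, while the three rejection codes are the ones listed in the table.

```typescript
type Rejection = "RENDER_QUEUE_FULL" | "RENDER_TOO_LARGE" | "RENDER_STORE_FULL";

interface RenderLimits {
  queueMaxDepth: number;     // RENDER_QUEUE_MAX_DEPTH
  perRenderBudgetMb: number; // RENDER_PER_BUDGET_MB
  storeBudgetMb: number;     // RENDER_STORE_BUDGET_MB
}

interface RenderState {
  activeAndQueuedJobs: number;
  storeUsedMb: number;
}

// Returns null when the render may proceed, otherwise the rejection code.
// Assumption: the real server may check these conditions in a different order.
function admitRender(limits: RenderLimits, state: RenderState, renderSizeMb: number): Rejection | null {
  if (state.activeAndQueuedJobs >= limits.queueMaxDepth) return "RENDER_QUEUE_FULL";
  if (renderSizeMb > limits.perRenderBudgetMb) return "RENDER_TOO_LARGE";
  if (state.storeUsedMb + renderSizeMb > limits.storeBudgetMb) return "RENDER_STORE_FULL";
  return null;
}

// Defaults taken from the table above.
const defaultLimits: RenderLimits = { queueMaxDepth: 8, perRenderBudgetMb: 64, storeBudgetMb: 512 };
```

With the defaults, the store holds at least eight maximum-size renders (512 / 64) before RENDER_STORE_FULL can trigger.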
| Variable | Default | Purpose |
| --- | --- | --- |
| RATE_LIMIT_RPM | 20 | Per-IP requests per minute on render + renders + phonemize. |
| RATE_LIMIT_MAX_IPS | 10000 | LRU cap on the per-IP counter map. |
| JSON_BODY_LIMIT | 1mb | Express body-parser size cap. |
| SLOW_REQUEST_MS | 1000 | Log threshold for slow_request warnings. |

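One way to picture how RATE_LIMIT_RPM and RATE_LIMIT_MAX_IPS work together is a per-IP counter map with LRU eviction. The sketch below uses a fixed one-minute window, which is an assumption; the server may use a sliding window or token bucket instead, and the class name is hypothetical.

```typescript
interface WindowEntry {
  start: number; // window start, in ms
  count: number; // requests admitted in this window
}

class PerIpRateLimiter {
  private readonly windows = new Map<string, WindowEntry>();
  private readonly rpm: number;
  private readonly maxIps: number;

  constructor(rpm: number, maxIps: number) {
    this.rpm = rpm;
    this.maxIps = maxIps;
  }

  // Returns true if a request from `ip` at time `nowMs` is allowed.
  allow(ip: string, nowMs: number): boolean {
    let entry = this.windows.get(ip);
    if (entry === undefined || nowMs - entry.start >= 60_000) {
      entry = { start: nowMs, count: 0 };
    }
    // Re-insert so the Map's insertion order doubles as LRU order.
    this.windows.delete(ip);
    this.windows.set(ip, entry);
    if (this.windows.size > this.maxIps) {
      const oldest = this.windows.keys().next().value;
      if (oldest !== undefined) this.windows.delete(oldest);
    }
    if (entry.count >= this.rpm) return false;
    entry.count += 1;
    return true;
  }

  get trackedIps(): number {
    return this.windows.size;
  }
}
```

The cap on tracked IPs bounds memory: once it is reached, the least recently seen address is forgotten and starts a fresh window on its next request.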
| Variable | Default | Purpose |
| --- | --- | --- |
| MAX_LIVE_SESSIONS | 4 | Concurrent /ws connections. |
| MAX_JAM_SESSIONS | 8 | Concurrent jam rooms. |
| MAX_JAM_PARTICIPANTS | 8 | Per-room participant cap. |
| MAX_PHONEMIZE_NOTES | 512 | Cap on notes.length per /api/phonemize request. |
| MAX_LYRICS_LENGTH | 4096 | Cap on lyric text length per request. |
| WS_PING_INTERVAL_MS | 30000 | Heartbeat ping interval for /ws + /ws/jam. |
| WS_MAX_MISSED_PINGS | 2 | Disconnect after this many missed pongs. |

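The two heartbeat settings can be read as a small state machine per connection. The following sketch shows one plausible interpretation; the class and method names are hypothetical, and the exact point at which the server counts a pong as missed is an assumption.

```typescript
class HeartbeatTracker {
  private missed = 0;
  private awaitingPong = false;
  private readonly maxMissed: number;

  constructor(maxMissed: number) {
    this.maxMissed = maxMissed;
  }

  // Call when a pong frame arrives from the client.
  onPong(): void {
    this.awaitingPong = false;
    this.missed = 0;
  }

  // Call once per WS_PING_INTERVAL_MS. Tells the caller whether to send
  // another ping or to close the connection.
  onTick(): "ping" | "close" {
    // A tick that arrives while the previous ping is unanswered is a miss.
    if (this.awaitingPong) this.missed += 1;
    if (this.missed >= this.maxMissed) return "close";
    this.awaitingPong = true;
    return "ping";
  }
}
```

Under this interpretation and the defaults (30000 ms, 2 missed), a silent client is closed on the third tick after its last pong, roughly 60 to 90 seconds later.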
| Variable | Default | Purpose |
| --- | --- | --- |
| PRESET_DIR | ./presets | Directory containing <presetId>/voicepreset.json + assets. |
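As a rough illustration of how defaults like these might be resolved, the sketch below reads a handful of the variables from an environment object. The `intFromEnv` and `loadConfig` helpers and the `ServerConfig` shape are hypothetical and not taken from src/server/; only the variable names and default values come from the tables.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: read a non-negative integer, falling back to a default
// when the variable is unset or empty.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number.parseInt(raw, 10);
  if (!Number.isFinite(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

interface ServerConfig {
  port: number;
  renderStoreDir: string;
  renderQueueMaxDepth: number;
  maxRenderDurationSec: number;
  rateLimitRpm: number;
}

// Resolve a subset of the documented variables with their documented defaults.
function loadConfig(env: Env): ServerConfig {
  return {
    port: intFromEnv(env, "PORT", 4321),
    renderStoreDir: env["RENDER_STORE_DIR"] ?? "./renders",
    renderQueueMaxDepth: intFromEnv(env, "RENDER_QUEUE_MAX_DEPTH", 8),
    maxRenderDurationSec: intFromEnv(env, "MAX_RENDER_DURATION_SEC", 300),
    rateLimitRpm: intFromEnv(env, "RATE_LIMIT_RPM", 20),
  };
}
```

In a Node process this would be called as `loadConfig(process.env)`. Failing fast on a malformed value at boot is usually easier to debug than a silently ignored setting.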

The server exposes two health endpoints:

  • GET /api/health — public, returns { ok, version, uptimeSec }. Use this for load-balancer probes.
  • GET /api/health/detailed — auth-required (when AUTH_TOKEN is set). Returns commit hash, render store budget, queue depth, active sessions, and process resource usage.

Of the two, production deployments should expose only /api/health to unauthenticated probes at the public edge. The shipped fly.toml, render.yaml, and Dockerfile all probe this endpoint.
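For a custom probe or smoke test, the public endpoint can be checked with a few lines of TypeScript. This is a sketch under assumptions: the field names come from the description above, but their types (boolean, string, number) are guessed, and `parseHealth` and `probeHealth` are hypothetical names.

```typescript
interface HealthBody {
  ok: boolean;
  version: string;
  uptimeSec: number;
}

// Validate an untrusted JSON value against the assumed { ok, version, uptimeSec } shape.
function parseHealth(body: unknown): HealthBody | null {
  if (typeof body !== "object" || body === null) return null;
  const { ok, version, uptimeSec } = body as Record<string, unknown>;
  if (typeof ok !== "boolean" || typeof version !== "string" || typeof uptimeSec !== "number") {
    return null;
  }
  return { ok, version, uptimeSec };
}

// Probe GET /api/health. Resolves to false on timeout, network error,
// non-2xx status, or an unexpected body.
async function probeHealth(baseUrl: string, timeoutMs = 2000): Promise<boolean> {
  try {
    const res = await fetch(new URL("/api/health", baseUrl), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    if (!res.ok) return false;
    const body = parseHealth(await res.json());
    return body !== null && body.ok;
  } catch {
    return false;
  }
}
```

For example, `await probeHealth("http://localhost:4321")` checks a locally running container on the default port.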

The server listens for SIGTERM and SIGINT and performs a clean shutdown:

  1. Stop accepting new HTTP connections.
  2. Drain in-flight renders (waits up to RENDER_QUEUE_MAX_DEPTH × MAX_RENDER_DURATION_SEC).
  3. Close /ws and /ws/jam sessions with a 1001 going_away.
  4. Flush logs and exit.

Platforms that send SIGKILL after a grace period may cut this short: Render's default is 30 s, and on Fly the window is set by kill_timeout in fly.toml (5 s unless raised). Set MAX_RENDER_DURATION_SEC so that an in-flight render finishes inside the grace window. Long renders should be backed by a queue worker, not a foreground HTTP call.
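The drain bound in step 2 is simple arithmetic, and working it through shows how far the defaults are from a typical grace window. The helper names below are hypothetical; the formula is the one stated in the shutdown steps.

```typescript
// Upper bound on drain time, per step 2:
// RENDER_QUEUE_MAX_DEPTH × MAX_RENDER_DURATION_SEC.
function worstCaseDrainSec(queueMaxDepth: number, maxRenderDurationSec: number): number {
  return queueMaxDepth * maxRenderDurationSec;
}

// True when the worst-case drain fits inside the platform's grace window.
function drainFitsGrace(queueMaxDepth: number, maxRenderDurationSec: number, graceSec: number): boolean {
  return worstCaseDrainSec(queueMaxDepth, maxRenderDurationSec) <= graceSec;
}

// Documented defaults: depth 8, duration cap 300 s.
const defaultDrainSec = worstCaseDrainSec(8, 300); // 2400
const defaultsFit30s = drainFitsGrace(8, 300, 30); // false
```

With the defaults the bound is 2400 s (40 minutes), so a 30 s grace window only guarantees a clean drain when the queue is nearly empty and the remaining renders are short.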

If your cockpit is served from a different origin than the daemon, set ALLOWED_ORIGIN to the cockpit origin (comma-separated for multiple). When unset the server emits no CORS headers — same-origin requests work, cross-origin requests fail at the browser preflight.
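The allowlist behaviour described above can be sketched as a pure function from the request's Origin header to the response headers. The function names are hypothetical, and the exact headers the server emits (for instance whether it also sets Vary) are assumptions.

```typescript
// Split the comma-separated ALLOWED_ORIGIN value into a set of origins.
function parseAllowedOrigins(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((origin) => origin.trim())
      .filter((origin) => origin !== ""),
  );
}

// Return the CORS headers to attach for a given request Origin.
// An empty object means no CORS headers, so the browser blocks cross-origin use.
function corsHeadersFor(origin: string | undefined, allowed: Set<string>): Record<string, string> {
  if (origin === undefined || !allowed.has(origin)) return {};
  return {
    "Access-Control-Allow-Origin": origin,
    // Responses differ per Origin, so caches must key on it.
    "Vary": "Origin",
  };
}
```

Echoing back only the matching origin, rather than `*`, keeps the allowlist exact and stays compatible with credentialed requests.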

The render store is the only stateful surface. Mount a persistent volume at RENDER_STORE_DIR in production:

  • Fly.io: [[mounts]] block (already wired in fly.toml).
  • Render: disk: block (already wired in render.yaml).
  • Docker: -v flag at run time.

If you skip the volume, every redeploy / machine restart loses saved renders. The “Last Render” placeholder is in-memory only and is always lost on restart by design.