Vocal Synth Engine exposes a REST API and two WebSocket endpoints. All endpoints are served from the same Express server.
Authentication is optional. When the AUTH_TOKEN environment variable is set, protected endpoints require a bearer token:
Authorization: Bearer <your-token>
Endpoints marked “Auth: Yes” in the table below are protected when AUTH_TOKEN is configured. When unset, all endpoints are open.
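As a minimal client-side sketch of the rule above, the helper below attaches the Authorization header only when a token is supplied. The function name `buildAuthHeaders` is hypothetical and not part of the engine; it only illustrates the header format.

```typescript
// Hypothetical helper (not part of Vocal Synth Engine): builds the headers a
// client would send. Passing no token models a server with AUTH_TOKEN unset.
function buildAuthHeaders(token?: string): Record<string, string> {
  const headers: Record<string, string> = {};
  if (token !== undefined && token !== "") {
    // Header format taken from the documentation: "Authorization: Bearer <your-token>"
    headers["Authorization"] = `Bearer ${token}`;
  }
  return headers;
}

// Usage: pass the result as the `headers` option of an HTTP client.
const securedHeaders = buildAuthHeaders("my-secret-token");
const openHeaders = buildAuthHeaders();
```

Sending the header to a server without AUTH_TOKEN configured is likely harmless, but omitting it keeps requests minimal.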
Path: /api/health | Method: GET | Auth: No | Description: Server health, version string, and uptime in seconds.
Path: /api/presets | Method: GET | Auth: No | Description: Returns all voice presets with timbres, pitch ranges, and metadata.
Path: /api/phonemize | Method: POST | Auth: Yes | Description: Convert lyrics text to a sequence of phoneme events.
Request body:
Path: /api/render | Method: POST | Auth: Yes | Description: Render a VocalScore to WAV. Returns a render ID for retrieving the result.
Request body:
"preset" : " kokoro-af-heart " ,
Path: /api/renders | Method: GET | Auth: Yes | Description: List all saved renders with metadata.
Path: /api/renders/:id/audio.wav | Method: GET | Auth: Yes | Description: Download the rendered WAV file.
Path: /api/renders/:id/score | Method: GET | Auth: Yes | Description: Retrieve the original score JSON used for this render.
Path: /api/renders/:id/meta | Method: GET | Auth: Yes | Description: Render metadata including preset, polyphony, seed, and timing.
Path: /api/renders/:id/telemetry | Method: GET | Auth: Yes | Description: Performance telemetry: peak dBFS, real-time factor, click count.
Path: /api/renders/:id/provenance | Method: GET | Auth: Yes | Description: Provenance data: commit SHA, score hash, WAV hash, engine config.
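The per-render routes share one pattern: the render ID followed by an artifact name. The sketch below builds those URLs from a base address. The helper name `renderArtifactUrl`, the base URL, and the render ID are illustrative assumptions; only the path segments come from the routes listed above.

```typescript
// Artifact names taken from the documented /api/renders/:id/... routes.
type RenderArtifact = "audio.wav" | "score" | "meta" | "telemetry" | "provenance";

// Hypothetical helper: joins a server base URL, a render ID, and an artifact name.
function renderArtifactUrl(baseUrl: string, renderId: string, artifact: RenderArtifact): string {
  // Drop trailing slashes so the result never contains "//api".
  const root = baseUrl.replace(/\/+$/, "");
  // Encode the ID in case it contains characters that are unsafe in a path segment.
  return `${root}/api/renders/${encodeURIComponent(renderId)}/${artifact}`;
}

// Usage with an assumed local server address and a made-up render ID.
const wavUrl = renderArtifactUrl("http://localhost:3000/", "abc123", "audio.wav");
```

Restricting `artifact` to a union type lets the compiler reject misspelled route names before a request is ever sent.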
Path: /ws | Purpose: Single-user note playback with real-time audio streaming.
The live WebSocket accepts note-on and note-off messages and streams PCM audio blocks back to the client. The cockpit UI’s Live tab uses this endpoint.
Path: /ws/jam | Purpose: Multi-user collaborative sessions with recording.
The jam WebSocket uses a structured JSON protocol. See the Cockpit and Jams page for the full protocol table and session lifecycle.
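Because both WebSocket endpoints are served by the same server as the REST API, a client can derive their addresses from the HTTP base URL. The sketch below shows one way to do that. The helper name `socketUrl` and the example host are assumptions, and the message formats themselves are not modeled here.

```typescript
// The two documented WebSocket paths.
type SocketPath = "/ws" | "/ws/jam";

// Hypothetical helper: maps an http(s) base URL to the matching ws(s) endpoint URL.
function socketUrl(httpBase: string, path: SocketPath): string {
  const root = httpBase
    .replace(/\/+$/, "")      // drop trailing slashes
    .replace(/^http/i, "ws"); // http -> ws, https -> wss
  return root + path;
}

// Usage with an assumed host name.
const liveUrl = socketUrl("http://localhost:3000", "/ws");
const jamUrl = socketUrl("https://synth.example.com/", "/ws/jam");
```

The resulting string can be passed to a WebSocket client constructor in the browser or in Node.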
All API errors return a JSON object:
"message" : " Polyphony limit exceeded " ,
"hint" : " Reduce the number of simultaneous notes or increase the polyphony setting "
Field: code | Type: string | Description: Machine-readable error code
Field: message | Type: string | Description: Human-readable description
Field: hint | Type: string | Description: Suggested fix or next step
HTTP status codes follow standard conventions: 400 for bad requests, 401 for missing/invalid auth, 404 for not found, 500 for server errors.
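A client can check the error shape and status code before showing a message to the user. The following is a minimal sketch under stated assumptions: the names `ApiError`, `isApiError`, `describeStatus`, and `formatFailure` are hypothetical, and the code value `EXAMPLE_CODE` is a placeholder, since the real error codes are not listed here.

```typescript
// Shape of the documented error object: three string fields.
interface ApiError {
  code: string;
  message: string;
  hint: string;
}

// Type guard: true only when all three documented fields are present as strings.
function isApiError(value: unknown): value is ApiError {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["code"] === "string" &&
    typeof record["message"] === "string" &&
    typeof record["hint"] === "string"
  );
}

// Maps the four documented status codes to short labels.
function describeStatus(status: number): string {
  switch (status) {
    case 400: return "bad request";
    case 401: return "missing or invalid auth";
    case 404: return "not found";
    case 500: return "server error";
    default: return "unexpected status";
  }
}

// Combines the status label and the error body into one printable line.
function formatFailure(status: number, body: unknown): string {
  const label = `${status} (${describeStatus(status)})`;
  if (!isApiError(body)) return `${label}: unrecognized error body`;
  return `${label}: ${body.message} [${body.code}]. Hint: ${body.hint}`;
}

// Usage: "EXAMPLE_CODE" is a placeholder, not a real engine error code.
const failureLine = formatFailure(400, {
  code: "EXAMPLE_CODE",
  message: "Polyphony limit exceeded",
  hint: "Reduce the number of simultaneous notes or increase the polyphony setting",
});
```

Validating the body with a type guard keeps the client from failing on responses that do not match the documented shape, such as an HTML error page from a proxy.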