Welcome

Welcome to the Vocal Synth Engine handbook. This guide explains how to synthesize singing voices, play live, and collaborate with others.

Vocal Synth Engine is a deterministic vocal instrument engine built in TypeScript. It renders singing voices from score data using additive synthesis, voice presets, and real-time WebSocket streaming. You can play live via keyboard or MIDI, collaborate in multi-user jam sessions, or render scores to WAV.

  • Additive vocal synthesis with harmonic partials, spectral envelopes, and noise residual
  • 15 voice presets built from Kokoro TTS analysis artifacts and lab experiments
  • Polyphonic rendering with configurable max polyphony and voice stealing
  • Live mode with keyboard, MIDI, and real-time WebSocket audio streaming
  • Multi-user jam sessions with host authority, track ownership, and recording
  • Score input for automatic playback synced to transport
  • Recording and export to WAV with full provenance tracking
  • Lyrics and phonemes via grapheme-to-phoneme pipeline
  • Cockpit UI with piano roll editor, live keyboard, XY pad, and render bank
  • Deterministic output using seeded RNG for reproducible renders
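
To make the first and last items above concrete, here is a minimal sketch of additive synthesis with a seeded noise residual. It is an illustration only, not the engine's actual code: `mulberry32` is a small, commonly used seeded PRNG chosen here for brevity, and `NoteSpec`, `renderNote`, and all parameter names are hypothetical. A fixed array of per-harmonic amplitudes stands in for a real spectral envelope.

```typescript
// Hypothetical sketch: none of these names come from the Vocal Synth Engine codebase.

/** mulberry32: a small seeded PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface NoteSpec {
  f0: number;            // fundamental frequency in Hz
  durationSec: number;   // note length in seconds
  partialAmps: number[]; // amplitude of harmonic k + 1 at index k
  noiseLevel: number;    // gain applied to the noise residual
}

/** Sum sine partials at integer multiples of f0, then add seeded white noise. */
function renderNote(note: NoteSpec, sampleRate: number, seed: number): Float32Array {
  const n = Math.round(note.durationSec * sampleRate);
  const out = new Float32Array(n);
  const rand = mulberry32(seed);
  const nyquist = sampleRate / 2;

  for (let i = 0; i < n; i++) {
    const t = i / sampleRate;
    let sample = 0;
    for (let k = 0; k < note.partialAmps.length; k++) {
      const freq = note.f0 * (k + 1);
      if (freq >= nyquist) break; // higher partials would alias, so stop here
      const amp = note.partialAmps[k] ?? 0;
      sample += amp * Math.sin(2 * Math.PI * freq * t);
    }
    // Noise residual: uniform noise in [-1, 1) drawn from the seeded generator.
    sample += note.noiseLevel * (rand() * 2 - 1);
    out[i] = sample;
  }
  return out;
}

// Usage: the same note and seed give identical buffers; a new seed changes only the noise.
const demoNote: NoteSpec = { f0: 220, durationSec: 0.1, partialAmps: [1, 0.5, 0.25], noiseLevel: 0.01 };
const renderA = renderNote(demoNote, 44100, 42);
const renderB = renderNote(demoNote, 44100, 42);
const renderC = renderNote(demoNote, 44100, 43);
```

Because the only source of randomness is the seeded generator, rendering the same note with the same seed reproduces the buffer sample for sample, which is the property the "deterministic output" item describes.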

Page              Covers
Getting Started   Installation, dev server, first render
Architecture      Engine design, directory layout, synthesis pipeline
Cockpit and Jams  Cockpit UI tabs, jam session protocol
Voice Presets     Preset catalog, timbre data, manifest format
API Reference     REST endpoints, WebSocket paths, auth