Welcome
Welcome to the Vocal Synth Engine handbook. This guide covers everything you need to know to synthesize singing voices, play live, and collaborate with others.
What is Vocal Synth Engine?
Section titled “What is Vocal Synth Engine?”Vocal Synth Engine is a deterministic vocal instrument engine built in TypeScript. It renders singing voices from score data using additive synthesis, voice presets, and real-time WebSocket streaming. You can play live via keyboard or MIDI, collaborate in multi-user jam sessions, or render scores to WAV.
Key capabilities
Section titled “Key capabilities”- Additive vocal synthesis with harmonic partials, spectral envelopes, and noise residual
- 15 voice presets from Kokoro TTS analysis artifacts and lab experiments
- Polyphonic rendering with configurable max polyphony and voice stealing
- Live mode with keyboard, MIDI, and real-time WebSocket audio streaming
- Multi-user jam sessions with host authority, track ownership, and recording
- Score input for automatic playback synced to transport
- Recording and export to WAV with full provenance tracking
- Lyrics and phonemes via grapheme-to-phoneme pipeline
- Cockpit UI with piano roll editor, live keyboard, XY pad, and render bank
- Deterministic output using seeded RNG for reproducible renders
Handbook contents
Section titled “Handbook contents”| Page | Covers |
|---|---|
| Getting Started | Installation, dev server, first render |
| Architecture | Engine design, directory layout, synthesis pipeline |
| Cockpit and Jams | Cockpit UI tabs, jam session protocol |
| Voice Presets | Preset catalog, timbre data, manifest format |
| API Reference | REST endpoints, WebSocket paths, auth |