Voice Presets
Vocal Synth Engine ships with 15 voice presets. Each preset is a frozen analysis artifact containing the spectral characteristics of a singing voice.
Preset catalog
Section titled “Preset catalog”| Preset | Voice | Timbres |
|---|---|---|
default-voice | Baseline female | Default timbre |
bright-lab | Lab/experimental | Bright formant |
kokoro-af-aoede | Aoede (female) | Multiple timbres |
kokoro-af-heart | Heart (female) | Multiple timbres |
kokoro-af-jessica | Jessica (female) | Multiple timbres |
kokoro-af-sky | Sky (female) | Multiple timbres |
kokoro-am-eric | Eric (male) | Multiple timbres |
kokoro-am-fenrir | Fenrir (male) | Multiple timbres |
kokoro-am-liam | Liam (male) | Multiple timbres |
kokoro-am-onyx | Onyx (male) | Multiple timbres |
kokoro-bf-alice | Alice (British female) | Multiple timbres |
kokoro-bf-emma | Emma (British female) | Multiple timbres |
kokoro-bf-isabella | Isabella (British female) | Multiple timbres |
kokoro-bm-george | George (British male) | Multiple timbres |
kokoro-bm-lewis | Lewis (British male) | Multiple timbres |
Binary timbre format
Section titled “Binary timbre format”Each timbre is stored as a set of .f32 binary files containing 32-bit floating-point arrays:
- Harmonic magnitudes — Amplitude values for each harmonic partial across the pitch range
- Spectral envelope — Formant shape curve used to weight harmonic amplitudes
- Noise floor — Broadband noise spectrum for breathiness and consonant texture
These files are loaded at startup and held in memory for zero-latency access during synthesis.
Manifest schema
Section titled “Manifest schema”Each preset directory contains a JSON manifest file describing the voice characteristics:
| Field | Type | Description |
|---|---|---|
name | string | Human-readable preset name |
id | string | Unique preset identifier (used in API calls) |
pitch_range | [min, max] | Supported pitch range in MIDI note numbers |
resonance | object | Formant resonance parameters |
vibrato_defaults | object | Default vibrato rate, depth, and onset delay |
timbres | array | List of available timbres with their binary asset paths |
Inspecting presets
Section titled “Inspecting presets”Use the built-in CLI inspector to examine preset data:
npm run inspectThis prints a table of all loaded presets with their timbre counts, pitch ranges, and file sizes.
You can also query presets through the REST API:
curl http://localhost:4321/api/presetsThe response includes full metadata for every preset, including timbre names and parameter ranges.
Using presets in renders
Section titled “Using presets in renders”When rendering a score (via the cockpit UI or the REST API), specify the preset by its id:
{ "preset": "kokoro-af-heart", "score": { ... }, "polyphony": 8, "seed": 42}The engine resolves the preset, loads its binary timbre data, and uses it for the entire render. Changing the preset produces a different voice while keeping the same score and timing.
Multi-timbre usage
Section titled “Multi-timbre usage”Presets with multiple timbres support real-time timbre morphing via the XY pad in live mode. The X axis interpolates between timbres, blending their spectral characteristics smoothly. In score mode, per-note timbre values select or blend between available timbres.