Voices and Emotions

Original Voice Soundboard ships with 54+ voices and 19 emotions out of the box. You can also describe the style you want in plain English.

Kokoro provides voices across multiple accents and genders. Voice names follow the pattern {accent}{gender}_{name}:

  • af_ — American female
  • am_ — American male
  • bf_ — British female
  • bm_ — British male

from voice_soundboard import VoiceEngine

engine = VoiceEngine()

# American female
result = engine.speak("Hi there!", voice="af_bella")

# British male
result = engine.speak("Cheerio!", voice="bm_george")
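
The naming pattern can also be unpacked mechanically. The helper below is a hypothetical sketch and not part of voice_soundboard: `describe_voice`, `ACCENTS`, and `GENDERS` are names invented for illustration. It recognizes only the four prefixes listed above and rejects anything else, because this page does not document other accent or gender codes.

```python
# Hypothetical helper, not part of voice_soundboard: splits a voice name of
# the form {accent}{gender}_{name} using only the prefixes listed above.
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice: str) -> dict:
    """Return the accent, gender, and name encoded in a voice identifier."""
    prefix, sep, name = voice.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"expected {{accent}}{{gender}}_{{name}}, got {voice!r}")
    accent_code, gender_code = prefix
    if accent_code not in ACCENTS or gender_code not in GENDERS:
        raise ValueError(f"unrecognized prefix {prefix!r} in {voice!r}")
    return {
        "accent": ACCENTS[accent_code],
        "gender": GENDERS[gender_code],
        "name": name,
    }


print(describe_voice("af_bella"))
print(describe_voice("bm_george"))
```

A check like this can catch a mistyped voice name before any audio is generated, though the engine may well accept prefixes that this sketch does not know about.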

Presets bundle a voice, speed, and style into a single name for common use cases:

Preset       Voice       Speed  Style
assistant    af_bella    1.0    Friendly, conversational
narrator     bm_george   0.95   Calm, documentary
announcer    am_michael  1.1    Bold, energetic
storyteller  bf_emma     0.9    Expressive, varied
whisper      af_nicole   0.85   Soft, gentle

result = engine.speak("Breaking news!", preset="announcer")
result = engine.speak("Once upon a time...", preset="storyteller")
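
To make concrete what a preset bundles, the preset table can be written out as a plain lookup. This is only an illustrative sketch: `PRESETS` and `expand_preset` are hypothetical names, the values are copied from the table, and the library resolves presets internally in whatever way it chooses.

```python
# Local copy of the preset table, for illustration only.
# This dict is an assumption about shape, not the library's own data structure.
PRESETS = {
    "assistant": {"voice": "af_bella", "speed": 1.0, "style": "Friendly, conversational"},
    "narrator": {"voice": "bm_george", "speed": 0.95, "style": "Calm, documentary"},
    "announcer": {"voice": "am_michael", "speed": 1.1, "style": "Bold, energetic"},
    "storyteller": {"voice": "bf_emma", "speed": 0.9, "style": "Expressive, varied"},
    "whisper": {"voice": "af_nicole", "speed": 0.85, "style": "Soft, gentle"},
}


def expand_preset(name: str) -> dict:
    """Return a copy of the settings a preset name stands for."""
    try:
        return dict(PRESETS[name])
    except KeyError:
        raise ValueError(
            f"unknown preset {name!r}; choose from {sorted(PRESETS)}"
        ) from None


settings = expand_preset("announcer")
print(settings["voice"], settings["speed"], settings["style"])
```

Returning a copy lets a caller adjust one field, such as the speed, without changing the shared table.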

19 emotions are available for fine-grained tonal control:

happy, sad, angry, excited, calm, fearful, surprised, disgusted, contemptuous, tender, proud, ashamed, guilty, anxious, nostalgic, hopeful, determined, confused, amused

result = engine.speak("I can't believe it!", emotion="surprised")
result = engine.speak("We will make it.", emotion="determined")
result = engine.speak("Those were the days.", emotion="nostalgic")
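
Because emotions are passed as strings, a typo is easy to make. The sketch below validates a name against the 19 emotions listed above and suggests a near match using the standard library's `difflib`. `EMOTIONS` and `check_emotion` are hypothetical names, and how the engine itself reacts to an unknown emotion is not documented on this page.

```python
from difflib import get_close_matches

# The 19 emotion names listed above, copied verbatim.
EMOTIONS = (
    "happy", "sad", "angry", "excited", "calm", "fearful", "surprised",
    "disgusted", "contemptuous", "tender", "proud", "ashamed", "guilty",
    "anxious", "nostalgic", "hopeful", "determined", "confused", "amused",
)


def check_emotion(name: str) -> str:
    """Return the normalized emotion name, or raise with a suggestion."""
    key = name.strip().lower()
    if key in EMOTIONS:
        return key
    hint = get_close_matches(key, EMOTIONS, n=1)
    suggestion = f" Did you mean {hint[0]!r}?" if hint else ""
    raise ValueError(f"unknown emotion {name!r}.{suggestion}")


print(check_emotion("Determined"))
```

The returned value can then be passed on, for example as `engine.speak(text, emotion=check_emotion(user_input))`.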

Instead of picking a preset or emotion by name, describe the style you want in plain English:

result = engine.speak("Good morning!", style="warmly and cheerfully")
result = engine.speak("The results are in.", style="serious, measured, like a news anchor")
result = engine.speak("I missed you.", style="soft and a little sad")

The engine maps your description to the closest combination of voice parameters.
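
This page does not say how the engine performs that mapping. Purely to illustrate the idea of a "closest match", the toy sketch below scores a description against the Style column of the preset table by counting shared words. `PRESET_STYLES` and `closest_preset` are hypothetical, and the real engine is presumably far more capable than exact word overlap.

```python
import re

# Style descriptions copied from the preset table; everything else here is
# a toy illustration and not the engine's actual matching logic.
PRESET_STYLES = {
    "assistant": "Friendly, conversational",
    "narrator": "Calm, documentary",
    "announcer": "Bold, energetic",
    "storyteller": "Expressive, varied",
    "whisper": "Soft, gentle",
}


def _words(text: str) -> set:
    """Lowercase the text and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def closest_preset(description: str):
    """Return the preset sharing the most words with the description, or None."""
    wanted = _words(description)
    scores = {
        name: len(wanted & _words(style))
        for name, style in PRESET_STYLES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(closest_preset("bold and energetic"))
print(closest_preset("soft and a little sad"))
```

Exact word overlap returns `None` for a description such as "warmly and cheerfully", which shares no words with the table. That limitation is a good reason to rely on the engine's own `style` handling instead of a lookup like this one.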

You can layer voice, emotion, and style together:

result = engine.speak(
    "Welcome to the show!",
    voice="am_michael",
    emotion="excited",
    style="bold and energetic",
)

Presets are a convenient starting point, but setting voice, emotion, and style directly gives you the most control.
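
When the options come from user input or configuration, some of them may be unset. One way to handle that, sketched here with a hypothetical `speak_options` helper, is to collect only the values that were actually provided and unpack them into the call. The keyword names `voice`, `emotion`, and `style` are the ones used in the examples above; nothing else about the `speak` signature is assumed.

```python
def speak_options(voice=None, emotion=None, style=None) -> dict:
    """Collect only the options that were actually set (hypothetical helper)."""
    candidates = {"voice": voice, "emotion": emotion, "style": style}
    return {key: value for key, value in candidates.items() if value is not None}


options = speak_options(voice="am_michael", emotion="excited")
print(options)

# With the library installed, the options could be layered like this:
# result = engine.speak("Welcome to the show!", **options)
```

Leaving unset options out of the call, instead of passing `None`, lets the engine fall back to its own defaults.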