What are voice models

HeyGen uses advanced voice engines to power avatars with natural speech, emotional depth, and authentic localization. Each engine is built for a specific purpose. Some prioritize professional realism, others focus on expressive delivery, and others specialize in accurate global accents. Understanding what these engines do, and when to use each one, helps you get stronger and more consistent results from your avatars.

A voice engine is the AI system behind synthetic speech. It turns text into audio and shapes qualities like pacing, tone, pronunciation, and emotion. Some engines are trained on recordings of real people, allowing them to closely replicate a specific voice. Others rely on broad multilingual datasets or models designed for expressive range. In combination with tools like Voice Director and Voice Mirroring, voice engines are what make avatars sound more human, more natural, and more adaptable across different use cases.

HeyGen gives you several engine options. ElevenLabs is the default for custom voices and works well for most projects. It provides a strong mix of clarity, realism, and emotional control. For more tailored needs, you can try alternatives like Panda or Starfish. Each brings its own strengths, whether that’s wider expressive range, smoother localization, or broader accent support. The best approach is to experiment and choose the engine that matches your script and your audience.

Switching engines is straightforward. In AI Studio, open your project and go to the Script Panel. Select your current voice, and you’ll see a dropdown with the available engines. Pick the one you want, and it applies immediately to the scene you’re working on. If you want the entire video to use the same voice setup, choose the option to apply the voice to all scenes.

There are a few constraints to keep in mind. You can switch engines for all custom voices created directly in HeyGen, as well as for certain public voices. However, professional voice clones and some third-party voices can’t be reassigned to a different engine. Those voices always remain tied to their original model.

By understanding how each engine works, you can choose the one that best supports your script, your goals, and the experience you want to deliver. Whether you’re aiming for polished professionalism, expressive storytelling, or accurate global localization, the right engine helps your avatars speak in a way that feels natural and aligned with your message.
