Date: 2023-05-24

The Summary

Avatar IV API lets you generate lifelike talking videos from any photo, with clean lip-sync, expressive facial motion, and natural gestures.

The Long Version

Earlier this summer, we launched Avatar IV, our most advanced image-to-video model ever, in the HeyGen web app. Today, I’m excited to share that Avatar IV is now available via API, allowing you to embed it directly into your product experiences.

What’s new: Avatar IV, programmatic

With the Avatar IV API, a photo and a script are all you need to generate a realistic talking video, featuring all the HeyGen magic: accurate lip-sync, expressive facial movement, and even authentic hand gestures. Now you can trigger that exact flow with our API inside your app or workflow.

Under the hood, Avatar IV supports angled or profile photos and works across lifelike or stylized characters (humans, anime, and even pets). It’s built to handle real-world inputs while maintaining natural timing and motion.

Why partners are excited

Frictionless creation from a photo. Let users start with an existing image (or a generated one) and produce a studio-quality avatar performance from just a script. No cameras. No shoots. Seconds, not days.

More expressive, more engaging. Avatar IV syncs voice with emotion and powers gesture-aware motion, making every message land with clarity and charisma.

Flexible by design. Works with front-facing, angled, or profile images; supports lifelike and stylized outputs for on-brand experiences across education, CX, marketing, internal comms, and more.

Built for developers and product teams. Spin up jobs with a simple POST to our video generation endpoints; use photo-avatar endpoints when you want to programmatically add motion and sound effects or manage avatar assets at scale.
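To make the "simple POST" idea concrete, here is a minimal Python sketch of what submitting a photo and a script might look like. Treat it as an illustration under stated assumptions, not the documented interface: the URL is a placeholder, and the `photo_url` and `script` field names and the bearer-token header are hypothetical. The real endpoint path, payload schema, and authentication scheme are in the Avatar IV API documentation.

```python
import json
import urllib.request

# Placeholder URL (assumption): replace with the endpoint from the API documentation.
API_URL = "https://api.example.com/avatar-iv/generate"


def build_generation_request(api_key: str, photo_url: str, script: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a photo reference and a script."""
    if not script.strip():
        raise ValueError("script must not be empty")
    payload = {
        "photo_url": photo_url,  # assumed field name
        "script": script,        # assumed field name
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any network call is made.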

Here’s an example Avatar IV video to show the model’s realism and gesture quality.

Common partner use cases

Learning and development (L&D) and education: Turn slide thumbnails or instructor headshots into narrated modules and localize at scale.

Sales and CX platforms: Auto-generate personalized explainers from a CRM record and a single photo of yourself, so you can build personal relationships with your audience through video.

Creative & marketing tools: Provide users with an “instant host” for product demos, social content, or ads, without needing to book talent.
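As a rough illustration of the sales and CX use case, the sketch below turns CRM records into per-recipient scripts using Python's standard `string.Template`. The template wording and the record fields (`first_name`, `product`, `company`) are hypothetical; a real integration would map whatever fields your CRM exposes, and each resulting script could then be paired with a photo in a video generation request.

```python
from string import Template

# Hypothetical script template; the placeholder names are illustrative only.
SCRIPT_TEMPLATE = Template(
    "Hi $first_name, thanks for your interest in $product. "
    "Here is a quick walkthrough made for $company."
)


def personalize_script(record: dict) -> str:
    """Fill the template from one CRM record; raises KeyError if a field is missing."""
    return SCRIPT_TEMPLATE.substitute(record)


# Example records (made-up data for illustration).
records = [
    {"first_name": "Ada", "product": "Acme Analytics", "company": "Initech"},
    {"first_name": "Lin", "product": "Acme Analytics", "company": "Globex"},
]

# One script per record, ready to be sent along with a photo.
scripts = [personalize_script(record) for record in records]
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a record with a missing field fails loudly instead of producing a video that reads a raw placeholder aloud.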

Plans and availability

Avatar IV API is self-serve on our Pro and Scale tiers, with Enterprise available for custom rates and high-volume usage. You can find the Avatar IV API documentation here.

A note on quality and control

We purpose-built Avatar IV to feel and look natural in real product UX, with clean lip-sync, nuanced facial dynamics, and gesture timing that follows the script. It also pairs beautifully with our voice tools for delivery control, so content sounds as intentional as it looks.

Get started with Avatar IV today.