ElevenLabs V3 Conversational: Lip-sync delay/desync

Hi HeyGen team,

I’m building a real-time interactive LiveAvatar in Custom mode using the ElevenLabs Agent Plugin. The ElevenLabs agent uses V3 as its TTS model family, mainly for its better support of Telugu ("te"), but the issue is not specific to that language. The main problem is that the lip-sync feels noticeably off or delayed:

The avatar’s mouth movements don’t line up with the audio timing. To make the sync look acceptable, I have to artificially speed up the TTS output (around 1.1–1.3×), but this raises the pitch and makes the voice sound unnatural, almost chipmunk-like.

For V3 models, the Speed, Stability, and Similarity sliders can’t be adjusted (they are greyed out or hidden).

Questions:

1. Is there a hidden, beta, or API-level way to adjust speed or fine-tune TTS timing for V3 in LiveAvatar Custom mode without pitch distortion?
2. Are there known workarounds or upcoming fixes for better lip-sync when using external ElevenLabs V3 voices in real-time avatars?
3. Would using HeyGen’s native voices (even if less ideal for my target language) or a different TTS provider give noticeably better out-of-the-box sync?

Happy to share more config details, screenshots, or a short demo clip if that helps.
Thanks in advance for any advice or insights!