Custom LiveAvatar + ElevenLabs: No Real-Time Interruption Handling

Problem:
Custom LiveAvatar mode supports ElevenLabs voice integration but, unlike full mode, lacks real-time interruption handling.

Current Flow (High Latency, No Interruption):

  • User speech → ElevenLabs WebSocket STT → text
  • Text → LLM → response text
  • Response text → ElevenLabs TTS → audio
  • Audio → HeyGen Custom Avatar API → final output
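
For illustration, the flow above can be sketched as four awaited calls in a row. This is a minimal sketch, not the real integration: `stage`, `run_turn`, and the delays in `STAGE_DELAYS` are hypothetical stand-ins for the ElevenLabs, LLM, and HeyGen calls, and no real API is invoked.

```python
import asyncio

# Hypothetical per-stage delays in seconds; real values depend on the services.
STAGE_DELAYS = {"stt": 0.03, "llm": 0.05, "tts": 0.04, "avatar": 0.02}


async def stage(name: str, payload: str, trace: list) -> str:
    """Stand-in for one remote call: it only sleeps and tags the payload."""
    await asyncio.sleep(STAGE_DELAYS[name])
    trace.append(name)
    return f"{name}({payload})"


async def run_turn(user_audio: str):
    """Run one conversation cycle with every stage awaited in sequence."""
    trace: list = []
    text = await stage("stt", user_audio, trace)   # speech -> text
    reply = await stage("llm", text, trace)        # text -> response text
    audio = await stage("tts", reply, trace)       # response text -> audio
    output = await stage("avatar", audio, trace)   # audio -> avatar output
    return output, trace


if __name__ == "__main__":
    output, trace = asyncio.run(run_turn("user_audio"))
    print(output)
    print(trace, "minimum latency (s):", sum(STAGE_DELAYS.values()))
```

Because each stage starts only after the previous one returns, the latency of one cycle is at least the sum of all four stage latencies.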

Issues:

  • Multiple sequential API calls create high latency
  • No interruption support - each conversation cycle must complete before the next user input can be processed
  • Feels like batch processing, not real-time voice-to-voice interaction
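
To make the missing behavior concrete, here is a minimal sketch of interruption (barge-in) handling using task cancellation. It assumes the response can be produced in chunks and that cancelling the in-flight task is enough to stop playback; whether the Custom LiveAvatar API allows stopping audio that was already submitted is not confirmed here. `Session`, `_respond`, and `on_user_speech` are hypothetical names, and the avatar is stubbed with a list.

```python
import asyncio
from typing import Optional


class Session:
    """Keeps at most one response in flight; new input cancels the old one."""

    def __init__(self) -> None:
        self.current: Optional[asyncio.Task] = None
        self.played: list = []  # chunks that reached the (stubbed) avatar

    async def _respond(self, text: str) -> None:
        # Stand-in for LLM -> TTS -> avatar streaming, one chunk at a time.
        for i in range(5):
            await asyncio.sleep(0.02)  # hypothetical per-chunk delay
            self.played.append(f"{text}#{i}")

    async def on_user_speech(self, text: str) -> None:
        """Barge-in: cancel the in-flight response, then start a new one."""
        if self.current is not None and not self.current.done():
            self.current.cancel()
            try:
                await self.current  # wait until the old task has stopped
            except asyncio.CancelledError:
                pass  # expected: the old response was interrupted
        self.current = asyncio.create_task(self._respond(text))


async def demo() -> list:
    session = Session()
    await session.on_user_speech("first")
    await asyncio.sleep(0.03)  # user interrupts mid-response
    await session.on_user_speech("second")
    await session.current      # let the second response finish
    return session.played


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the first response is cut off after its early chunks and the second response plays in full, which is the behavior full mode appears to provide natively.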

Full Mode Behavior (Works Well):

  • Native real-time interruption handling
  • Does not support ElevenLabs voice

Expectation:
Custom LiveAvatar + ElevenLabs voice with real-time interruption handling (like full mode)

Questions:

  1. How can real-time interruption handling be achieved in Custom LiveAvatar mode while using ElevenLabs voice? Is there a streaming or parallel-processing approach that eliminates the sequential API waits?
  2. Is Custom LiveAvatar mode planned for deprecation?

Tech Stack: ElevenLabs STT/TTS WebSocket + LLM + HeyGen Custom LiveAvatar API
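
The kind of streaming approach asked about in question 1 might look roughly like the sketch below, where stages run concurrently and pass partial results through queues instead of waiting for complete outputs. This is an assumption about a possible design, not a confirmed HeyGen or ElevenLabs feature: `llm_stream`, `tts_stream`, `avatar_sink`, and `run_pipeline` are hypothetical stubs, and it presumes the avatar endpoint can accept audio chunk by chunk.

```python
import asyncio


async def llm_stream(prompt: str, out: asyncio.Queue) -> None:
    """Stand-in LLM that emits the reply sentence by sentence."""
    for sentence in (f"{prompt}: part one.", f"{prompt}: part two."):
        await asyncio.sleep(0.02)  # hypothetical generation delay
        await out.put(sentence)
    await out.put(None)  # end-of-stream marker


async def tts_stream(inp: asyncio.Queue, out: asyncio.Queue) -> None:
    """Stand-in TTS that converts each sentence as soon as it arrives."""
    while (sentence := await inp.get()) is not None:
        await asyncio.sleep(0.02)  # hypothetical synthesis delay
        await out.put(f"audio[{sentence}]")
    await out.put(None)  # forward the end-of-stream marker


async def avatar_sink(inp: asyncio.Queue, log: list) -> None:
    """Stand-in avatar that records each audio chunk it receives."""
    while (chunk := await inp.get()) is not None:
        log.append(chunk)


async def run_pipeline(prompt: str) -> list:
    text_q: asyncio.Queue = asyncio.Queue()
    audio_q: asyncio.Queue = asyncio.Queue()
    log: list = []
    # All three stages run concurrently; TTS starts on the first sentence
    # while the LLM stub is still producing the second one.
    await asyncio.gather(
        llm_stream(prompt, text_q),
        tts_stream(text_q, audio_q),
        avatar_sink(audio_q, log),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("hello")))
```

With this structure the time to first audio depends on the first sentence only, not on the full response, and the whole pipeline could be wrapped in a single task so that an interruption cancels every stage at once.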