Custom Mode Latency: Can LiveAvatar Accept Audio Directly From Backend or Speak From Text?
20 days ago
We are using Custom Mode with LiveAvatar and generating speech using ElevenLabs.
Current flow:
- Text → audio generated via ElevenLabs (server)
- Audio sent from backend to frontend
- Frontend streams audio to LiveAvatar
Because the audio is fully generated first and then travels backend → frontend → LiveAvatar, every utterance pays for an extra network hop, and this adds noticeable latency.
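For context, this is roughly how the delay could be broken down per hop. It is only a minimal sketch: the class, the field names, and the sample numbers are hypothetical placeholders (not real measurements, and not part of any LiveAvatar or ElevenLabs API), and it assumes the backend and frontend clocks are synchronized closely enough to compare timestamps.

```python
from dataclasses import dataclass


@dataclass
class HopTimestamps:
    """Millisecond timestamps recorded at each stage (all names hypothetical)."""
    text_ready: int         # backend has the text to speak
    audio_generated: int    # TTS audio is available on the backend
    audio_at_frontend: int  # frontend has received the audio
    audio_at_avatar: int    # frontend has handed the audio to the avatar session


def hop_latencies_ms(t: HopTimestamps) -> dict[str, int]:
    """Return the time spent in each hop, plus the end-to-end total."""
    return {
        "tts_generation": t.audio_generated - t.text_ready,
        "backend_to_frontend": t.audio_at_frontend - t.audio_generated,
        "frontend_to_avatar": t.audio_at_avatar - t.audio_at_frontend,
        "total": t.audio_at_avatar - t.text_ready,
    }


# Illustrative values only, chosen to show the shape of the breakdown.
sample = HopTimestamps(
    text_ready=0,
    audio_generated=400,
    audio_at_frontend=550,
    audio_at_avatar=700,
)
breakdown = hop_latencies_ms(sample)
print(breakdown)
```

The "backend_to_frontend" entry is the hop we would like to eliminate by sending audio to LiveAvatar straight from the backend.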
Questions:
- Is it possible to send the generated audio directly from the backend to LiveAvatar, skipping the frontend?
- Or does Custom Mode support making the avatar speak directly from text (server-side), without us handling audio streaming manually?
We’re trying to reduce end-to-end latency as much as possible.
We don’t want to use Full Mode because we already have our own backend LLM stack, and paying nearly 2× more for features we don’t use doesn’t make sense for us.