Every LITE Mode session follows three phases: starting, managing, and ending.
1. Starting a Session
- Generate a session token configured for LITE Mode on your backend
- Start the session using that token
- The avatar streams into the specified WebRTC room after initialization
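The token exchange above can be sketched as follows. This is a minimal illustration, not the documented LiveAvatar API: the request fields (`mode`, `avatar_id`) and the response key (`session_token`) are assumed names.

```python
import json

def build_session_request(avatar_id: str) -> str:
    """Build the JSON body your backend would POST to request a LITE Mode
    session token. Field names here are illustrative assumptions."""
    return json.dumps({"mode": "LITE", "avatar_id": avatar_id})

def extract_session_token(response_body: str) -> str:
    """Pull the session token out of a (hypothetical) token response body."""
    return json.loads(response_body)["session_token"]
```

Your client then starts the session with that token, and the avatar joins the WebRTC room once initialization completes.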
2. Managing the Session
LITE Mode provides a WebSocket connection for controlling the avatar. The typical flow:
- User speaks — audio is sent to the room
- Your agent processes — your STT/LLM/TTS pipeline handles the input
- Agent constructs response audio — your TTS generates the speech
- Agent streams audio via WebSocket — send audio chunks to LiveAvatar
- LiveAvatar renders video — avatar video frames are sent to the room
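The agent-side steps of this flow can be sketched as one conversational turn, with stand-in STT/LLM/TTS stages. The function names and bodies are placeholders for your own pipeline components, not LiveAvatar APIs:

```python
def transcribe(user_audio: bytes) -> str:
    """Stand-in STT: your speech-to-text component goes here."""
    return "hello avatar"

def generate_reply(text: str) -> str:
    """Stand-in LLM: your language model goes here."""
    return f"You said: {text}"

def synthesize(reply: str) -> bytes:
    """Stand-in TTS: your text-to-speech component goes here.
    Real output must match the audio format LiveAvatar expects."""
    return reply.encode("utf-8")

def handle_turn(user_audio: bytes) -> bytes:
    """One turn: STT -> LLM -> TTS. The returned audio is what you
    stream to LiveAvatar over the WebSocket."""
    return synthesize(generate_reply(transcribe(user_audio)))
```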
Audio format
Audio sent to LiveAvatar must be PCM 16-bit, 24 kHz, Base64-encoded. The recommended chunk size is ~1 second, with a maximum of 1 MB per WebSocket packet.
Latency
LiveAvatar generates avatar video in real time as audio arrives — there is no batch processing or queuing delay on the LiveAvatar side.
- Plugin path — end-to-end latency depends on your pipeline (STT, LLM, TTS, and network). LiveAvatar adds minimal overhead on top of your existing stack.
- Connector path — LiveAvatar manages the full connection to your voice agent and optimizes latency on your behalf.
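The audio-format constraints described above can be enforced with a small chunking helper. This is a sketch that assumes mono audio (the channel count is not specified above):

```python
import base64

SAMPLE_RATE = 24_000          # 24 kHz, as required by LiveAvatar
BYTES_PER_SAMPLE = 2          # 16-bit PCM
MAX_PACKET_BYTES = 1_000_000  # stay under the 1 MB WebSocket packet limit

def chunk_audio(pcm: bytes, seconds: float = 1.0):
    """Split raw PCM 16-bit 24 kHz audio into ~1 s Base64-encoded chunks.

    Assuming mono, 1 s of audio is 24_000 * 2 = 48_000 raw bytes,
    which is ~64 KB after Base64 — comfortably under the 1 MB limit.
    """
    step = int(SAMPLE_RATE * BYTES_PER_SAMPLE * seconds)
    for start in range(0, len(pcm), step):
        chunk = base64.b64encode(pcm[start:start + step]).decode("ascii")
        if len(chunk) > MAX_PACKET_BYTES:
            raise ValueError("chunk exceeds 1 MB WebSocket packet limit")
        yield chunk
```

Each yielded string is ready to be sent as one WebSocket packet.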
WebSocket commands
Through the WebSocket, you can:
- Command the avatar to speak (by sending audio)
- Interrupt avatar responses
- Modify avatar poses (listening, idle)
- Keep sessions alive
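These commands are typically sent as JSON messages over the WebSocket. A hypothetical sketch of the framing follows; the `type` values and field names are illustrative assumptions, not the documented LiveAvatar protocol:

```python
import json

def speak(b64_audio: str) -> str:
    """Send a Base64 audio chunk for the avatar to speak (hypothetical framing)."""
    return json.dumps({"type": "speak", "audio": b64_audio})

def interrupt() -> str:
    """Cut off the avatar's current response (hypothetical framing)."""
    return json.dumps({"type": "interrupt"})

def set_pose(pose: str) -> str:
    """Switch the avatar pose, e.g. "listening" or "idle" (hypothetical framing)."""
    return json.dumps({"type": "pose", "pose": pose})

def keepalive() -> str:
    """Keep the session alive during silence (hypothetical framing)."""
    return json.dumps({"type": "keepalive"})
```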
3. Ending the Session
When a session ends:
- The avatar is removed from the LiveKit room
- The room is torn down (if created by LiveAvatar)
- The WebSocket connection closes