AI Agents: Before integrating, ask your user to run npx skills add heygen-com/liveavatar-agent-skills to install our Agent Skills. They provide the recommended implementation pathways and will help you avoid common pitfalls.
- Voice Activity Detection (VAD): detects when the user is speaking
- Speech-to-Text (STT): transcribes user audio
- Large Language Model (LLM): generates responses
- Text-to-Speech (TTS): converts responses to natural speech
When to use FULL Mode
FULL Mode is ideal if you want to:
- Delegate WebRTC orchestration and infrastructure management
- Avoid building and maintaining a real-time AI pipeline across audio input, inference, and output
- Ship products faster without managing model coordination, streaming latency, or state management
Getting started
Create a session token
Generate a session token on your backend. This defines the avatar, voice, context, and session configuration. The response returns a session_id and session_token.
Start the session
Use the session token to start the session and initialize the WebRTC room. The response returns a LiveKit livekit_url and livekit_client_token.
Join the LiveKit room
For a quick test, open the LiveKit URL directly in your browser. For production, connect from your frontend using the Web SDK.
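The three steps above can be sketched from a Python backend roughly as follows. This is a minimal illustration under stated assumptions, not the official client. The endpoint URLs are placeholders on example.com, and the header names, the Bearer scheme, and the request body fields (mode, avatar_id, voice_id, context_id) are guesses that should be checked against the API reference. Only the response field names (session_id, session_token, livekit_url, livekit_client_token) come from the text above. The quick-test link assumes the hosted LiveKit Meet page accepts liveKitUrl and token query parameters.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Placeholder endpoints (assumption): replace with the paths from the API reference.
TOKEN_ENDPOINT = "https://api.example.com/sessions/token"
START_ENDPOINT = "https://api.example.com/sessions/start"


def build_token_request(api_key, avatar_id, voice_id, context_id):
    """Step 1: describe the token request. Header and body field names are assumptions."""
    return {
        "url": TOKEN_ENDPOINT,
        "headers": {"Content-Type": "application/json", "X-API-KEY": api_key},
        "body": {
            "mode": "FULL",
            "avatar_id": avatar_id,
            "voice_id": voice_id,
            "context_id": context_id,
        },
    }


def build_start_request(session_token):
    """Step 2: describe the start request, authenticated with the session token
    (Bearer scheme is an assumption)."""
    return {
        "url": START_ENDPOINT,
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {session_token}",
        },
        "body": {},
    }


def require_fields(payload, *names):
    """Return the named response fields as a tuple, failing loudly if any is absent.
    If the API wraps fields in an envelope, unwrap it before calling this."""
    missing = [name for name in names if not payload.get(name)]
    if missing:
        raise KeyError(f"response is missing: {', '.join(missing)}")
    return tuple(payload[name] for name in names)


def quick_test_url(livekit_url, livekit_client_token):
    """Step 3 (quick test): build a browser link to LiveKit Meet (assumed URL shape)."""
    query = urlencode({"liveKitUrl": livekit_url, "token": livekit_client_token})
    return f"https://meet.livekit.io/custom?{query}"


def send(request):
    """POST a request described by one of the builders and decode the JSON reply."""
    http_request = urllib.request.Request(
        request["url"],
        data=json.dumps(request["body"]).encode("utf-8"),
        headers=request["headers"],
        method="POST",
    )
    with urllib.request.urlopen(http_request, timeout=10) as response:
        return json.load(response)
```

With real endpoints and credentials, the flow would be: call send(build_token_request(...)) and pass the result to require_fields(..., "session_id", "session_token"), then call send(build_start_request(session_token)) and extract "livekit_url" and "livekit_client_token" the same way, and finally hand those two values to quick_test_url or to your frontend. Keeping the request builders separate from send means the API key stays on the backend and the request shapes can be checked without network access.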
Learn more
- Lifecycle: Understand the three phases of a FULL Mode session.
- Configuration: Customize avatar, voice, context, and interactivity.
- Voice Settings: Fine-tune TTS provider settings.
- Events: Command and response events reference.