Overview of LiveAvatar

LiveAvatars are programmable interfaces that give a human touch to your AI Agents!

Introduction

A LiveAvatar is an AI-powered digital human that can interact with users in real time with both audio and video. Developers can integrate LiveAvatars into their apps or websites to create natural, conversational experiences such as:

  • Real-time product demos or virtual sales assistants
  • AI-powered support or training agents
  • Interactive hosts, tutors, or characters

At the heart of the system is the LiveAvatar Session — a live, persistent connection that allows users to speak or chat with an avatar. The session handles:

  • Streaming in user input (voice or text)
  • Feeding it into a language model (LLM) and generating a response
  • Rendering the response as synchronized speech and video

Because these steps run continuously over the live connection, conversations feel dynamic, personal, and real-time.
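The three steps a session handles can be pictured as a single conversational turn. The sketch below is only an illustration of that flow, not the LiveAvatar API: `transcribe`, `render`, `handle_turn`, `echo_llm`, and `AvatarOutput` are hypothetical names, and the speech and video steps are replaced with placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvatarOutput:
    """Hypothetical stand-in for one rendered avatar response."""
    text: str
    audio_ms: int  # placeholder duration of the synthesized speech
    video_ms: int  # placeholder duration of the rendered video


def transcribe(user_input: str) -> str:
    # Placeholder for step 1: voice input would be converted to text here;
    # typed text simply passes through after trimming whitespace.
    return user_input.strip()


def render(reply: str, ms_per_char: int = 60) -> AvatarOutput:
    # Placeholder for step 3: giving both tracks the same duration
    # stands in for "synchronized speech and video".
    duration = len(reply) * ms_per_char
    return AvatarOutput(text=reply, audio_ms=duration, video_ms=duration)


def handle_turn(user_input: str, llm: Callable[[str], str]) -> AvatarOutput:
    """One turn: user input -> language model -> rendered response."""
    prompt = transcribe(user_input)
    reply = llm(prompt)  # step 2: generate a response
    return render(reply)


def echo_llm(prompt: str) -> str:
    # Toy language model, used only to make the sketch runnable.
    return f"You said: {prompt}"


out = handle_turn("  hello  ", llm=echo_llm)
print(out.text)  # prints "You said: hello"
```

In a real session these steps would run as a continuous stream rather than as one blocking function call; the sketch only shows the order in which data moves.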


Getting Started with LiveAvatar

We have two main ways to build with LiveAvatar: FULL mode and CUSTOM mode.

  1. FULL mode: We host and manage the services needed to make the avatar conversation flow, including ASR, LLM, TTS, and, most importantly, the avatar streaming in a WebRTC room. You control the finer details, such as the avatar, voice, and context used, and rely on our default configurations for the rest.
  2. CUSTOM mode: We manage only the avatar streaming framework. You control the WebRTC integration and bring your own STT, LLM, and TTS frameworks. You'll also need to manage the associated orchestration and infrastructure.
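The split of responsibilities between the two modes can be summarized as a small lookup table. This is a sketch for comparison only: the component keys and the `developer_managed` helper are hypothetical labels chosen for this example, not configuration names from the LiveAvatar API.

```python
from enum import Enum


class Mode(Enum):
    FULL = "FULL"
    CUSTOM = "CUSTOM"


# Pipeline components named in the mode descriptions (labels are illustrative).
COMPONENTS = (
    "speech_recognition",
    "llm",
    "tts",
    "webrtc_integration",
    "avatar_streaming",
)

# Who runs each component in each mode, following the descriptions above.
RESPONSIBILITY = {
    Mode.FULL: {component: "platform" for component in COMPONENTS},
    Mode.CUSTOM: {
        "speech_recognition": "developer",
        "llm": "developer",
        "tts": "developer",
        "webrtc_integration": "developer",
        "avatar_streaming": "platform",
    },
}


def developer_managed(mode: Mode) -> list[str]:
    """Return the components the developer must run in the given mode."""
    return [c for c in COMPONENTS if RESPONSIBILITY[mode][c] == "developer"]


print(developer_managed(Mode.FULL))
print(developer_managed(Mode.CUSTOM))
```

In FULL mode the list is empty because the developer only chooses settings such as avatar, voice, and context; in CUSTOM mode everything except avatar streaming falls to the developer.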

LiveAvatar Session

The Session is a core piece of the LiveAvatar API — it represents a single, continuous stream of interactions between a user and an avatar. Developers can observe and control interactions through a set of events and callbacks.

Under the hood, the Session manages connectivity, user input, avatar output, and conversational state. Throughout the session’s lifetime, the API emits events that let developer applications stay in sync with what’s happening — whether that’s a user speaking, the avatar responding, or the connection closing.
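The event-and-callback pattern described here can be sketched with a toy emitter. Everything in this example is an assumption made for illustration: the `SessionEvents` class, the event names (`user_speaking`, `avatar_responding`, `connection_closed`), and the payload fields are hypothetical and are not taken from the LiveAvatar SDK.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, str]], None]


class SessionEvents:
    """Toy event emitter illustrating the pattern; not the LiveAvatar SDK."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        # Register a callback to run whenever `event` is emitted.
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: Dict[str, str]) -> None:
        # Call every registered callback, in registration order.
        for handler in self._handlers.get(event, []):
            handler(payload)


log = []
session = SessionEvents()

# The application subscribes to the moments it cares about.
session.on("user_speaking", lambda p: log.append(("user", p["text"])))
session.on("avatar_responding", lambda p: log.append(("avatar", p["text"])))
session.on("connection_closed", lambda p: log.append(("closed", p["reason"])))

# Simulate a short session lifetime.
session.emit("user_speaking", {"text": "Hi"})
session.emit("avatar_responding", {"text": "Hello!"})
session.emit("connection_closed", {"reason": "client_disconnect"})

print(log)
```

The application never polls for state; it registers handlers once and reacts as the session reports what is happening.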


Next Steps

If you want to learn more, please check out our Quickstart guide next.


