Overview of LiveAvatar

LiveAvatars are programmable interfaces that give a human touch to your AI Agents!

Introduction

A LiveAvatar is an AI-powered digital human that can interact with users in real time with both audio and video. Developers can integrate LiveAvatars into their apps or websites to create natural, conversational experiences such as:

  • Real-time product demos or virtual sales assistants
  • AI-powered support or training agents
  • Interactive hosts, tutors, or characters

At the heart of the system is the LiveAvatar Session — a live, persistent connection that allows users to speak or chat with an avatar. The session handles:

  • Streaming in user input (voice or text)
  • Feeding it into a language model (LLM) and generating a response
  • Rendering the response with synchronized speech and video

This means every conversation feels dynamic, personal, and fully real-time.
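As a rough mental model, the sketch below walks one conversational turn through those three steps in TypeScript. It is illustrative only: the type and function names (UserTurn, AvatarTurn, runTurn) are assumptions made for this example, not part of the LiveAvatar API.

    // Illustrative sketch only: these names are assumptions, not the LiveAvatar API.

    // 1. User input streamed into the session (voice is transcribed to text).
    interface UserTurn {
      kind: "voice" | "text";
      text: string;                // transcript or typed message
    }

    // 3. What the session streams back: synthesized speech plus synchronized video.
    interface AvatarTurn {
      replyText: string;           // LLM-generated response
      audio: ArrayBuffer;          // speech audio
      video: ArrayBuffer;          // lip-synced avatar frames
    }

    // 2. Feed the input into a language model, then render the reply.
    async function runTurn(
      input: UserTurn,
      generateReply: (prompt: string) => Promise<string>,
      render: (text: string) => Promise<{ audio: ArrayBuffer; video: ArrayBuffer }>,
    ): Promise<AvatarTurn> {
      const replyText = await generateReply(input.text);
      const { audio, video } = await render(replyText);
      return { replyText, audio, video };
    }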


LiveAvatar Configuration

When you start a session, three main layers work together:

  1. Avatar (Visual Layer) — Defines what the avatar looks like. Choose from a wide selection of avatars, each with unique styles, appearances, and expressions.
  2. Voice (Audio Layer) — Defines what the avatar sounds like. Choose a voice that fits your needs — from calm and professional to energetic or youthful.
  3. Context (Cognitive Layer) — Defines how the avatar thinks and responds. Control the personality traits, background knowledge, and behavior, which guide how the LLM generates responses.

Each of these layers can be configured when creating a session, giving developers precise control over the avatar’s look, sound, and intelligence.
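To make the three layers concrete, here is one way a session configuration could be expressed in TypeScript. The field names (avatarId, voiceId, context) are assumptions for this sketch; the actual parameter names are defined in the API reference.

    // Hypothetical configuration shape: the field names are assumptions for
    // this sketch, not the documented LiveAvatar parameters.
    interface LiveAvatarSessionConfig {
      avatarId: string;   // Visual layer: which avatar to render
      voiceId: string;    // Audio layer: which voice the avatar speaks with
      context: {          // Cognitive layer: how the LLM should respond
        persona: string;         // personality traits and role
        knowledge: string[];     // background knowledge to draw on
        instructions: string;    // behavioral guidance for the LLM
      };
    }

    const demoConfig: LiveAvatarSessionConfig = {
      avatarId: "friendly-host",
      voiceId: "calm-professional",
      context: {
        persona: "A patient product specialist who keeps answers short.",
        knowledge: ["Product pricing tiers", "Feature overview"],
        instructions: "Greet the user, then answer questions about the product.",
      },
    };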


LiveAvatar Session

The Session is a core piece of the LiveAvatar API — it represents a single, continuous stream of interactions between a user and an avatar. Developers can observe and control interactions through a set of events and callbacks.

Under the hood, the Session manages connectivity, user input, avatar output, and conversational state. Throughout the session’s lifetime, the API emits events that let developer applications stay in sync with what’s happening — whether that’s a user speaking, the avatar responding, or the connection closing.
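As an illustration of that callback model, the sketch below subscribes to a few lifecycle events. The event names (user_speaking, avatar_speaking, session_closed) and the on method are assumptions for this example; the real event list is documented in the API reference.

    // Minimal event-handling sketch. The Session interface and event names are
    // assumptions for illustration, not the actual LiveAvatar SDK surface.
    type SessionEvent = "user_speaking" | "avatar_speaking" | "session_closed";

    interface Session {
      on(event: SessionEvent, handler: () => void): void;
    }

    function observe(session: Session): void {
      session.on("user_speaking", () => {
        console.log("User started speaking: pause avatar playback in the UI.");
      });
      session.on("avatar_speaking", () => {
        console.log("Avatar is responding: show the speaking indicator.");
      });
      session.on("session_closed", () => {
        console.log("Connection closed: clean up media elements.");
      });
    }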


Next Steps

If you want to learn more about configuring avatars, voices, and context, or about working with sessions and their events, continue to the detailed sections that follow.

