When to use PTT
- Users with non-linear or thoughtful conversational patterns
- Scenarios where short pauses shouldn’t signal speech completion
- Applications requiring precise control over speech boundaries
- Custom UI/UX (press-and-hold buttons, toggles, keyboard shortcuts)
How it works
User speaks
The user talks freely — pauses, corrections, and hesitations are all captured. No response is generated yet.
Setup
Set interactivity_type to PUSH_TO_TALK when creating the session token.
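The request itself is not shown here. As a minimal sketch, assuming the session token is requested with a JSON body and that interactivity_type is a top-level field of that body (both are assumptions, and the helper name is hypothetical), the setting could be applied like this:

```python
import json
from typing import Any, Dict, Optional


def build_session_token_body(other_fields: Optional[Dict[str, Any]] = None) -> str:
    """Return a JSON request body with push-to-talk enabled.

    ASSUMPTION: interactivity_type is a top-level field of the token request.
    `other_fields` stands in for whatever else your token request needs.
    """
    body: Dict[str, Any] = dict(other_fields or {})
    # The one setting this section documents.
    body["interactivity_type"] = "PUSH_TO_TALK"
    return json.dumps(body)
```

Pass the returned string as the body of whatever call you already use to create the session token; check the token endpoint's reference for the exact field placement.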
Events
Commands (you send)
| Event | Description |
|---|---|
| user.start_push_to_talk | Signal that the user is beginning to speak |
| user.stop_push_to_talk | Signal that the user has finished speaking |
Responses (you receive)
| Event | Description |
|---|---|
| user.push_to_talk_started | PTT successfully started |
| user.push_to_talk_start_failed | PTT failed to start |
| user.push_to_talk_stopped | PTT successfully stopped |
| user.push_to_talk_stop_failed | PTT failed to stop |
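The commands and responses above pair up naturally into a small state machine. The sketch below is one way a client might track PTT state; it is not the SDK's implementation. The message envelope (`{"type": ...}`), the `send` callable, and the class name are assumptions, as is treating a failed stop as "still talking".

```python
from typing import Callable, Dict

IDLE, STARTING, TALKING, STOPPING = "idle", "starting", "talking", "stopping"


class PushToTalkTracker:
    """Tracks PTT state from the commands sent and the responses received."""

    # (current state, received event) -> next state
    _TRANSITIONS = {
        (STARTING, "user.push_to_talk_started"): TALKING,
        (STARTING, "user.push_to_talk_start_failed"): IDLE,
        (STOPPING, "user.push_to_talk_stopped"): IDLE,
        # ASSUMPTION: after a failed stop, PTT is still considered active.
        (STOPPING, "user.push_to_talk_stop_failed"): TALKING,
    }

    def __init__(self, send: Callable[[Dict[str, str]], None]) -> None:
        self._send = send  # hypothetical transport, e.g. a data-channel writer
        self.state = IDLE

    def start(self) -> bool:
        """Send the start command; returns False if PTT is not idle."""
        if self.state != IDLE:
            return False
        self.state = STARTING
        self._send({"type": "user.start_push_to_talk"})
        return True

    def stop(self) -> bool:
        """Send the stop command; returns False unless PTT is confirmed active."""
        if self.state != TALKING:
            return False
        self.state = STOPPING
        self._send({"type": "user.stop_push_to_talk"})
        return True

    def on_event(self, event_type: str) -> None:
        """Apply a response event; unrelated or unexpected events are ignored."""
        self.state = self._TRANSITIONS.get((self.state, event_type), self.state)
```

In this sketch `stop()` is rejected while the start is still unconfirmed, so a real client would likely queue the stop until user.push_to_talk_started arrives.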
UI recommendations
Map PTT controls to familiar patterns:
- Press-and-hold button (walkie-talkie style)
- Toggle switch (tap to start, tap to stop)
- Keyboard shortcut (spacebar hold)
- Touch or controller input for mobile/VR
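The hold and toggle patterns differ only in which input edge triggers which command. A minimal, framework-agnostic sketch of that mapping follows; the class and method names are hypothetical, and wiring `press()` and `release()` to real button, key, or controller events is left to your UI layer.

```python
from typing import Optional

START = "user.start_push_to_talk"
STOP = "user.stop_push_to_talk"


class PttInputMapper:
    """Maps raw press/release input to PTT commands for one control."""

    def __init__(self, mode: str = "hold") -> None:
        if mode not in ("hold", "toggle"):
            raise ValueError("mode must be 'hold' or 'toggle'")
        self.mode = mode
        self.active = False    # whether PTT should currently be on
        self._pressed = False  # physical state of the control

    def press(self) -> Optional[str]:
        """Return the command to send on press, or None if nothing changes."""
        if self._pressed:
            # Ignore repeats, e.g. keyboard auto-repeat while spacebar is held.
            return None
        self._pressed = True
        if self.mode == "hold":
            self.active = True
            return START
        # Toggle mode: each distinct press flips the state.
        self.active = not self.active
        return START if self.active else STOP

    def release(self) -> Optional[str]:
        """Return the command to send on release, or None if nothing changes."""
        if not self._pressed:
            return None
        self._pressed = False
        if self.mode == "hold":
            self.active = False
            return STOP
        return None  # toggle mode ignores releases
```

Keeping the mapping separate from the transport makes it easy to offer both patterns from the same control, for example hold on desktop and toggle on touch devices.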