WebSocket

This documentation is for developers integrating directly with the ElevenLabs WebSocket API. For convenience, consider using the official SDKs provided by ElevenLabs.

The ElevenLabs Conversational AI WebSocket API enables real-time, interactive voice conversations with AI agents. Over a single WebSocket connection, you send audio input and receive audio responses as they are produced, creating lifelike conversational experiences.

Endpoint: wss://api.elevenlabs.io/v1/convai/conversation?agent_id={agent_id}

Authentication

Using Agent ID

For public agents, you can directly use the agent_id in the WebSocket URL without additional authentication:

wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<your-agent-id>
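As a minimal sketch, the URL for a public agent can be assembled with the standard library so that the agent ID is always query-escaped. The helper name build_conversation_url is hypothetical and not part of any ElevenLabs SDK.

```python
from urllib.parse import urlencode

# Base endpoint as given in this document.
CONVAI_ENDPOINT = "wss://api.elevenlabs.io/v1/convai/conversation"


def build_conversation_url(agent_id: str) -> str:
    """Return the WebSocket URL for a public agent (hypothetical helper)."""
    if not agent_id:
        raise ValueError("agent_id must be a non-empty string")
    # urlencode escapes characters that are unsafe in a query string.
    return f"{CONVAI_ENDPOINT}?{urlencode({'agent_id': agent_id})}"
```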

Using a Signed URL

For private agents or conversations requiring authorization, obtain a signed URL from your server, which securely communicates with the ElevenLabs API using your API key.

Example using cURL

Request:

curl -X GET "https://api.elevenlabs.io/v1/convai/conversation/get_signed_url?agent_id=<your-agent-id>" \
     -H "xi-api-key: <your-api-key>"

Response:

{
  "signed_url": "wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<your-agent-id>&token=<token>"
}

Never expose your ElevenLabs API key on the client side.
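The same request can be made from server-side Python. The sketch below uses only the standard library and mirrors the cURL example above. The function names are hypothetical, and the response is assumed to be the JSON object shown above with a signed_url field.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the cURL example in this document.
SIGNED_URL_ENDPOINT = (
    "https://api.elevenlabs.io/v1/convai/conversation/get_signed_url"
)


def build_signed_url_request(agent_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a signed URL."""
    query = urlencode({"agent_id": agent_id})
    return urllib.request.Request(
        f"{SIGNED_URL_ENDPOINT}?{query}",
        headers={"xi-api-key": api_key},
        method="GET",
    )


def extract_signed_url(response_body: bytes) -> str:
    """Pull the signed_url field out of the JSON response body."""
    return json.loads(response_body)["signed_url"]


def fetch_signed_url(agent_id: str, api_key: str, timeout: float = 10.0) -> str:
    """Send the request and return the signed URL. Run this on your server only."""
    request = build_signed_url_request(agent_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_signed_url(response.read())
```

Your server would call fetch_signed_url and hand only the returned URL to the client, so the API key never leaves the backend.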

Communication

Client-to-Server Messages

User Audio Chunk

Send audio data from the user to the server. Format:

{
  "user_audio_chunk": "<base64-encoded-audio-data>"
}

Notes:

- Audio Format Requirements:
  - PCM 16-bit mono format
  - Base64 encoded
  - Sample rate of 16,000 Hz
- Recommended Chunk Duration:
  - Send audio chunks approximately every 250 milliseconds (0.25 seconds)
  - This equates to chunks of about 4,000 samples (8,000 bytes of 16-bit audio before Base64 encoding) at a 16,000 Hz sample rate
- Optimizing Latency and Efficiency:
  - Balance Latency and Efficiency: Sending audio chunks every 250 milliseconds offers a good trade-off between responsiveness and network overhead.
  - Adjust Based on Needs:
    - Lower Latency Requirements: Decrease the chunk duration to send smaller chunks more frequently.
    - Higher Efficiency Requirements: Increase the chunk duration to send larger chunks less frequently.
  - Network Conditions: Adapt the chunk size if you experience network constraints or variability.
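The chunking arithmetic above can be sketched as follows. The code assumes the input is already raw 16-bit mono PCM at 16,000 Hz. The function names are illustrative, and sending the messages over the socket is left to whichever WebSocket client you use.

```python
import base64
import json
from typing import Iterator

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit PCM, mono


def chunk_size_bytes(chunk_ms: int = 250) -> int:
    """Number of raw PCM bytes in a chunk of the given duration."""
    samples = SAMPLE_RATE_HZ * chunk_ms // 1000
    return samples * BYTES_PER_SAMPLE


def audio_chunk_messages(pcm: bytes, chunk_ms: int = 250) -> Iterator[str]:
    """Yield one user_audio_chunk JSON message per chunk of PCM audio."""
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        chunk = pcm[start:start + size]  # the final chunk may be shorter
        encoded = base64.b64encode(chunk).decode("ascii")
        yield json.dumps({"user_audio_chunk": encoded})
```

Lowering chunk_ms trades more messages for lower latency, and raising it does the opposite, as described in the notes above.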

Pong Message

Respond to server ping messages by sending a pong message, ensuring the event_id matches the one received in the ping message. Format:

{
  "type": "pong",
  "event_id": 12345
}
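A minimal sketch of the reply logic, assuming the ping message has the shape documented under Message Formats (a ping_event object carrying event_id). The helper name make_pong is hypothetical.

```python
import json


def make_pong(ping_message: dict) -> str:
    """Build the pong reply for a parsed ping message."""
    if ping_message.get("type") != "ping":
        raise ValueError("expected a ping message")
    # Echo the event_id back so the server can match the pong to its ping.
    event_id = ping_message["ping_event"]["event_id"]
    return json.dumps({"type": "pong", "event_id": event_id})
```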

Conversation Initiation Client Data Override

Send a conversation_initiation_client_data message to provide the initial configuration for the conversation. The message can optionally include agent prompt overrides, a first message, a preferred language, TTS voice settings, and custom LLM parameters. Format:

{
  "type": "conversation_initiation_client_data",
  "conversation_config_override": {
    "agent": {
      "prompt": {
        "prompt": "You are a helpful AI assistant. You are cheerful and friendly."
      },
      "first_message": "Hi! How can I help you today?",
      "language": "en"
    },
    "tts": {
      "voice_id": "pNInz6obpgDQGcFmaJgB"
    }
  },
  "custom_llm_extra_body": {
    "temperature": 0.7,
    "max_tokens": 150
  }
}
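Since every override is optional, one way to build this message is to include only the fields the caller actually sets. This is a sketch: the function and parameter names are hypothetical, and it assumes the server accepts a message in which unset overrides are simply omitted.

```python
import json
from typing import Optional


def build_initiation_message(
    prompt: Optional[str] = None,
    first_message: Optional[str] = None,
    language: Optional[str] = None,
    voice_id: Optional[str] = None,
    llm_extra_body: Optional[dict] = None,
) -> str:
    """Build a conversation_initiation_client_data message as JSON."""
    agent: dict = {}
    if prompt is not None:
        agent["prompt"] = {"prompt": prompt}
    if first_message is not None:
        agent["first_message"] = first_message
    if language is not None:
        agent["language"] = language

    override: dict = {}
    if agent:
        override["agent"] = agent
    if voice_id is not None:
        override["tts"] = {"voice_id": voice_id}

    message: dict = {"type": "conversation_initiation_client_data"}
    if override:
        message["conversation_config_override"] = override
    if llm_extra_body:
        message["custom_llm_extra_body"] = llm_extra_body
    return json.dumps(message)
```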

Client Tool Result

Respond to server client_tool_call messages by sending a client_tool_result message, ensuring the tool_call_id matches the one in the received call message. The tool_call_id and result fields are strings, and is_error is a boolean. Format:

{
  "type": "client_tool_result",
  "tool_call_id": "<tool-call-id>",
  "result": "<result-string>",
  "is_error": false
}
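One way to produce this reply is to look up the requested tool in a table of local handlers and report failures through is_error. The sketch below assumes the incoming message has the client_tool_call shape shown under Message Formats. The handlers mapping and the function name are hypothetical.

```python
import json
from typing import Callable, Dict

# A handler receives the call's parameters and returns a result string.
ToolHandler = Callable[[dict], str]


def run_client_tool(call_message: dict, handlers: Dict[str, ToolHandler]) -> str:
    """Run the requested tool and build the client_tool_result message."""
    call = call_message["client_tool_call"]
    try:
        handler = handlers[call["tool_name"]]
        result, is_error = handler(call.get("parameters", {})), False
    except Exception as exc:  # unknown tool or a failure inside the handler
        result, is_error = f"{type(exc).__name__}: {exc}", True
    return json.dumps({
        "type": "client_tool_result",
        "tool_call_id": call["tool_call_id"],  # must echo the received id
        "result": result,
        "is_error": is_error,
    })
```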

Server-to-Client Messages

conversation_initiation_metadata

Provides initial metadata about the conversation. Format:

{
  "type": "conversation_initiation_metadata",
  "conversation_initiation_metadata_event": {
    "conversation_id": "conv_123456789",
    "agent_output_audio_format": "pcm_16000"
  }
}
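A client will usually read the output audio format from this message before it starts playback. The sketch below assumes format strings follow a "<codec>_<sample rate>" pattern, as the pcm_16000 example suggests. That pattern is an inference from the example rather than a documented guarantee, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_audio_format(audio_format: str) -> Tuple[str, Optional[int]]:
    """Split a string like 'pcm_16000' into ('pcm', 16000).

    Assumes a '<codec>_<rate>' layout. Returns (audio_format, None)
    when the string does not match that assumption.
    """
    codec, _, rate = audio_format.rpartition("_")
    if codec and rate.isdigit():
        return codec, int(rate)
    return audio_format, None


def read_initiation_metadata(message: dict) -> dict:
    """Extract the conversation ID and output format from the metadata message."""
    event = message["conversation_initiation_metadata_event"]
    codec, rate = parse_audio_format(event["agent_output_audio_format"])
    return {
        "conversation_id": event["conversation_id"],
        "codec": codec,
        "sample_rate_hz": rate,
    }
```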

Other Server-to-Client Messages

Type	Purpose
user_transcript	Transcriptions of the user’s speech
agent_response	Agent’s textual response
audio	Chunks of the agent’s audio response
interruption	Indicates that the agent’s response was interrupted
ping	Server pings to measure latency
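client_tool_call	Requests that the client run a tool and reply with a client_tool_result message
internal_tentative_agent_response	Tentative agent response (internal event)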

Message Formats

user_transcript:

{
  "type": "user_transcript",
  "user_transcription_event": {
    "user_transcript": "Hello, how are you today?"
  }
}

agent_response:

{
  "type": "agent_response",
  "agent_response_event": {
    "agent_response": "Hello! I'm doing well, thank you for asking. How can I assist you today?"
  }
}

audio:

{
  "type": "audio",
  "audio_event": {
    "audio_base_64": "SGVsbG8sIHRoaXMgaXMgYSBzYW1wbGUgYXVkaW8gY2h1bms=",
    "event_id": 67890
  }
}

interruption:

{
  "type": "interruption",
  "interruption_event": {
    "event_id": 54321
  }
}

internal_tentative_agent_response:

{
  "type": "internal_tentative_agent_response",
  "tentative_agent_response_internal_event": {
    "tentative_agent_response": "I'm thinking about how to respond..."
  }
}

ping:

{
  "type": "ping",
  "ping_event": {
    "event_id": 13579,
    "ping_ms": 50
  }
}
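The ping_ms value can feed a running latency estimate. The sketch below keeps an exponentially weighted moving average. It assumes ping_ms is a latency measurement in milliseconds, as the field name suggests, and that the field may occasionally be missing. The class name and the smoothing factor are illustrative choices.

```python
from typing import Optional


class LatencyTracker:
    """Exponentially weighted moving average of ping_ms samples."""

    def __init__(self, alpha: float = 0.2) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.smoothed_ms: Optional[float] = None

    def update(self, ping_message: dict) -> Optional[float]:
        """Fold one ping message into the estimate and return it."""
        sample = ping_message.get("ping_event", {}).get("ping_ms")
        if sample is None:
            return self.smoothed_ms  # nothing to learn from this ping
        if self.smoothed_ms is None:
            self.smoothed_ms = float(sample)
        else:
            self.smoothed_ms = (
                self.alpha * float(sample) + (1.0 - self.alpha) * self.smoothed_ms
            )
        return self.smoothed_ms
```

A higher alpha reacts faster to changing network conditions, while a lower alpha gives a steadier estimate for sizing playback buffers.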

client_tool_call:

{
  "type": "client_tool_call",
  "client_tool_call": {
    "tool_name": "<tool-name>",
    "tool_call_id": "<tool-call-id>",
    "parameters": {
      "<parameter-name>": "<parameter-value>"
    }
  }
}
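Because every server message carries a type field and nests its payload under a type-specific event key, a client can route messages with a small lookup table. The sketch below covers the payload-carrying types documented above and decodes audio from Base64. The table and function names are hypothetical. Ping and client_tool_call messages are returned whole, since they require a reply.

```python
import base64
import json
from typing import Any, Tuple

# Message type -> (event key, payload field), taken from the formats above.
_PAYLOAD_FIELDS = {
    "user_transcript": ("user_transcription_event", "user_transcript"),
    "agent_response": ("agent_response_event", "agent_response"),
    "audio": ("audio_event", "audio_base_64"),
    "interruption": ("interruption_event", "event_id"),
    "internal_tentative_agent_response": (
        "tentative_agent_response_internal_event",
        "tentative_agent_response",
    ),
}


def extract_payload(raw: str) -> Tuple[Any, Any]:
    """Parse one server message and return (type, payload)."""
    message = json.loads(raw)
    msg_type = message.get("type")
    if msg_type not in _PAYLOAD_FIELDS:
        # ping, client_tool_call, metadata, or an unknown type:
        # hand the whole message to the caller.
        return msg_type, message
    event_key, field = _PAYLOAD_FIELDS[msg_type]
    value = message[event_key][field]
    if msg_type == "audio":
        value = base64.b64decode(value)  # raw audio bytes for playback
    return msg_type, value
```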

Latency Management

To ensure smooth conversations, implement these strategies:

Adaptive Buffering: Adjust audio buffering based on network conditions.
Jitter Buffer: Implement a jitter buffer to smooth out variations in packet arrival times.
Ping-Pong Monitoring: Use ping and pong events to measure round-trip time and adjust accordingly.
Optimized Chunking: Tweak the audio chunk duration to balance latency and efficiency.

Security Best Practices

Rotate API keys regularly and use environment variables to store them.
Implement rate limiting to prevent abuse.
When prompting users for microphone access, clearly explain why it is needed.
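For the environment-variable practice, a server-side process might load the key as sketched below. The variable name ELEVENLABS_API_KEY is an assumption chosen for illustration. Use whatever name your deployment defines.

```python
import os


def load_api_key(env_var: str = "ELEVENLABS_API_KEY") -> str:
    """Read the API key from the environment, failing fast if it is missing.

    The default variable name is an assumption, not an official convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key
```

Failing at startup when the key is missing avoids discovering the problem on the first user request, and keeping the key out of source code makes rotation a configuration change.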
