The ElevenLabs API can stream responses back to a client, so that partial results are available before certain requests have completed. To achieve this, we follow the Server-sent events standard. Our official Node and Python libraries include helpers that make parsing these events simpler.
Streaming is supported for the Text to Speech API, the Voice Changer API, and the Audio Isolation API. This section focuses on how streaming works for requests made to the Text to Speech API.
In Python, a streaming request looks like:
from elevenlabs import stream
from elevenlabs.client import ElevenLabs

client = ElevenLabs()

audio_stream = client.text_to_speech.convert_as_stream(
    text="This is a test",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2"
)

# The stream can only be consumed once: use option 1 or option 2, not both.

# option 1: play the streamed audio locally
stream(audio_stream)

# option 2: process the audio bytes manually
for chunk in audio_stream:
    if isinstance(chunk, bytes):
        print(chunk)
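As a further illustration of option 2, the chunks can be written to disk as they arrive instead of being printed. The following is a minimal sketch, not part of the official library: `save_stream` is a hypothetical helper name, and it assumes only that the stream is an iterable yielding `bytes` objects, as `audio_stream` does in the example above.

```python
from pathlib import Path
from typing import Iterable, Union


def save_stream(chunks: Iterable[bytes], path: Union[str, Path]) -> int:
    """Write streamed audio chunks to `path` as they arrive.

    Hypothetical helper (not provided by the elevenlabs package).
    Returns the total number of bytes written.
    """
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            # Same guard as in the example above: keep only non-empty raw bytes.
            if isinstance(chunk, bytes) and chunk:
                f.write(chunk)
                total += len(chunk)
    return total
```

It could then be called as `save_stream(audio_stream, "output.mp3")`, where the file name and extension are assumptions that depend on the output format requested. Because the helper consumes the stream, it replaces options 1 and 2 rather than running alongside them.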
In Node / TypeScript, a streaming request looks like:
import { ElevenLabsClient, stream } from "elevenlabs";
import { Readable } from "stream";

const client = new ElevenLabsClient();

async function main() {
  const audioStream = await client.textToSpeech.convertAsStream(
    "JBFqnCBsd6RMkjVDRZzb",
    {
      text: "This is a test",
      model_id: "eleven_multilingual_v2",
    }
  );

  // The stream can only be consumed once: use option 1 or option 2, not both.

  // option 1: play the streamed audio locally
  await stream(Readable.from(audioStream));

  // option 2: process the audio manually
  for await (const chunk of audioStream) {
    console.log(chunk);
  }
}

main();