How to convert text to speech via websocket and save to mp3
Install `elevenlabs-latency` via npm and follow the instructions here.
Create a `.env` file in your project directory and fill it with your credentials like so:
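A minimal sketch of the `.env` contents, assuming the script reads an `ELEVENLABS_API_KEY` variable and a `VOICE_ID` variable; adjust the names to whatever your script expects:

```
ELEVENLABS_API_KEY=your_elevenlabs_api_key
VOICE_ID=your_voice_id
```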
Create a new file called `text-to-speech-websocket.py` for Python or `text-to-speech-websocket.ts` for TypeScript.
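Below is a minimal Python sketch of the websocket flow, assuming the `websockets` and `python-dotenv` packages and the `stream-input` endpoint; treat the exact URL, model id, and message fields as assumptions to verify against the API reference:

```python
# text-to-speech-websocket.py (minimal sketch, not the full tutorial script)
import asyncio
import base64
import json
import os

import websockets
from dotenv import load_dotenv

load_dotenv()
API_KEY = os.getenv("ELEVENLABS_API_KEY")  # assumed variable names, see .env above
VOICE_ID = os.getenv("VOICE_ID")

# Assumed endpoint and model id; check the API reference for current values.
URI = f"wss://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}/stream-input?model_id=eleven_turbo_v2"

TEXT = "The first move is what sets everything in motion. "


async def text_to_speech() -> None:
    os.makedirs("output", exist_ok=True)
    async with websockets.connect(URI) as ws:
        # Initial message: a single space, voice settings, and the API key.
        await ws.send(json.dumps({
            "text": " ",
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.8},
            "xi_api_key": API_KEY,
        }))
        # Stream the text to convert.
        await ws.send(json.dumps({"text": TEXT}))
        # An empty string tells the server we are done; it also forces
        # generation of any buffered text.
        await ws.send(json.dumps({"text": ""}))

        # Each response may carry a base64-encoded MP3 chunk; append them to a file.
        with open("output/test.mp3", "wb") as f:
            async for message in ws:
                data = json.loads(message)
                if data.get("audio"):
                    f.write(base64.b64decode(data["audio"]))
                if data.get("isFinal"):
                    break


if __name__ == "__main__":
    asyncio.run(text_to_speech())
```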
The generated audio will be saved as an MP3 file in the `output` directory.
You can send `flush=true` to clear out the buffer and force the generation of any buffered text. This can be useful, for example, when you have reached the end of a document and want to generate audio for the final section. In addition, closing the websocket will automatically force the generation of any buffered text.
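As a sketch, assuming the message format shown earlier, a flush can be sent together with the final piece of text:

```python
import json

# Sketch: send the last piece of text together with flush=true so the server
# generates audio for it immediately instead of waiting for more input.
async def send_final_text(ws, text: str) -> None:
    await ws.send(json.dumps({"text": text, "flush": True}))
```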
How much text is buffered before audio is generated is controlled by `chunk_length_schedule` in `generation_config`. Avoid using `try_trigger_generation`, as it is deprecated.
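As a sketch, the schedule is set in the initial websocket message; the values below are illustrative, not a recommendation:

```python
import json

# Sketch: configure buffering thresholds in the initial message.
# Smaller values mean earlier, shorter generations (lower latency);
# larger values favor quality.
initial_message = {
    "text": " ",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.8},
    "generation_config": {"chunk_length_schedule": [120, 160, 250, 290]},
    "xi_api_key": "YOUR_API_KEY",
}
# await ws.send(json.dumps(initial_message))
```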
Send `flush=true` along with the text at the end of a conversation turn to ensure timely audio generation.

You can also reduce latency by lowering the values in `chunk_length_schedule`. However, be mindful that reducing latency through this adjustment may come at the expense of quality.

Instead of adjusting `chunk_length_schedule`, you can use `flush=true` to clear out the buffer and force the generation of any buffered text.
To keep the connection open during periods of inactivity, you can send a single space character, " ". Please note that this string must include a space, as sending a fully empty string, "", will close the websocket.
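A sketch of a keep-alive loop, assuming it runs as a background task alongside your own sending logic (the interval is arbitrary):

```python
import asyncio
import json

# Sketch: periodically send a single space so the connection is not closed
# for inactivity. An empty string would instead end the stream.
async def keep_alive(ws, interval_seconds: float = 15.0) -> None:
    while True:
        await asyncio.sleep(interval_seconds)
        await ws.send(json.dumps({"text": " "}))

# Usage: task = asyncio.create_task(keep_alive(ws))
```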
You can use `alignment` to get the word-level timestamps for each word in the text. This can be useful for aligning the audio with the text in a video or for other applications that require precise timing.
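As a rough sketch, the `alignment` field can be collected from each incoming message alongside the audio; the exact layout of the timing data is not shown here and should be checked against the API reference:

```python
import base64
import json

# Sketch: collect audio chunks and alignment objects from incoming messages.
# Inspect the alignment structure (character/word timings) before relying on it.
async def collect_audio_and_alignment(ws):
    audio = bytearray()
    alignments = []
    async for message in ws:
        data = json.loads(message)
        if data.get("audio"):
            audio.extend(base64.b64decode(data["audio"]))
        if data.get("alignment"):
            alignments.append(data["alignment"])
        if data.get("isFinal"):
            break
    return bytes(audio), alignments
```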