Usage
Quick Start
Here's a basic usage example:
from RealtimeTTS import TextToAudioStream, SystemEngine, AzureEngine, ElevenlabsEngine
engine = SystemEngine() # replace with your TTS engine
stream = TextToAudioStream(engine)
stream.feed("Hello world! How are you today?")
stream.play_async()
Feed Text
You can feed individual strings:
stream.feed("Hello, this is a sentence.")
Or you can feed generators and character iterators for real-time streaming:
def write(prompt: str):
for chunk in openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content" : prompt}],
stream=True
):
if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
yield text_chunk
text_stream = write("A three-sentence relaxing speech.")
stream.feed(text_stream)
char_iterator = iter("Streaming this character by character.")
stream.feed(char_iterator)
Playback
Asynchronously:
stream.play_async()
while stream.is_playing():
time.sleep(0.1)
Synchronously:
stream.play()
Testing the Library
The test subdirectory contains a set of scripts to help you evaluate and understand the capabilities of the RealtimeTTS library.
Note that most of the tests still rely on the "old" OpenAI API (<1.0.0). Usage of the new OpenAI API is demonstrated in openai_1.0_test.py.
-
simple_test.py
- Description: A "hello world" styled demonstration of the library's simplest usage.
-
complex_test.py
- Description: A comprehensive demonstration showcasing most of the features provided by the library.
-
coqui_test.py
- Description: Test of local coqui TTS engine.
-
translator.py
- Dependencies: Run
pip install openai realtimestt
. - Description: Real-time translations into six different languages.
- Dependencies: Run
-
openai_voice_interface.py
- Dependencies: Run
pip install openai realtimestt
. - Description: Wake word activated and voice based user interface to the OpenAI API.
- Dependencies: Run
-
advanced_talk.py
- Dependencies: Run
pip install openai keyboard realtimestt
. - Description: Choose TTS engine and voice before starting AI conversation.
- Dependencies: Run
-
minimalistic_talkbot.py
- Dependencies: Run
pip install openai realtimestt
. - Description: A basic talkbot in 20 lines of code.
- Dependencies: Run
-
simple_llm_test.py
- Dependencies: Run
pip install openai
. - Description: Simple demonstration of how to integrate the library with large language models (LLMs).
- Dependencies: Run
-
test_callbacks.py
- Dependencies: Run
pip install openai
. - Description: Showcases the callbacks and lets you check the latency times in a real-world application environment.
- Dependencies: Run
Pause, Resume & Stop
Pause the audio stream:
stream.pause()
Resume a paused stream:
stream.resume()
Stop the stream immediately:
stream.stop()
Requirements Explained
- Python Version:
- Required: Python >= 3.9, < 3.13
-
Reason: The library depends on the GitHub library "TTS" from coqui, which requires Python versions in this range.
-
PyAudio: to create an output audio stream
-
stream2sentence: to split the incoming text stream into sentences
-
pyttsx3: System text-to-speech conversion engine
-
pydub: to convert audio chunk formats
-
azure-cognitiveservices-speech: Azure text-to-speech conversion engine
-
elevenlabs: Elevenlabs text-to-speech conversion engine
-
coqui-TTS: Coqui's XTTS text-to-speech library for high-quality local neural TTS
Shoutout to Idiap Research Institute for maintaining a fork of coqui tts.
-
openai: to interact with OpenAI's TTS API
-
gtts: Google translate text-to-speech conversion