
RealtimeSTT: Real-time Speech-to-Text Tool for Low-Latency Streaming Speech Recognition Based on Whisper

2025-01-18


RealtimeSTT is a low-latency, real-time speech-to-text library with built-in voice activity detection and wake word activation. It was developed by Kolja Beigel for applications that need fast, accurate transcription, from voice assistants to dictation-style tools.


Features

Real-time speech to text: Transcribes speech to text as it is spoken, for a variety of application scenarios.
Voice activity detection: Automatically detects when a user starts and stops speaking, so only actual speech is transcribed.
Wake word activation: Users can activate the system by saying a specific word.
Low latency: Keeps the delay between speech and text short to improve the user experience.
Multi-platform support: Compatible with multiple operating systems and platforms for easy integration.
Open source: The complete source code is available for developers to extend and customize.

Usage Guide

Installation

Clone the project repository:

   git clone https://github.com/KoljaB/RealtimeSTT.git

Change into the project directory:

   cd RealtimeSTT

Install the dependencies:

   pip install -r requirements.txt

(Optional) Install GPU support:

   pip install -r requirements-gpu.txt
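
Before moving on, it can help to confirm that the packages are visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of the project: the helper name is_installed is hypothetical, and it only checks that a module can be located, not that audio devices or models work.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the modules used in this article (names taken from its examples).
for name in ("RealtimeSTT", "pyautogui"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a module is reported as missing, the usual cause is that pip installed into a different environment than the one running the script.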

Usage

Start the server

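Start the speech-to-text server. The stt-server command is a console script that comes with the installed RealtimeSTT package, so if it is not found after installing only the requirements from a clone, try pip install RealtimeSTT first: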

   stt-server

After the server starts, wait for the prompt "speak now".

Client Usage

Start the client and connect to the server:

   stt

Once the client is launched, start talking and the system will transcribe the speech to text in real time.

Main feature workflow

Real-time speech to text

Import the AudioToTextRecorder class:

   from RealtimeSTT import AudioToTextRecorder

Define a function that processes the transcribed text:

   def process_text(text):
       # Called with each completed transcription
       print(text)

Start recording and process the text:

   if __name__ == '__main__':
       print("Wait until it says 'speak now'")
       recorder = AudioToTextRecorder()
       # Transcribe one utterance after another until interrupted
       while True:
           recorder.text(process_text)

Voice Activity Detection

The system automatically detects when the user starts and stops talking, with no additional configuration required.
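
To give a feel for what "detecting when the user starts and stops talking" means, here is a minimal energy-threshold sketch. It is an illustration only and is not how RealtimeSTT does it internally; the function names, the threshold of 0.02, and the hangover of 2 frames are all assumptions chosen for the example.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame (samples assumed in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_segments(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.02,  # assumed energy level that counts as speech
    hangover: int = 2,        # assumed number of quiet frames that ends a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of speech segments, end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current segment
    quiet = 0     # consecutive quiet frames seen inside the current segment
    count = 0     # total number of frames consumed
    for i, frame in enumerate(frames):
        count = i + 1
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:
                # The segment ended at the first of the trailing quiet frames.
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # The stream ended while a segment was still open.
        segments.append((start, count - quiet))
    return segments
```

The hangover keeps short pauses between words from splitting one utterance into several segments, which is the same trade-off any voice activity detector has to make between responsiveness and stability.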

Wake word activation

With a wake word configured, the system stays idle until the user says a specific word and only then starts transcribing. Refer to the project documentation for the exact configuration options.
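
As a rough sketch of what such a configuration can look like, the example below assumes the wake_words constructor parameter described in the project README, which takes a comma-separated string of supported wake words. The helpers join_wake_words and make_wake_word_recorder are hypothetical names introduced here; check the README for the list of supported words and backends before relying on this.

```python
from typing import Iterable


def join_wake_words(words: Iterable[str]) -> str:
    """Build a comma-separated, lowercase wake word string, skipping blanks."""
    cleaned = [w.strip().lower() for w in words if w.strip()]
    return ",".join(cleaned)


def make_wake_word_recorder(words: Iterable[str]):
    """Create a recorder that waits for one of the given wake words.

    Assumption: AudioToTextRecorder accepts a `wake_words` string argument.
    """
    # Imported lazily so this module still loads where RealtimeSTT is absent.
    from RealtimeSTT import AudioToTextRecorder

    return AudioToTextRecorder(wake_words=join_wake_words(words))


def main() -> None:
    recorder = make_wake_word_recorder(["jarvis"])
    print('Say "Jarvis", then speak.')
    # Blocks until the wake word and a following utterance have been heard.
    print(recorder.text())
```

Call main() from inside an if __name__ == '__main__': block, as in the other examples in this article.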

Detailed examples

Typing everything that is said

Import AudioToTextRecorder and pyautogui:

   from RealtimeSTT import AudioToTextRecorder
   import pyautogui

Define a function that types the transcribed text:

   def process_text(text):
       # Type the text into whichever window currently has focus
       pyautogui.typewrite(text + " ")

Start recording and process the text:

   if __name__ == '__main__':
       print("Wait until it says 'speak now'")
       recorder = AudioToTextRecorder()
       # Keep typing each utterance until interrupted
       while True:
           recorder.text(process_text)
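
Transcripts can carry stray leading, trailing, or repeated whitespace, which becomes visible once the text is typed into another application. A small clean-up step before typing is one way to handle that. This is a sketch under that assumption, and clean_for_typing is a hypothetical helper, not part of RealtimeSTT or pyautogui.

```python
import re


def clean_for_typing(text: str) -> str:
    """Collapse whitespace runs, trim the ends, and add one trailing space.

    Returns an empty string when there is nothing left to type.
    """
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed + " " if collapsed else ""
```

In the example above, process_text would then call pyautogui.typewrite(clean_for_typing(text)) instead of appending the space itself.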


May not be reproduced without permission: Chief AI Sharing Circle
