Current Position:fig. beginning " AI Tool Library

Ichigo (llama3-s): local real-time voice AI assistant, open source version of Siri

2024-11-12

1.2 K

General Introduction

Ichigo is an open source, real-time speech AI project that aims to extend text-based language models with native "listening" capabilities. The project uses early fusion techniques inspired by Meta's Chameleon paper.Ichigo aims to be an open-source data, open-weighted, native-device voice assistant, similar to Siri.The project is open for partners to join in the crowdsourcing of speech datasets.

Function List

Real-time speech recognition: The ability to process and understand user voice input in real time.
multicast dialogue capability: Supports multiple rounds of dialog and is able to maintain context in a conversation.
noise management: The ability to refuse to process non-speech audio inputs through training improves the user experience.
Open source and scalable: The project code and model weights are completely open source and users are free to download and extend them.
local deployment: Supports deployment on local devices to protect user privacy.

Using Help

Installation process

environmental preparation ::
- Ensure that Python 3.8 or above is installed.
- Install the necessary dependency libraries:pip install -r requirements.txtThe

Download model ::

Use the following command to download the Ichigo model:

git clone https://github.com/homebrewltd/ichigo.git
cd ichigo
pip install -e .

Configuring the dataset ::
- Download the required dataset from HuggingFace and set the dataset path in the configuration file.
Launch Demo ::
- Start the local Gradio Demo with the following command:
```
python demo.py --use-4bit --use-8bit
```

Usage Process

Starting services ::
- After running the above command, visit the locally provided URL to access Ichigo's Web UI interface.
voice input ::
- In the Web UI interface, click the microphone icon to start recording, and the system will process and display the speech recognition results in real time.
many rounds of dialogue ::
- The system supports multiple rounds of dialog, where the user can continuously input speech and the system will maintain the context to understand and respond.
noise management ::
- The system is trained to recognize and reject the processing of non-speech audio inputs to ensure the accuracy of the recognition results.
Custom extensions ::
- Users can modify the code and model as needed to add new features or improve existing ones.

Detailed Operation Procedure

Download and Installation ::
- Visit Ichigo's GitHub page and follow the installation process to download and install the necessary dependencies and models.
Configuration and startup ::
- According to the configuration file provided by the project, set the dataset path and model parameters to start the local service.
Using the Web UI ::
- Experience Ichigo's real-time speech recognition and multi-round dialog features by performing voice input and interaction through the Web UI interface.
Extension and customization ::
- Understand the architecture and workings of the system based on project documentation and code comments for custom extensions.

AI open source project Multimodal real-time interactive products

May not be reproduced without permission:AI productivity tools " Ichigo (llama3-s): local real-time voice AI assistant, open source version of Siri

Ichigo (llama3-s): local real-time voice AI assistant, open source version of Siri

General Introduction

Function List

Using Help

Installation process

Usage Process

Detailed Operation Procedure

Related articles

Recommended

Can't find AI tools? Try here!

testimonials

newest

Ichigo (llama3-s): local real-time voice AI assistant, open source version of Siri

General Introduction

Function List

Using Help

Installation process

Usage Process

Detailed Operation Procedure

Related articles

Recommended

Can't find AI tools? Try here!

testimonials

newest

Quick query station AI tool