AI voice cloning

Conch Speech (MiniMax Audio): AI tool for generating natural speech
MiniMax Audio is an AI speech generation tool from MiniMax whose core feature is quickly converting text into natural speech. It is based on the Speech-02 model, with a speech synthesis similarity of up to 99%, studio-grade sound quality, and support for more than 30 languages and a wide range of mouth...
04-08 1.0 K1kudos
MegaTTS3: A Lightweight Model for Synthesizing Chinese and English Speech
MegaTTS3 is an open source speech synthesis tool developed by ByteDance in cooperation with Zhejiang University, focusing on generating high-quality Chinese and English speech. Its core model has only 0.45B parameters, making it lightweight and efficient, and it supports mixed Chinese-English speech generation and voice cloning. The project is hosted on GitHub, providing code and...
03-29 9100kudos
Seed-VC: real-time voice and singing voice conversion from few samples
Seed-VC is an open source project on GitHub, developed by Plachtaa. Given 1 to 30 seconds of reference audio, it can quickly convert speech or singing voice without additional training. The project supports real-time voice conversion with latency as low as 400 milliseconds, suitable for online meetings, games ...
03-19 7480kudos
CSM Voice Cloning: Fast Voice Cloning with CSM-1B
CSM Voice Cloning is an open source project developed by Isaiah Bjork and hosted on GitHub. It is based on the Sesame CSM-1B model, which allows users to clone their own voice and generate speech with their own personal characteristics by simply providing an audio sample. This tool supports this ...
03-18 7450kudos
PlayHT: an AI tool for generating hyper-realistic speech
PlayHT is an efficient online platform focusing on AI speech generation, helping users quickly convert text into natural, realistic speech. It provides more than 600 AI voices, supports more than 60 languages and diverse accents, and is suitable for a variety of scenarios such as podcast production, educational content, marketing and promotion. Users only need to input...
03-04 8930kudos
Spark-TTS: A Text-to-Speech Tool for Generating Natural Speech
Spark-TTS is an open source Text-to-Speech (TTS) tool developed by the SparkAudio team, hosted on GitHub, designed to help users efficiently convert text into natural and smooth speech. It is based on advanced deep learning technology and supports multiple languages and voice styles...
03-03 1.0 K0kudos
Step-Audio
Step-Audio is an open source intelligent speech interaction framework designed to provide out-of-the-box speech understanding and generation capabilities for production environments. The framework supports multilingual dialog (e.g., Chinese, English, Japanese), emotional speech (e.g., happy, sad), regional dialects (e.g., Cantonese, Sichuanese), adjustable speech rate...
02-19 1.2 K0kudos
Zonos: High Quality Speech Synthesis and Voice Cloning Tool
Zonos is an open source speech synthesis and voice cloning tool developed by Zyphra. The Zonos-v0.1 release employs an advanced Transformer and hybrid model to generate high-quality speech output. The tool supports multiple languages, including English, Japanese, Chinese, French and German, and provides fine...
02-12 1.5 K0kudos
Weights: a voice-imitation cover song and text-to-speech authoring platform
Weights is a social platform that uses AI for creation, allowing users to create voice covers, text-to-speech, images, music, and videos with simple actions. The platform provides a wealth of tools and templates to help users get started quickly and share their work with the community. Weights ...
01-30 1.3 K0kudos
AnyVoice: free online voice cloning from just 3 seconds of audio
AnyVoice is an advanced AI speech generation platform that provides ultra-realistic speech generation and voice cloning services. The platform allows users to convert text into natural speech and choose from hundreds of preset voices. If you can't find the right voice, just 3 seconds of recording can be free gr...
01-30 1.4 K0kudos
Llasa 1~8B: an open source text-to-speech model for high quality speech generation and cloning
Llasa-3B is an open source text-to-speech (TTS) model developed by the Audio Lab of the Hong Kong University of Science and Technology (HKUST Audio). The model is based on the Llama 3.2 3B architecture, which has been carefully tuned to provide high-quality speech generation that not only supports multiple languages, but also enables emotional expression and personalized speech g...
01-27 1.4 K0kudos
Fish Agent
Fish Agent, a derivative project of Fish Speech, is an end-to-end AI voice cloning system built on the V0.1 3B model architecture. As a fully end-to-end system, its most important feature is a semantic-token-free architecture design, which does not need to rely on the traditional language such as Whisper .....
01-03 1.6 K0kudos
ViiTor AI: Audio/Video Multilingual Translation Synthesis and Speech Cloning Service
ViiTor AI is an artificial intelligence platform focused on providing high-quality video translation, voice cloning, AI-generated avatar videos, and speech synthesis services. The platform supports multiple languages and is designed to help users easily create multilingual content. ViiTor AI's video translation feature can...
12-26 1.5 K0kudos
Voicemod: real-time voice changer, voice chat, game voice change
Voicemod is a leading real-time voice changer and sound effects software for Windows and macOS. Whether you are role-playing in a game, chatting with friends, or live-streaming, Voicemod provides you with a rich variety of voice changing effects. With AI technology, Voicemod is able to real-time...
11-30 1.6 K0kudos
Amphion MaskGCT: Zero-shot text-to-speech cloning model (local one-click deployment package)
MaskGCT (Masked Generative Codec Transformer) is a fully non-autoregressive Text-to-Speech (TTS) model jointly introduced by Quwan Technology and The Chinese University of Hong Kong. The model does not require explicit text-to-speech alignment information, and adopts a two-stage generation approach, first through text pre...
10-29 1.9 K0kudos
Quwan Qianyin
Quwan Qianyin is a multilingual AI voice synthesis platform that provides realistic and natural voice generation solutions. Users can easily convert text content into professional-grade audio and create custom AI voices (voice clones) zero-shot to meet personalized needs. The platform also provides a video translation function to help users realize...
10-29 1.4 K0kudos
CosyVoice: open source 3-second rapid voice cloning project from Alibaba, with support for emotion control tags
CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities from inference and training to deployment. Developed by the FunAudioLLM team, it aims to achieve high quality speech synthesis through advanced autoregressive transformers and ODE-based diffusion models. CosyVoice not only supports multilingual...
10-24 2.5 K0kudos
Conch AI video generator: generate high-quality, film- and TV-grade video from text or images
Conch AI Video Generator is an advanced AI video generation tool developed by MiniMax. Users only need to provide a simple text description or upload images, and Conch AI can quickly generate high-quality video content. The tool is widely used by creators, marketers and storytellers to help them bring...
10-17 1.7 K0kudos
Coqui TTS (xTTS): Deep Learning Toolkit for Text-to-Speech Generation with Multiple Language Support and Voice Cloning Capabilities
Coqui TTS is an open source advanced text-to-speech (TTS) generation toolkit based on deep learning techniques. It has been battle-tested in both research and production environments, and provides a rich set of features and models that support text-to-speech conversion in multiple languages. Coqui TTS not only supports pre-trained models...
10-17 1.7 K0kudos