#计算机科学#Real-Time-Voice-Cloning 是一个基于深度学习的语音合成工具,5秒内即可克隆一个声音。
#大语言模型#Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team | Netflix级字幕切割、翻译、对齐、甚至加上配音,一键全自动视频搬运AI字幕组
PaddleSpeech 是基于飞桨 PaddlePaddle 的语音方向的开源模型库,用于语音和音频中的各种关键任务的开发,典型的应用包括:语音识别、语音翻译、语音合成等
Generate audiobooks from e-books, voice cloning & 1107+ languages!
#计算机科学#YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isola...
A simple, high-quality voice conversion tool focused on ease of use and performance.
#计算机科学#A Python/Pytorch app for easily synthesising human voices
A webui for different audio related Neural Networks
Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
The code for the bark-voicecloning model. Training and inference.
singing voice change based on whisper, and lora for singing voice clone
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
#计算机科学#Voice Conversion by CycleGAN (语音克隆/语音转换): CycleGAN-VC2
#大语言模型#Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RV...
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale te...