voice-cloning · GitHub Topics

CorentinJ / Real-Time-Voice-Cloning

#Computer Science# Real-Time-Voice-Cloning is a deep-learning-based speech synthesis tool that can clone a voice in 5 seconds.

deep-learning PyTorch Tensorflow tts voice-cloning Python

Python 54.92 k

3 months ago

RVC-Boss / GPT-SoVITS

A powerful few-shot voice conversion and text-to-speech web UI.

text-to-speech tts vits voice-clone voice-cloneai voice-cloning

Python 50.46 k

1 month ago

coqui-ai / TTS

#Computer Science# 🐸💬 - a deep learning library for text-to-speech synthesis

Python text-to-speech deep-learning speech PyTorch tts vocoder tacotron glow-tts melgan speaker-encoder hifigan speaker-encodings multi-speaker-tts tts-model speech-synthesis voice-cloning voice-synthesis voice-conversion

Python 42.32 k

1 year ago

FunAudioLLM / CosyVoice

#Large Language Models# Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

audio-generation gpt-4o text-to-speech tts cantonese chatbot ChatGPT chinese english fine-grained fine-tuning japanese korean multi-lingual natural-language-generation Python cosyvoice cross-lingual voice-cloning

Python 16.03 k

9 days ago

Huanshere / VideoLingo

Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team

ai-translation dubbing Localization (l10n) video-translation voice-cloning

Python 14.88 k

3 months ago

PaddlePaddle / PaddleSpeech

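PaddleSpeech is an open-source speech model library built on PaddlePaddle for developing a wide range of key speech and audio tasks; typical applications include speech recognition, speech translation, and speech synthesis.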

transformer conformer speech-translation streaming-asr speech-alignment punctuation-restoration streaming-tts speech-synthesis tts asr speech-recognition voice-cloning vocoder voice-recognition self-supervised-learning Whisper

Python 12.2 k

15 days ago

DrewThomasson / ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1107+ languages!

audiobooks Docker epub Linux macOS tts Windows xtts voice-cloning gradio chinese english multilingual colab-notebook kaggle audiobook

Python 11.16 k

8 days ago

multimodal-art-projection / YuE

#Computer Science# YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

foundation-models music-generation huggingface llama audio-generation voice-cloning large-language-models artificial-intelligence deep-learning gpt

Python 5.44 k

3 months ago

abus-aikorea / voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isola...

faster-whisper tts Whisper gradio subtitles transcription translator webui speech-recognition speech-synthesis speech-to-text text-to-speech yt-dlp voice-cloning podcasts audiobook voice-conversion karaoke whisperx

Python 4.52 k

1 month ago

IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

rvc vc vits voice artificial-intelligence voice-cloning voice-conversion applio voice-clone PyTorch speech text-to-speech tts

Python 2.55 k

2 days ago

voice-cloning-app / Voice-Cloning-App

#Computer Science# A Python/PyTorch app for easily synthesising human voices

Python tts text-to-speech PyTorch deep-learning voice-cloning tacotron2

Python 1.45 k

9 months ago

gitmylo / audio-webui

A webui for different audio related Neural Networks

artificial-intelligence audioldm bark rvc text-to-audio text-to-speech voice-cloning audiocraft music generative-music tts aio all-in-one

Python 1.2 k

3 months ago

MiniMax-AI / MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

image-generation mcp mcp-server mcp-tools text-to-speech video-generation image-to-video text-to-image text-to-video voice-cloning

Python 923

2 months ago

Tomiinek / Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

text-to-speech speech-synthesis multilingual tts voice-cloning

Python 838

2 years ago

gitmylo / bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.

artificial-intelligence neural-networks text-to-speech voice-cloning voice-conversion

Python 704

2 years ago

PlayVoice / lora-svc

Singing voice conversion based on Whisper, with LoRA for singing voice cloning

singing-voice-conversion voice-conversion vits voice-cloning Whisper lora

Python 643

2 years ago

PaddlePaddle / Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

text-to-speech speech-synthesis tacotron2 fastspeech2 multi-speaker-tts voice-cloning

Python 616

4 years ago

jackaduma / CycleGAN-VC2

#Computer Science# Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2

voice-conversion cyclegan Generative Adversarial Network deep-learning voice-cloning pytorch-implementation speech-synthesis pix2pix aigc

Python 560

2 years ago

lukaszliniewicz / Pandrator

#Large Language Models# Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RV...

audiobook audiobooks text-processing text-to-speech large-language-models rvc tkinter-gui xtts voice-cloning dubbing voice-clone

Python 488

4 months ago

devnen / Chatterbox-TTS-Server

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale te...

artificial-intelligence API audio-generation CUDA FastAPI huggingface openai-api Python PyTorch speech-synthesis text-to-speech tts tts-api voice-cloning web-ui rocm

Python 470

1 month ago