#大语言模型#Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
#大语言模型#ChatTTS是专门为对话场景设计的文本转语音模型,例如LLM助手对话任务。它支持英文和中文两种语言
#计算机科学#🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time
Instant voice cloning by MIT and MyShell. Audio foundation model.
A TTS model capable of generating ultra-realistic dialogue in one pass.
🧠 Leon is your open-source personal assistant.
#大语言模型#Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Translate the video from one language to another and add dubbing. 将视频从一种语言翻译为另一种语言,同时支持语音识别转录、语音合成、字幕翻译。
#计算机科学#:robot: 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
#计算机科学#EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
微软VALL-E X 零样本语音合成模型的开源实现
#计算机科学#VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
#安卓#Sherpa-ONNX 是一个轻量级语音识别框架, 基于 Kaldi 和 onnxruntime,无需联网即可实现语音转文本、文本转语音、说话人分离以及语音活动检测(VAD)。支持嵌入式系统、安卓、iOS、鸿蒙系统、树莓派、RISC-V、x86_64 服务器、WebSocket 服务器 / 客户端,以及 C/C++、Python、Kotlin、C#、Go、NodeJS、Java、Swift、Dart、JavaScript、Flutter、Object Pascal、Lazarus、Rust 等编程语言。
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.