🧠 Leon is your open-source personal assistant.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
PaddleSpeech 是基于飞桨 PaddlePaddle 的语音方向的开源模型库,用于语音和音频中的各种关键任务的开发,典型的应用包括:语音识别、语音翻译、语音合成等
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
#计算机科学#基于 so-vits-svc4.0(V1)的一个分支,支持实时推理和图形化推理界面,且兼容其模型。
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
#计算机科学#EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
#计算机科学#VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
#计算机科学#StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
#安卓#eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
An Open Source text-to-speech system built by inverting Whisper.
#计算机科学#Foundational model for human-like, expressive TTS
#计算机科学#Speech To Speech: an effort for an open-sourced and modular GPT4-o
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other lan...