GitHub Chinese Community

Topic: text-to-audio

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...

audio-generation, audio-synthesis, audioldm, music-generation, naturalspeech2, singing-voice-conversion, speech-synthesis, text-to-audio, text-to-speech, vall-e, voice-conversion, audit, fastspeech2, vits, emilia, maskgct, vocoder
Python 9.15 k
20 days ago
hkchengrex / MMAudio

#Computer Science# [CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

audio, audio-synthesis, computer vision, deep learning, text-to-audio
Python 1.58 k
1 month ago
gitmylo / audio-webui

A webui for different audio related Neural Networks

artificial intelligence, audioldm, bark, rvc, text-to-audio, text-to-speech, voice cloning, audiocraft, music, generative-music, tts, aio, all-in-one
Python 1.17 k
1 month ago
declare-lab / tango

A family of diffusion models for text-to-audio generation.

audio-generation, diffusion, diffusion-models, language-models, large-language-models, text-to-audio
Python 1.17 k
6 months ago
ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

speech, speech-recognition, speech-synthesis, speech-to-text, speech-translation, translation, all-in-one, machine-translation, streaming-audio, text-to-speech, asr, tts, voice, text-to-audio, non-autoregressive, speech-enhancement, audio-processing, speech-processing
Python 1.09 k
10 months ago
declare-lab / TangoFlux

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

generative-ai, text-to-audio
Jupyter Notebook 737
7 days ago
Text-to-Audio / Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

diffusion-models, latent-diffusion, latent-space, text-to-audio
Python 648
1 year ago
ivcylc / OpenMusic

OpenMusic: SOTA Text-to-music (TTM) Generation

artificial intelligence, diffusion-models, music-generation, text-to-audio, ai-music, audioldm, diffusion-transformer, dit, hifi-gan, vall-e
Python 568
2 months ago
lucidrains / nuwa-pytorch

#Computer Science# Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch

artificial intelligence, deep learning, transformers, attention-mechanism, text-to-video, text-to-audio
Python 549
2 years ago
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

#Large Language Models# 🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).

aigc, large-language-models, large-vision-language-models, multimodal-generation, multimodal-large-language-models, multimodal-models, multimodality, text-to-3d, text-to-audio, text-to-image, text-to-speech, text-to-video, large language models, mllm
HTML 478
2 months ago
AMAAI-Lab / mustango

Mustango: Toward Controllable Text-to-Music Generation

diffusion-models, large-language-models, text-to-audio
Python 366
13 days ago
denizsafak / abogen

Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

audiobook, audiobooks, content-creation, kokoro, speech-synthesis, subtitles, text-to-audio, text-to-speech, tts, voice-synthesis, kokoro-tts
Python 316
8 days ago
haidog-yaqub / EzAudio

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

diffusion-models, generative-ai, text-to-audio
Python 279
2 months ago
happylittlecat2333 / Auffusion

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

audio-generation, diffusion, diffusion-models, large-language-models, text-to-audio
Jupyter Notebook 184
1 year ago
ilaria-manco / word2wave

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

text-to-audio, audio-generation, music-generation, ai-music
Python 119
4 years ago
bnsantoso / sub-to-audio

Subtitle to audio: generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing with the subtitle timestamps.

text-to-audio, text-to-speech, Python, tts, audio-processing
Python 115
2 years ago
sony / soundctm

PyTorch implementation of SoundCTM

audio-generation, diffusion-models, PyTorch, text-to-audio
Python 96
3 months ago
keonlee9420 / WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

text-to-speech, neural-tts, audio, synthesis, non-autoregressive, score-matching, duration, robust, PyTorch, tts, speech-synthesis, text-to-audio, end-to-end
Python 69
4 years ago
RhythrosaLabs / soundstorm

#Large Language Models# Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusias...

algorithmic-composition, audio-processing, chat-gpt, chatbot, ChatGPT, gpt, gpt-4, MIDI, sound, sound-processing, text-to-audio
Python 31
1 year ago
PapayaResearch / ctag

#Computer Science# Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24

machine learning, synthesizer, text-to-audio, generative-ai, jax
Python 28
9 months ago