GitHub Chinese Community
#asr

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr, speech, speech-recognition, speech-to-text, Whisper
Python 16.26 k
7 days ago
NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation, speaker-recognition, asr, tts, generative-ai, multimodal, deep-learning, neural-networks, speaker-diariazation, speech-translation, speech-synthesis, large-language-models
Python 14.8 k
12 hours ago
alphacep / vosk-api

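#Android# Vosk is an offline speech recognition toolkit. It supports Python, Java, Node.JS, C#, and C++, and recognizes 20+ languages, including Chinese, English, and French.

speech-recognition, asr, voice-recognition, speech-to-text, Android, iOS, raspberry-pi, deep-learning, deep-neural-networks, speech-to-text-android, speaker-identification, speaker-verification, Python, offline, privacy, kaldi, deepspeech, google-speech-to-text, vosk, stt
Jupyter Notebook 12.08 k
1 month ago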
PaddlePaddle / PaddleSpeech

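PaddleSpeech is an open-source speech model library built on PaddlePaddle for developing key speech and audio tasks. Typical applications include speech recognition, speech translation, and speech synthesis.

transformer, conformer, speech-translation, streaming-asr, speech-alignment, punctuation-restoration, streaming-tts, speech-synthesis, tts, asr, speech-recognition, voice-cloning, vocoder, voice-recognition, self-supervised-learning, Whisper
Python 11.99 k
5 days ago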
speechbrain / speechbrain

#Computer Science# A PyTorch-based Speech Toolkit

speech-recognition, speech-toolkit, speaker-recognition, speech-to-text, speech-enhancement, speech-separation, audio, audio-processing, speech-processing, speechrecognition, asr, voice-recognition, speaker-diarization, speaker-verification, PyTorch, huggingface, transformers, language-model, deep-learning
Python 9.98 k
5 days ago
wzpan / wukong-robot

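#Large Language Models# 🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot / smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.

artificial-intelligence, speaker, asr, tts, unit, Home Assistant, raspeberry-pi, amazon-echo, alexa, snowboy, google-home, anyq, muse, bci, ChatGPT, gpt3, openai
Python 6.86 k
8 months ago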
k2-fsa / sherpa-onnx

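#Android# Sherpa-ONNX is a lightweight speech recognition framework based on Kaldi and onnxruntime. It performs speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD) without an internet connection. It supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and WebSocket servers/clients, as well as C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust, and other programming languages.

asr, onnx, Windows, Linux, macOS, C++, Android, iOS, raspberry-pi, aarch64, arm32, C#, .NET, mfc, speech-to-text, text-to-speech, vits, RISC-V, lazarus, object-pascal
C++ 6.36 k
7 days ago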
FunAudioLLM / SenseVoice

#Large Language Models# Multilingual Voice Understanding Model

artificial-intelligence, asr, gpt-4o, speech-recognition, speech-to-text, aigc, cross-lingual, large-language-models, Python, PyTorch, multilingual
Python 5.9 k
3 months ago
snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

speech-recognition, speech-to-text, stt, asr, pretrained-models, english, german, spanish, stt-benchmark, PyTorch, colab, onnx, text-to-speech, speech, speech-synthesis, tts
Jupyter Notebook 5.34 k
2 years ago
xiangyuecn / Recorder

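HTML5/JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats. Supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat. Provides ASR speech-to-text, an H5 voice call/chat example, and DTMF encoding and decoding.

recorder, record, JavaScript, HTML, h5, luyin, mp3, wav, amr, ogg, webm, WebRTC, audio, recording, asr
JavaScript 5.28 k
3 months ago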
jdepoix / youtube-transcript-api

This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless b...

youtube-api, subtitles, YouTube, transcripts, youtube-subtitles, youtube-transcripts, Python, transcript, subtitle, cli, captions, youtube-captions, youtube-transcript, translating-transcripts, asr, youtube-asr
Python 4.94 k
4 days ago
MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr, speaker-diarization, speech, speech-recognition, speech-to-text, Whisper
Jupyter Notebook 4.63 k
2 months ago
NexaAI / nexa-sdk

#Large Language Models# Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR...

asr, edge-computing, large-language-models, on-device-ai, on-device-ml, SDK, stable-diffusion, transformers, tts, vlm, language-model, sdk-python, Whisper, audio
Python 4.57 k
3 months ago
wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

e2e-models, PyTorch, asr, transformer, conformer, production-ready, automatic-speech-recognition, speech-recognition, Whisper
Python 4.56 k
5 days ago
PeterH0323 / Streamer-Sales

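#Large Language Models# Streamer-Sales (Sales Champion): a live-commerce streamer LLM 🛒🎁 that presents products based on their given features, from the angle of stimulating the user's desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a Vue-based front end 🍍, F...

chat-application, internlm2, large-language-models, chatbot, text-generation, chat, ChatGPT, gpt, rag, tts, asr, digital-human
Python 3.29 k
3 months ago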
tensorflow / lingvo

#Natural Language Processing# Lingvo

speech-recognition, translation, speech-to-text, machine-translation, mnist, seq2seq, language-model, tts, asr, lm, natural-language-processing, Tensorflow, speech, research, distributed, gpu-computing, speech-synthesis
Python 2.84 k
6 days ago
ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API

automatic-speech-recognition, speech-recognition, speech-to-text, openai-whisper, Docker, asr, speech
Python 2.66 k
4 months ago
CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

faster-whisper, openai, transcribe, vad, Whisper, whisperx, asr
Python 2.47 k
6 months ago
coqui-ai / STT

#Computer Science# 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

stt, speech-to-text, Tensorflow, deep-learning, automatic-speech-recognition, asr, voice-recognition, speech-recognition
C++ 2.45 k
1 year ago
linto-ai / whisper-timestamped

#Computer Science# Multilingual Automatic Speech Recognition with word-level timestamps and confidence

deep-learning, speech, speech-recognition, speech-to-text, asr, machine-learning, Python, PyTorch, attention-is-all-you-need, attention-mechanism, attention-model, speaker-diarization, speech-processing, transformers, Whisper
Python 2.45 k
3 months ago