# speech-to-text

ggml-org/whisper.cpp
C++ 43.29 k
10 days ago

mozilla

#Computer Science# DeepSpeech is an open-source embedded (offline, on-device) speech recognition engine that can run on hardware as small as a Raspberry Pi.

C++ 26.6 k
3 months ago

m-bain

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 17.78 k
3 months ago

kaldi-asr
Shell 15.12 k
2 months ago

jianchang512

Translate a video from one language to another and add dubbing, with support for speech recognition transcription, speech synthesis, and subtitle translation.

Python 14.24 k
5 days ago

alphacep
Jupyter Notebook 13.22 k
8 days ago

Uberi

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Python 8.86 k
4 days ago

KoljaB

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python 8.6 k
2 months ago

nl8590687

A Deep-Learning-Based Chinese Speech Recognition System

Python 8.23 k
12 days ago
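
k2-fsa

#Android# Sherpa-ONNX is a lightweight speech recognition framework based on Kaldi and onnxruntime that provides speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD) without an internet connection. It supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and WebSocket servers/clients, as well as C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust, and other programming languages.

C++ 7.45 k
3 hours ago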

TalAter
JavaScript 6.66 k
1 year ago

snakers4

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook 5.49 k
2 years ago

modelscope

#Large Language Models# Open-source, accurate, and easy-to-use video speech recognition and clipping tool with integrated LLM-based AI clipping.

Python 4.97 k
2 months ago

MahmoudAshraf97

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4.96 k
1 month ago

abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isola...

Python 4.81 k
2 months ago

sanchit-gandhi

#Computer Science# JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4.63 k
1 year ago