# asr

https://static.github-zh.com/github_avatars/m-bain?size=40

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 17.78 k
3 months ago
https://static.github-zh.com/github_avatars/NVIDIA-NeMo?size=40

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 15.72 k
6 hours ago
https://static.github-zh.com/github_avatars/alphacep?size=40
Jupyter Notebook 13.22 k
8 days ago
https://static.github-zh.com/github_avatars/PaddlePaddle?size=40

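PaddleSpeech is an open-source speech model library built on PaddlePaddle, for developing a range of key speech and audio tasks. Typical applications include speech recognition, speech translation, and speech synthesis.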

Python 12.23 k
5 days ago
https://static.github-zh.com/github_avatars/k2-fsa?size=40

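#Android# Sherpa-ONNX is a lightweight speech recognition framework based on Kaldi and onnxruntime that provides speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD) without an internet connection. It supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and WebSocket servers and clients, as well as C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust, and other programming languages.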

C++ 7.45 k
3 hours ago
wzpan/wukong-robot
https://static.github-zh.com/github_avatars/wzpan?size=40

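#Large language models# 🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot and smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.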

Python 6.97 k
1 year ago
https://static.github-zh.com/github_avatars/jdepoix?size=40

This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless b...

Python 6.18 k
8 days ago
https://static.github-zh.com/github_avatars/snakers4?size=40

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook 5.49 k
2 years ago
https://static.github-zh.com/github_avatars/xiangyuecn?size=40

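HTML5 JavaScript audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats. Supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat. Provides ASR speech-to-text, an H5 voice call and chat example, and DTMF encoding and decoding.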

JavaScript 5.42 k
6 months ago
https://static.github-zh.com/github_avatars/MahmoudAshraf97?size=40

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4.96 k
1 month ago
https://static.github-zh.com/github_avatars/wenet-e2e?size=40
Python 4.8 k
8 days ago
https://static.github-zh.com/github_avatars/PeterH0323?size=40

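#Large language models# Streamer-Sales (销冠, "Sales Champion"): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark the viewer's desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS text-to-speech 🔊, digital human generation 🦸, an Agent that looks up real-time information online 🌐, ASR speech-to-text 🎙️, a Vue-based front end 🍍, F...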

Python 3.48 k
6 months ago
https://static.github-zh.com/github_avatars/CheshireCC?size=40
Python 2.66 k
9 months ago
https://static.github-zh.com/github_avatars/coqui-ai?size=40

#Computer science# 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

C++ 2.51 k
2 years ago
https://static.github-zh.com/github_avatars/Purfview?size=40
2.48 k
5 months ago