A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
faster_whisper GUI with PySide6
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, Lich...
Voice Activity Detector (VAD) from TEN: low-latency, high-performance and lightweight
Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM- and ACAM-based VAD. We also provide our directly recorded dataset.
An audio/acoustic activity detection and audio segmentation tool
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Fully Native Swift and CoreML. Efficient Speaker Diarization, VAD, and Speech-to-Text for real-time workloads
Runtime Audio Importer plugin for Unreal Engine. Imports audio in various formats at runtime.
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, and YAMNet VAD DNN models.
Voice Activity Detection based on Deep Learning & TensorFlow
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
On-device voice activity detection (VAD) powered by deep learning
A statistical model-based Voice Activity Detector
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021