Synchronized Translation for Videos. Video dubbing
#Large language models#Turnkey self-hosted offline transcription and diarization service with LLM summary
UniSpeech - Large Scale Self-Supervised Learning for Speech
Open source inference code for Rev's model
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
#Large language models#Callytics is an advanced call analytics solution that uses speech recognition and large language models (LLMs) to analyze phone conversations from customer service and call centers.
#Computer science#A lightweight library to compute Diarization Error Rate (DER).
Tool for automatic transcription and speaker diarization based on Whisper and pyannote.
#Computer science#On-device speaker diarization powered by deep learning
Neural-network-based similarity scoring for diarization (PyTorch implementation of "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization")
EchoInStone is an audio processing tool that transcribes, diarizes, and aligns speaker segments from audio files, prioritizing accuracy and reliability.
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
Convert Kaldi feature extraction and nnet3 models into TensorFlow Lite models. Currently aimed at converting Kaldi's x-vector models and diarization pipelines to TensorFlow models.
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
PAFTS: a library that preprocesses audio for TTS.
Free open-source transcriber and summarizer for file-per-speaker recordings, such as Discord calls recorded by the Craig bot
A Whisper-to-TextGrid script that I use to automate corpus annotation in Praat, with speaker diarization.
A Python tool to separate audio files by speaker using diarization data.