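#Android#Vosk is an offline speech recognition toolkit. It supports Python, Java, Node.js, C#, and C++, and can recognize 20+ languages, including Chinese, English, and French.
#Computer Science#SincNet is a neural architecture for efficiently processing raw audio samples.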
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
#Computer Science#The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN a...
#Computer Science#Simple d-vector based speaker recognition (verification and identification) using PyTorch
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
#Computer Science#Identifying people from small audio fragments
#Computer Science#Deep learning: one-shot learning for speaker recognition using filter banks
#Computer Science#Official implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.
#Computer Science#[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Source code for the paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification"
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
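#Natural Language Processing#Aims to be the simplest collection of TTS front ends and the simplest audiobook production workflow. It splits novels into sentences with regex rules and uses RoBERTa to identify the speaker of each line of dialogue, enabling one-click generation of multi-speaker audiobooks. Multi-speaker speech synthesis and high-quality audiobook production.
PyTorch implementation of Generalized End-to-End Loss for speaker verification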
A tool for summarizing dialogues from videos or audio
Mirror of the VoxCeleb dataset, a large-scale speaker identification dataset