wavlm · GitHub Topics

#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Deep Learning PyTorch speaker-adaptation speech-synthesis text-to-speech tts wavlm diffusion-models latent-diffusion latent-diffusion-models Generative Adversarial Network

Python 5.96 k

1 year ago

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

speech-representation mockingjay representation-learning apc tera self-supervised-learning speech-pretraining vq-apc wav2vec hubert wavlm

Python 2.45 k

3 months ago

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

production-ready PyTorch resnet speaker-recognition speaker-verification speaker-diarization repvgg TLS (Transport Layer Security) dino wavlm

Python 1.03 k

4 days ago

lucadellalib / focalcodec

#Computer Science# A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

codec Deep Learning PyTorch speech-synthesis wavlm

Python 95

7 months ago

mjhydri / Singing-Vocal-Beat-Tracking

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trai...

beat-tracking hubert music music-information-retrieval self-supervised singing-voice wavlm

Python 33

3 years ago

lucadellalib / audiocodecs

A collection of audio codecs with a standardized API

codec dac PyTorch quantization self-supervised-learning speech-synthesis text-to-speech wavlm

Python 29

4 months ago

lucadellalib / discrete-wavlm-codec

A neural speech codec based on discrete WavLM representations

clustering codec hifi-gan PyTorch quantization self-supervised-learning speech-synthesis wavlm

Python 24

1 year ago

Sarasadeghii / Sharif-WavLM

This repository applies the WavLM model to high-quality and poor-quality data for the speaker verification task, and uses the PyCM library for evaluation.

confusion-matrix speaker-verification wavlm

Jupyter Notebook 9

2 years ago

alessandropec / data_driven_ai_voice_cloning

#Computer Science# This repository contains the code for the main part of my master's thesis at Politecnico di Torino in Data Science & Engineering

Artificial Intelligence generative-ai speaker-verification text-to-speech voice-cloning zero-shot-learning Deep Learning Machine Learning wavlm fastspeech2 tacotron2

Python 8

3 years ago

theolepage / wavlm_ssl_sv

SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.

asr dino PyTorch self-supervised-learning speaker-recognition speaker-verification wavlm

Python 8

7 months ago

bunyaminergen / WavLMMSDD

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia...

diarization embedding Microsoft speaker-diarization speech wavlm

Jupyter Notebook 8

3 months ago

sadPororo / UniPool-SV

Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint

hubert pretrained-models speaker-recognition speaker-verification wavlm

Python 7

1 year ago

SmoothKen / knn-svc

kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization

singing-voice-conversion voice-conversion wavlm

Python 6

3 months ago

sadPororo / LAP

Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification, 2025 Interspeech

hubert pretrained-models speaker-verification wavlm

Python 6

4 months ago

zhu00121 / Universal-representation-dynamics-of-deepfake-speech

This repo contains the code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"

deepfake-detection self-supervised wavlm

Python 4

2 years ago

bunyaminergen / WavLMRawNetXSVBase

WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture

audio feature-extraction speaker-verification speech speech-processing wavlm

Python 3

6 months ago

aitor-alvarez / acoustic-transformer-models

Acoustic Transformer Models for Audio Classification

acoustic classification hubert transformer-models pytorch-lightning wavlm

Python 1

7 months ago

shasan7 / SER_Transformer

#Computer Science# Speech Emotion Recognition (SER) using the RAVDESS dataset.

audio-processing classification Deep Learning hubert huggingface hyperparameter-tuning Python speech-processing transformer wavlm

Jupyter Notebook 0

2 months ago

rick-sanchez-ue / language-diffusion

#Large Language Models# 📚 Implement diffusion language models in under 80 lines of code using transformers, enabling quick finetuning and efficient training on various datasets.

Deep Learning diffusion gradio img2img latent-diffusion llama Large Language Models Python scalability stability-diffusion stable-diffusion-webui text text-to-speech text2image transformers txt2img wavlm

Python 0

18 days ago

lucadellalib / cryceleb2023

CryCeleb2023 experiments

metric-learning speaker-verification triplet-loss wavlm

Jupyter Notebook 0

2 years ago