speech-representation · GitHub Topics

Self-Supervised Speech Pre-training and Representation Learning Toolkit

speech-representation mockingjay representation-learning apc tera self-supervised-learning speech-pretraining vq-apc wav2vec hubert wavlm

Python 2.44 k

2 个月前

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

acoustic codec gpt4o semantic speech-representation text-to-speech dac

Python 1.17 k

5 个月前

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

pytorch-implementation speech-representation

Python 911

7 个月前

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

gpt-4o speech speech-representation streaming

306

8 个月前

Ereboas / MagiCodec

#大语言模型#A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.

codec 大语言模型 PyTorch speech-representation text-to-speech tts

Python 84

2 个月前

mechanicalsea / lighthubert

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

neural-architecture-search PyTorch self-supervised-learning speech-representation

Python 74

3 年前

ryota-komatsu / slp2025

音学シンポジウム2025チュートリアル「マルチモーダル大規模言語モデル入門」資料

multimodal-large-language-models speech speech-processing speech-representation

Jupyter Notebook 58

1 个月前

andi611 / Mockingjay-Speech-Representation

Official Implementation of Mockingjay in Pytorch

speech speech-representation mockingjay representation-learning feature-extraction sentiment-classification speaker-recognition PyTorch pytorch-implementation apc

Python 55

2 年前

vectominist / MiniASR

A mini, simple, and fast end-to-end automatic speech recognition toolkit.

asr ctc speech-recognition speech-representation hubert minimal PyTorch fairseq

Jupyter Notebook 54

3 年前

gyt1145028706 / XY-Tokenizer

This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs. Demos, technical insights and experimental results are presented on

autoencoder automatic-speech-recognition speech-representation

Python 38

20 天前