A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

conformer PyTorch speech-recognition paraformer punctuation

Python · 12.91 k

6 days ago

DiffSynth-Studio

@modelscope

Enjoy the magic of Diffusion models!

Python · 10.25 k

7 days ago

ms-swift

@modelscope

#Large Language Models# Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4,...

large-language-models lora llama sft multimodal

Python · 10.2 k

1 day ago

You may also be interested in

GPT-SoVITS

@RVC-Boss

A powerful few-shot voice conversion and text-to-speech web UI.

text-to-speech tts vits voice-clone voice-cloneai

Python · 51.34 k

1 month ago

Bert-VITS2

@fishaudio

#Large Language Models# vits2 backbone with multilingual-bert

bert bert-vits2 tts vits vits2

Python · 8.59 k

3 days ago

Open-Sora

@hpcaitech

Open-Sora: a fully open-source, efficient reproduction of Sora-like video generation

Python · 27.31 k

5 months ago

Open-Sora-Plan

@PKU-YuanGroup

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python · 12.03 k

8 days ago

ollama

@ollama

#Large Language Models# Set up and run Llama 2 and other large models locally

llama large-language-models llama2 Go

Go · 153.62 k

1 day ago

grok-1

@xai-org

Open-source release of the Grok-1 large model

Python · 50.52 k

1 year ago

LitServe

⚡️ Lightning AI @Lightning-AI

#Computer Science# The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.

artificial-intelligence API serving deep-learning

Python · 3.58 k

4 days ago

ZMM-TTS

@nii-yamagishilab

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

C · 175

2 years ago

FunASR

@modelscope

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

conformer PyTorch speech-recognition paraformer punctuation

Python · 12.91 k

6 days ago