
This repository has been indexed but not yet edited. For the project introduction and usage tutorial, see the README on GitHub.



About

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Created
Made in China

  Last modified

2025-01-08T10:09:47Z


Languages

  • Python 88.7%
  • Shell 11.2%
  • Other 0.01%

Other open-source projects from huggingface

Python 150.73 k
11 hours ago

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT)...

Python 35.42 k
1 day ago

#Computer Science# 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 31.01 k
13 hours ago

🤗 smolagents: a barebones library for agents that think in code.

Python 22.9 k
20 days ago

You may also be interested in

Open-source release of the Grok-1 large model

Python 50.53 k
1 year ago

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 8.4 k
7 months ago
Jupyter Notebook 8.6 k
1 year ago
Python 18.44 k
2 months ago

Devika is now Opcode

Python 19.49 k
13 days ago

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 18.05 k
5 days ago

#Computer Science# JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4.64 k
2 years ago
Python 64.01 k
4 hours ago

A framework for Claude Opus to intelligently orchestrate subagents.

Python 4.28 k
1 year ago

#Large Language Models# Automate browser-based workflows with LLMs and Computer Vision

Python 14.53 k
2 hours ago

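Open-Sora: a fully open-source, efficient reproduction of a Sora-like video generation approach

Python 27.33 k
5 months ago

whisper is a general-purpose speech recognition model

Python 89.09 k
1 month ago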

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

Python 6.84 k
9 months ago
ggml-org/whisper.cpp

C++ port of OpenAI's Whisper speech recognition model.

C++ 43.68 k
2 days ago

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 5 k
1 year ago
SWE-agent/SWE-agent

#Large Language Models# SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 17.52 k
1 day ago
C++ 4.05 k
6 months ago
openinterpreter/01

The #1 open-source voice interface for desktop, mobile, and ESP32 chips.

Python 5.09 k
1 year ago

#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5.99 k
1 year ago

#Computer Science# 🐸💬 - a deep learning library for text-to-speech (TTS) synthesis

Python 42.88 k
1 year ago