#Natural Language Processing#State-of-the-art natural language processing for JAX, PyTorch, and TensorFlow
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT)...
#Computer Science#🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
🤗 smolagents: a barebones library for agents that think in code.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
#Computer Science#Faster Whisper transcription with CTranslate2
Devika is now Opcode
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
#Computer Science#JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
A framework for Claude Opus to intelligently orchestrate subagents.
Open-Sora: a fully open-source, efficient reproduction of Sora-like video generation
Whisper is a general-purpose speech recognition model
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
#Large Language Models#SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
#Computer Science#Fast inference engine for Transformer models
The #1 open-source voice interface for desktop, mobile, and ESP32 chips.
#Computer Science#StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models