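#Large Language Models#Conversational AI for everyone. We believe we are about to start a revolution: just as Stable Diffusion changed how modern art is created, we will change the world through conversational AI.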
CLIP-like model evaluation
Audio Dataset for training CLAP and other models
Open-Sora: a fully open-source, efficient reproduction of a Sora-like video generation pipeline
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
#Computer Science#Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Learning audio concepts from natural language supervision
#Operating Systems#SerenityOS is a graphical Unix-like operating system for the x86 architecture, with a UI modeled on 1990s design.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable mu...
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
Generative Models by Stability AI
PyTorch code and models for V-JEPA self-supervised learning from video.