Official repository for LTX-Video
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
LTX-Video Support for ComfyUI
Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enough ...
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025)
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high per...
⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
OpenMusic: SOTA Text-to-music (TTM) Generation
📚 A curated list of Awesome DiT Inference Papers with Code.
MoH: Multi-Head Attention as Mixture-of-Head Attention
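NMCN (Niche Multi Channel Network) is a non-profit, decentralized self-media alliance format first proposed by the founding team of 「同和新媒體矩陣」 against the backdrop of the globalization of public-opinion capital. Through exchange, mutual promotion, and resource sharing among the alliance's creative units, it resists erosion by capital, safeguards the living space of subcultures while producing outstanding work, and makes a modest contribution to protecting precious intangible cultural heritage.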
CogVideoX-5B 4-bit quantized model
Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers.
Official implementation of "JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization"
An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community to help implement this model!
Project repository for "Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation"
UK - Great.gov - Export Opportunities - Find and apply for overseas opportunities from businesses looking for products or services like yours.