SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
LTX-Video Support for ComfyUI
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
#Data Warehouse# [AAAI 2025] 👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high f...
Rich-Text-to-Image Generation
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
Awesome Unified Multimodal Models
Liquid: Language Models are Scalable and Unified Multi-modal Generators
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
#Awesome# The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insigh...
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
#Computer Science# Colab notebook for Stable Diffusion Hyper-SDXL.
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
Faster generation with text-to-image diffusion models.
Official repository for "CFG++: manifold-constrained classifier free guidance for diffusion models" (ICLR 2025)
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models" (SLD)
The most advanced Nano Banana image generator and editor application. Your central hub for AI image generation and revisions. Intuitive UI features reference images, editing with image masks, version ...