Efficient vision foundation models for high-resolution generation and perception.
ICML 2025: AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!