Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
2024-03-26
No
2024-05-04T14:36:51Z
#Large Language Models# Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
#Large Language Models# Project Page for "LISA: Reasoning Segmentation via Large Language Model"
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking (CVPR 2023)
Underexposed Photo Enhancement Using Deep Illumination Estimation
Open-Sora: A fully open-source solution for efficiently reproducing Sora-like video generation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Devika is now Opcode
Zero-Shot Speech Editing and Text-to-Speech in the Wild
DeepSeek-VL: Towards Real-World Vision-Language Understanding
AIOS: AI Agent Operating System
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
DUSt3R: Geometric 3D Vision Made Easy
#Large Language Models# Generate high-definition short videos with one click using large AI models.
TripoSR: Fast 3D Object Reconstruction from a Single Image
OpenUI lets you describe UI using your imagination, then see it rendered live.
[WIP] Layer Diffusion for WebUI (via Forge)
Mora: More like Sora for Generalist Video Generation
#Large Language Models# Large Action Model framework to develop AI Web Agents
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
#Large Language Models# Code examples and resources for DBRX, a large language model developed by Databricks