GitHub Chinese Community
#multimodal-generation

eric-ai-lab / MiniGPT-5

Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"

diffusion-models, multimodal-generation, transformers
Python 860
1 month ago
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

#Large language models# 🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).

aigc, large-language-models, large-vision-language-models, multimodal-generation, multimodal-large-language-models, multimodal-models, multimodality, text-to-3d, text-to-audio, text-to-image, text-to-speech, text-to-video, large language models, mllm
HTML 478
2 months ago
chuhaojin / Text2Poster-ICASSP-22

#Computer science# Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"

aigc, deep learning, multimodal-generation, image processing, image-retrieval, artificial-neural-networks, PyTorch, object-detection, image-text-retrieval
Python 211
1 year ago
YangLing0818 / ContextDiff

[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation

diffusion-models, text-to-image-generation, text-to-video, multimodal-generation
Python 67
1 year ago
wzk1015 / Awesome-Vision-to-Music-Generation

A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.

music-generation, survey, multimodal-generation
62
1 month ago
Gen-Verse / HermesFlow

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

multimodal-large-language-models, image-to-text, multimodal-generation, text-to-image
Python 60
4 months ago
Nithin-GK / UniteandConquer

[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models

diffusion-models, face-generation, imagenet, multimodal, multimodal-deep-learning, text-to-image, multimodal-generation, plug-and-play, text-to-image-diffusion, text-to-image-generation, text-to-image-synthesis, semantic-segmentation
Python 36
1 year ago
PanguIR / MRAGSurvey

A Survey of Multimodal Retrieval-Augmented Generation

large-language-models, large language models, multimodal-generation, multimodal-large-language-models
18
2 months ago