Topic: lmm (large multimodal models)
BAAI-Agents / Cradle

#Large Language Model# The Cradle framework is a first attempt at General Computer Control (GCC). Cradle enables agents to ace any computer task through strong reasoning abilities, self-improvement, and skill curation, ...

Tags: ai-agent, ai-agents-framework, computer-control, cradle, gcc, generative-ai, grounding, large-language-models, lmm, multimodality, vision-language-model, vlm, artificial-intelligence
Python · 2.11k stars · 7 months ago
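To give a concrete sense of what a GCC-style agent does, here is a minimal, hypothetical observe-reason-act loop in Python. All names (capture_screen, propose_action, execute_action) are illustrative placeholders, not Cradle's actual API; the stubs return dummy values so the sketch runs as-is.

```python
import time

def capture_screen():
    # Placeholder: a real agent would return a screenshot of the desktop here.
    return "screenshot-placeholder"

def propose_action(task, observation, skill_library):
    # Placeholder: a multimodal LLM would reason over the task, the observation,
    # and previously curated skills, then return the next keyboard/mouse action
    # plus (optionally) a newly learned skill. This stub stops immediately.
    return "DONE", None

def execute_action(action):
    # Placeholder: replay the chosen action through an input controller.
    print(f"executing: {action}")

def agent_loop(task, max_steps=50):
    skill_library = []                        # reusable skills curated over time
    for _ in range(max_steps):
        observation = capture_screen()        # observe
        action, new_skill = propose_action(task, observation, skill_library)  # reason
        if new_skill is not None:
            skill_library.append(new_skill)   # self-improvement via skill curation
        if action == "DONE":
            break
        execute_action(action)                # act
        time.sleep(0.5)                       # pace interactions with the UI

agent_loop("open the settings menu")
```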
mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Tags: foundation-models, lmm, vision-and-language, vision-language-model, llm-agent
Python · 888 stars · 7 days ago
NVlabs / EAGLE

#Large Language Model# Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs

Tags: Demo, gpt4, huggingface, llama, llama3, llava, lmm, mllm, large-language-models
Python · 792 stars · 2 months ago
LLaVA-VL / LLaVA-Interactive-Demo

LLaVA-Interactive-Demo

Tags: lmm, multimodal
Python · 372 stars · 1 year ago
tianyi-lab / HallusionBench

#Large Language Model# [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Tags: benchmark, vlms, gpt-4, gpt-4v, llava, benchmarks, hallucination, lmm, large-language-models, large-vision-language-models
Python · 284 stars · 7 months ago
mbzuai-oryx / Video-LLaVA

#Large Language Model# PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Tags: large-language-models, lmm, Video, grounding, transcription
Python · 256 stars · 1 year ago
CircleRadon / TokenPacker

The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025

Tags: connector, lmm, mllm
Python · 248 stars · 20 days ago
Javis603 / Discord-AIBot

#Large Language Model# 🤖 Discord AI assistant with OpenAI, Gemini, Claude & DeepSeek integration, multilingual support, multimodal chat, image generation, web search, and deep thinking | A powerful Discord AI assistant that integrates multiple leading AI models and supports ...

Tags: artificial-intelligence, chatbot, ChatGPT, claude, deepseek, Discord, discord-bot, Discord.JS, gemini, large-language-models, lmm, Node.js, openai, xai
JavaScript · 230 stars · 3 months ago
TIGER-AI-Lab / Mantis

Official code for the paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]

Tags: language, vision, lmm, mllm, Video, vlm, multimodal
Python · 215 stars · 3 months ago
TideDra / VL-RLHF

#Large Language Model# An RLHF Infrastructure for Vision-Language Models

Tags: dpo, large-language-models, lmm, mllm, rlhf, vlm
Python · 176 stars · 7 months ago
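Since the repository is tagged with dpo and rlhf, the sketch below shows the standard Direct Preference Optimization loss in PyTorch as a point of reference. It is a generic illustration of the objective, not code taken from VL-RLHF; dpo_loss and its arguments are hypothetical names.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each argument is a 1-D tensor of summed log-probabilities of the chosen or
    rejected response under the trainable policy or the frozen reference model.
    """
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy usage: random log-probabilities for a batch of four preference pairs.
fake_logps = lambda: -10 * torch.rand(4)
print(dpo_loss(fake_logps(), fake_logps(), fake_logps(), fake_logps()).item())
```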
xieyuquanxx / awesome-Large-MultiModal-Hallucination

😎 A curated list of awesome LMM hallucination papers, methods & resources.

Tags: hallucination, multi-modal, lmm, multimodal
149 stars · 1 year ago
Q-Future / A-Bench

[ICLR 2025] What do we expect from LMMs as AIGI evaluators and how do they perform?

Tags: evaluation, lmm
144 stars · 4 months ago
Chenyu-Wang567 / MLLM-Tool

#Large Language Model# MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Tags: gpt4, large-language-models, lmm
Python · 124 stars · 1 year ago
graphic-design-ai / graphist

#Large Language Model# Official repo of Graphist

Tags: graphic-design, large-language-models, lmm, mllm
117 stars · 1 year ago
WisconsinAIVision / YoLLaVA

#Large Language Model# 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant

Tags: llava, large-language-models, lmm, lmms, personalization, neurips
Python · 100 stars · 3 months ago
uni-medical / GMAI-MMBench

#Large Language Model# GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.

Tags: benchmark, large-language-models, lmm, medical, vlm
68 stars · 6 months ago
mbzuai-oryx / VideoGLaMM

[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Tags: foundation-models, llm-agent, lmm, vision-and-language, vision-language-model
Python · 65 stars · 2 months ago
mapluisch / LLaVA-CLI-with-multiple-images

LLaVA inference with multiple images at once for cross-image analysis.

Tags: image-processing, inference, llama2, llava, Python, lmm, lmms, pillow, PyTorch, visual-question-answering, vqa
Python · 51 stars · 1 year ago
yisuanwang / Idea23D

[COLING 2025] Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs

Tags: 3D, aigc, agent, lmm
Jupyter Notebook · 49 stars · 5 months ago
360CVGroup / Inner-Adaptor-Architecture

An LMM that addresses catastrophic forgetting (AAAI 2025)

Tags: large-multimodal-models, lmm
Python · 43 stars · 2 months ago