GitHub Chinese Community
#large-vision-language-model

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

instruction-tuning, instruction-following, large-vision-language-model, visual-instruction-tuning, multi-modality, in-context-learning, large-language-models, large-vision-language-models, multimodal-chain-of-thought, multimodal-in-context-learning, multimodal-large-language-models, chain-of-thought
15.53k stars
3 days ago
PKU-YuanGroup / Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

instruction-tuning, large-vision-language-model, multi-modal
Python · 3.27k stars
6 months ago
InternLM / InternLM-XComposer

#Large language models# InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

ChatGPT, visual-language-learning, multi-modality, foundation, gpt-4, instruction-tuning, mllm, multimodal, vision-language-model, language-model, large-language-model, large-vision-language-model, vision-transformer, gpt
Python · 2.84k stars
20 days ago
PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

large-vision-language-model, mixture-of-experts, moe, multi-modal
Python · 2.18k stars
6 months ago
yaotingwangofficial / Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

chain-of-thought, cot, deepseek-r1, instruction-tuning, large-vision-language-model, multimodal, multimodal-chain-of-thought, multimodal-large-language-models, openai-o1, reasoning, survey, mcts
642 stars
1 month ago
jqtangust / hawk

🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies

anomaly-detection, large-vision-language-model, video-understanding, anomaly, Video, video-anomaly-detection
Python · 204 stars
2 months ago
MMStar-Benchmark / MMStar

#Large language models# [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

evaluation, large-language-models, large-multimodal-models, large-vision-language-model, large-vision-language-models, large-language-model, multimodal, multimodal-learning, multimodality, visual-question-answering
Python · 181 stars
9 months ago
yu-rp / apiprompting

[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models

large-multimodal-models, large-vision-language-model, large-vision-language-models, prompting, vision-language-model, visual-prompting
Python · 90 stars
8 months ago
Orlando-CS / Awesome-VLA

✨✨ Latest advancements in VLA (Vision-Language-Action) models

large-language-models, large-vision-language-model, multi-modality
73 stars
2 months ago
richard-peng-xia / CARES

[NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

large-vision-language-model, vision-language-model
Python · 70 stars
6 months ago
Ruiyang-061X / VL-Uncertainty

🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".

large-vision-language-model, uncertainty-estimation, hallucination, multi-modal, uncertainty, uncertainty-quantification, vision-language, vision-language-model
Python · 36 stars
3 months ago
SuperBruceJia / Awesome-Large-Vision-Language-Model

#Natural language processing# Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models

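foundation-models, large-language-models, large-vision-language-model, large-vision-language-models, multimodal-large-language-models, vision-and-language, artificial-general-intelligence, artificial-intelligence, machine-vision, deep-learning, machine-learning, natural-language-processing
27 stars
9 months ago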
ADL-X / LLAVIDAL

This is the official repository of LLAVIDAL

action-recognition, large-vision-language-model, LLVM
Python · 14 stars
3 months ago
ai4ce / LUWA

[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archae...

ai4science, machine-vision, large-vision-language-model
Jupyter Notebook · 4 stars
3 months ago
lucaswychan / quant-lvlm

Easy-to-use large vision language model pipeline for quantitative analysis

large-vision-language-model, multimodal-learning, PyTorch, quantitative-finance
Python · 2 stars
2 months ago
lca0503 / MergeToVLRM

Source code of our paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging", ACL 2025

large-vision-language-model, model-merging
Python · 2 stars
2 months ago
amazon-science / THRONE

Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM generated text.

benchmark, cvpr2024, hallucination, hallucinations, large-language-model, large-language-models, large-vision-language-model, large-vision-language-models, vision-language-model
Python · 1 star
16 days ago
pzrain / DiViCo

Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model

large-vision-language-model, multimodal
Python · 0 stars
1 month ago