GitHub Chinese Community

#cogvlm

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

cogvlm, pretrained-models, language-model, multi-modal
Python 2.36 k
3 months ago
jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning, pyside6, stable-diffusion, llava, cogvlm, florence-2
Python 1.02 k
1 month ago
gokayfem / awesome-vlm-architectures

Famous Vision Language Models and Their Architectures

clip, llava, vlm, multimodal, blip, cogvlm, internlm, kosmos, vision-language-model, Awesome Lists
Markdown 859
4 months ago
ProGamerGov / VLM-Captioning-Tools

Python scripts to use for captioning images with VLMs

cogvlm, image-captioning, vlm, mistral, vision-language, large language models, llama3
Python 41
2 months ago
nopperl / clip-synthetic-captions

Tiny-scale experiment showing that CLIP models trained using detailed captions generated by multimodal models (CogVLM and LLaVA 1.5) outperform models trained using the original alt-texts on a range o...

clip, cogvlm, llava, multimodal, synthetic-data, vision-language-model
Python 3
1 year ago
williamcfrancis / vlm-comparison-gemini-cog

A comparative study of two of the best-performing open source Vision Language Models: Google Gemini Vision and CogVLM

artificial intelligence, cogvlm, gemini, gemini-pro, google-gemini, vision, vision-and-language, vision-language-model, vlm
Python 0
1 year ago