GitHub Chinese Community
#language-vision

unum-cloud / uform

#vector search engine# Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

huggingface-transformers, language-vision, multimodal, PyTorch, semantic-search, transformer, cross-attention, vector-search, bert, neural networks, pretrained-models, multi-lingual, clip, openai, contrastive-learning, representation-learning, clustering, image-search, llava
Python 1.16k
1 month ago
JacobYuan7 / RLIPv2

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training

detection, language-vision
Python 130
1 year ago
Fsoft-AIC / Language-Conditioned-Affordance-Pose-Detection-in-3D-Point-Clouds

[ICRA 2024] Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

diffusion-models, language-vision, pose-estimation, Robotics
Python 37
7 months ago
jwu114 / CAP

[NAACL Findings 2025] Code and data of "Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting"

prompting, vqa, language-vision, multimodal
Python 3
3 months ago
youcefgheffari3 / VisualGroundingAutonomy

#computer science# Visual Grounding for Autonomous Agents: linking language and vision for robotics or autonomous navigation

autonomy, deep learning, language-vision, Robotics
Python 2
6 days ago
CharlesYang030 / MTA

MTA: A Lightweight Multilingual Text Alignment Model for Cross-language Visual Word Sense Disambiguation

language-vision, multilingual, multimodal
Jupyter Notebook 1
2 years ago
ElDokmak / MultiModal-Models

Hands-on with some multimodal models

language-vision, llava, multimodality, gpt-4-vision, tts
Jupyter Notebook 0
2 years ago