GitHub Chinese Community

#multi-modal-learning

mlfoundations / open_clip

#Computer Science# An open source implementation of CLIP.

deep learning, PyTorch, computer vision, language-model, multi-modal-learning, contrastive-loss, zero-shot-classification, pretrained-models
Python 11.93k
6 days ago

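OFA-Sys / Chinese-CLIP

#Natural Language Processing# This project is the Chinese version of the CLIP model, trained on large-scale Chinese data (~200 million image-text pairs). It aims to help users quickly implement image-text feature and similarity computation, cross-modal retrieval, zero-shot image classification, and other tasks in the Chinese domain.

Chinese, computer vision, multi-modal-learning, natural language processing, PyTorch, vision-and-language-pre-training, image-text-retrieval, clip, pretrained-models, vision-language, deep learning, multi-modal, contrastive-loss, transformers, coreml-models
Python 5.28k
10 months ago
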
lyuchenyang / Macaw-LLM

#Natural Language Processing# Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

language-model, multi-modal-learning, natural language processing, deep learning, machine learning, neural-networks
Python 1.57k
5 months ago

NVlabs / prismer

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

image-captioning, language-model, multi-modal-learning, multi-task-learning, vision-language-model, vision-and-language, vqa
Python 1.31k
1 year ago

lucidrains / x-clip

#Computer Science# A concise but complete implementation of CLIP with various experimental improvements from recent papers

artificial intelligence, deep learning, contrastive-learning, zero-shot-learning, multi-modal-learning
Python 713
2 years ago

jokieleung / awesome-visual-question-answering

#Awesome# A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

Awesome Lists, vqa, multi-modal, multi-modal-learning
662
2 years ago

OpenRobotLab / EmbodiedScan

[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

3d-vision, computer vision, multi-modal-learning, Robotics
Python 600
4 months ago

kyegomez / zeta

#Computer Science# Build high-performance AI models with modular building blocks

artificial intelligence, multi-modal, transformers, deep learning, gpt4, llama2, multi-agent-systems, multi-modal-learning, multi-platform, PyTorch, speech-recognition, transformer
Python 524
6 days ago

DmitryRyumin / CVPR-2023-24-Papers

#Face Recognition# CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...

action-recognition, autonomous-driving, biometrics, computer vision, cvpr, cvpr2023, dataset, deep learning, face-recognition, gesture-recognition, image-synthesis, medical-image-processing, multi-modal-learning, pattern-recognition, segmentation, self-supervised-learning, video-synthesis, cvpr2024
Python 451
1 year ago

zjukg / KG-MM-Survey

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

cross-modal-retrieval, Entity resolution, image-classification, image-generation, information-extraction, knowledge-graph, knowledge-graph-embeddings, large-language-models, multi-modal-learning, paper-list, survey, surveys, visual-question-answering, awsome
425
6 months ago

zhengli97 / PromptKD

[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"

cvpr2024, multi-modal-learning, prompt-learning, vision-language-model, knowledge-distillation, clip
Python 315
6 days ago

Ysz2022 / NeRCo

[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement

neural-representation, multi-modal-learning, iccv, iccv2023
Python 248
1 year ago

moabarar / nemar

#Computer Science# [CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

multimodal, image-to-image-translation, multi-modal, multi-modal-learning, affine-transformation, deep learning, cnn, PyTorch, image-registration, cvpr2020
Python 184
5 years ago

huggingface / chug

#Data Warehouse# Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

computer vision, dataset, distributed-training, document-understanding, multi-modal-learning, pdf-document
Python 157
1 year ago

GuanRunwei / Achelous

The official repository of Achelous and Achelous++

multi-modal-learning, multi-task-learning, object-detection, object-tracking, point-cloud-segmentation, semantic-segmentation
Python 155
1 year ago

qizekun / ReCon

[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

Point cloud, multi-modal-learning, representation-learning, self-supervised-learning
Python 145
1 year ago

wjun0830 / CGDETR

Official PyTorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"

computer vision, detr, multi-modal-learning, PyTorch, video-understanding
Python 132
10 months ago

kkakkkka / ETRIS

#Computer Science# [ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

deep learning, deep neural networks, machine learning, multi-modal-learning, segmentation
Python 128
5 months ago

shikras / d-cube

A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).

multi-modal-learning, object-detection, referring-expression-comprehension, vision-language, dataset, open-vocabulary-detection
Python 125
1 year ago

924973292 / EDITOR

[CVPR2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification

cvpr2024, multi-modal-learning, person-reid, reid, multi-modal
Python 104
8 months ago