GitHub Chinese Community

image-text-matching

NVlabs / GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

image-text-matching, transformers, zero-shot-learning, semantic-segmentation
Python 761
3 years ago
slavabarkov / tidy

#Natural Language Processing# Offline semantic Text-to-Image and Image-to-Image search on Android, powered by a quantized state-of-the-art vision-language pretrained CLIP model and the ONNX Runtime inference engine

Android, clip, computer vision, deep learning, image-retrieval, Kotlin, natural language processing, onnx, quantization, image-text-retrieval, cross-modal-retrieval, image-text-matching, image-search, semantic-search
Kotlin 442
1 year ago
Paranioar / Awesome_Matching_Pretraining_Transfering

#Awesome# The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insigh...

cross-modal-retrieval, tutorial, Awesome Lists, image-text-matching, image-text-retrieval, large-language-models, large-vision-language-models, multimodal-pretraining, parameter-efficient-fine-tuning, vision-and-language, multimodal-large-language-models, large language model, text-to-image-generation, text-to-image-synthesis, text-to-video-generation
423
6 months ago
Paranioar / SGRAF

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

cross-modal-retrieval, image-text-matching, image-retrieval, image-text-retrieval, text-matching, aaai
Python 215
1 year ago
woodfrog / vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)

image-text-matching, cross-modal-retrieval, vision-language, PyTorch
Python 161
2 years ago
kywen1119 / DSRAN

Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

PyTorch, image-text-matching, cross-modal, computer vision
Python 72
3 years ago
naver-ai / eccv-caption

#Computer Science# Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

cross-modal-retrieval, dataset, deep learning, eccv2022, evaluation, image-text-matching, machine learning, vision-and-language
Python 56
1 year ago
eric-ai-lab / ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

blip2, causality, clip, compositionality, image-text-matching, image-text-retrieval, vision-and-language
Python 35
10 months ago
weiyx16 / CLIP-pytorch

A non-JIT implementation / replication of OpenAI's CLIP in PyTorch

clip, PyTorch, image-text-matching
Python 34
4 years ago
jaisidhsingh / LoRA-CLIP

Easy wrapper for inserting LoRA layers in CLIP.

image-text-matching, lora, multimodal, multimodal-deep-learning, parameter-efficient-tuning, vision-language-pretraining
Python 33
1 year ago
Paranioar / RCAR

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

cross-modal-retrieval, image-text-matching, image-retrieval, image-text-retrieval, text-matching, tip
Python 33
1 year ago
jaisidhsingh / CoN-CLIP

#Computer Science# Implementation of the "Learn No to Say Yes Better" paper.

compositionality, deep learning, image-text-matching, multimodal, PyTorch, visual-language-models
Python 31
19 days ago
MartinYuanNJU / SEMScene

Code implementation of paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval".

image-text-matching, cross-modal-retrieval
Python 25
7 months ago
JinhaoLee / WCA

#Computer Science# [ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models

vision-language-model, deep learning, image-text-matching, large-language-models, visual-prompting, zero-shot-classification
Python 17
9 months ago
alipay / PC2-NoiseofWeb

Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.

benchmark, cross-modal-retrieval, dataset, image-text-matching, image-text-retrieval, multimodal-learning
Python 12
7 months ago
nhtlongcs / AIC2022-VER

Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding

image-text-matching, retrieval, PyTorch, pytorch-lightning
Python 12
2 years ago
zabir-nabil / bangla-image-search

#Search# A dead-simple image search / retrieval and image-text matching system for Bangla using CLIP

clip, deep learning, image-search, image-search-engine, search, search engine, image-retrieval, image-text-matching
Python 12
2 years ago
zabir-nabil / bangla-CLIP

CLIP (Contrastive Language–Image Pre-training) for Bangla.

clip, image-retrieval, image-text-matching
Python 10
1 year ago
Paranioar / DBL

[TIP2024] The code of “Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching”

cross-modal-retrieval, image-retrieval, image-text-matching, image-text-retrieval, text-matching, tip
Python 9
1 year ago
kaylode / tern

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN), with Metric Learning and FAISS for fast similarity search on GPU

cross-modal-retrieval, image-text-retrieval, transformer, image-text-matching
Jupyter Notebook 8
3 years ago