🪩 Create Disco Diffusion artworks in one line
#Computer Science#Represent, send, store and search multimodal data
#Natural Language Processing#A collection of research on knowledge graphs
#Awesome#A curated list of different papers and datasets in various areas of audio-visual processing
#Computer Science#PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
#Natural Language Processing#Analyze unstructured data with Towhee: reverse image search, reverse video search, audio classification, question answering systems, molecular search, etc.
Remote Sensing SAR-Optical Land-use Classification in PyTorch: high-resolution remote sensing semantic segmentation / ground-object segmentation / land-cover classification
[CVPR 2023] Referring Image Matting
#Vector Search Engine#[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
#Natural Language Processing#BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
#Computer Science#Unofficial implementation of Google DeepMind's paper `Objects that Sound`
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning