cross-modal · GitHub Topics

🪩 Create Disco Diffusion artworks in one line

creative-ai disco-diffusion cross-modal dalle generative-art multimodal diffusion prompts midjourney imgen clip-guided-diffusion latent-diffusion stable-diffusion

Python 3.84 k

2 years ago

docarray / docarray

#Computer Science# Represent, send, store and search multimodal data

docarray data-structures multimodal cross-modal neural-search deep-learning nested-data qdrant weaviate nearest-neighbor-search protobuf elasticsearch multi-modal semantic-search machine-learning PyTorch FastAPI pydantic

Python 3.1 k

3 months ago

shaoxiongji / knowledge-graphs

#Natural Language Processing# A collection of research on knowledge graphs

knowledge-graph representation-learning relation-extraction reasoning natural-language-processing ner commonsense cross-modal question-answering dialogue-systems information-retrieval survey Bukkit

JavaScript 1.76 k

3 years ago

krantiparida / awesome-audio-visual

#Awesome# A curated list of different papers and datasets in various areas of audio-visual processing

Awesome Lists cross-modal Localization (l10n) source-separation

745

2 years ago

kuanghuei / SCAN

#Computer Science# PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)

cross-modal image-captioning neural-network deep-learning PyTorch computer-vision

Python 568

2 years ago

towhee-io / examples

#Natural Language Processing# Analyze unstructured data with Towhee: reverse image search, reverse video search, audio classification, question and answer systems, molecular search, etc.

audio-classification cross-modal embeddings image-classification machine-learning natural-language-processing

Jupyter Notebook 515

2 years ago

yisun98 / SOLC

Remote Sensing SAR-Optical Land-use Classification PyTorch. PyTorch high-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification

PyTorch remote-sensing segmentation deeplabv3 cross-modal multi-modal

Python 241

1 year ago

JizhiziLi / RIM

[CVPR 2023] Referring Image Matting

cross-modal image-matting image-segmentation multimodal matting

208

2 years ago

DRSY / MoTIS

#Vector Search Engine# [NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

iOS artificial-intelligence image-search clip vector-search knn lsh semantic-search knowledge-distillation retrieval cross-modal naacl

Swift 125

2 years ago

QizhiPei / BioT5

#Natural Language Processing# BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)

Bioinformatics computational-biology cross-modal machine-learning natural-language-processing nlp-applications

Python 117

1 year ago

Zengyi-Qin / Weakly-Supervised-3D-Object-Detection

Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020

3d-object-detection kitti Point cloud cross-modal transfer-learning Tensorflow lidar monocular stereo unsupervised-learning

Jupyter Notebook 108

2 years ago

qcraftai / distill-bev

DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)

3d-object-detection knowledge-distillation lidar nuscenes Point cloud self-driving autonomous-driving distillation cross-modal multi-modal

Python 106

2 years ago

yangli18 / VLTVG

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022

vision-language cross-modal

Python 95

3 years ago

rohitrango / objects-that-sound

#Computer Science# Unofficial Implementation of Google Deepmind's paper `Objects that Sound`

machine-learning deep-learning audio-video embeddings deep-neural-networks deepmind cross-modal

Python 83

7 years ago

marslanm / Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....

cross-modal multimodal-datasets multimodal-deep-learning multimodal-pre-trained-model transformer-models vision-language-pretraining

3 months ago

kywen1119 / DSRAN

Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

PyTorch image-text-matching cross-modal computer-vision

Python 73

3 years ago

Paranioar / UniPT

[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"

cross-modal parameter-efficient-learning parameter-efficient-tuning transfer-learning parameter-efficient-fine-tuning

Python 67

1 year ago

QiangChunyu / SecoustiCodec

Ultra-low bitrate speech codec (0.27-1 kbps) with cross-modal alignment and real-time capabilities

contrastive-learning cross-modal semantic vae codec speaker speech speech-representation

Python 64

19 days ago

Eaphan / UPIDet

Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]

3d-object-detection cross-modal multi-modal

Python 62

1 year ago

GT-RIPL / Xmodal-Ctx

Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

clip cross-modal image-captioning vision-and-language

Python 60

3 years ago