#

multimodal-datasets

salesforce / LAVIS

#Computer Science# LAVIS - A One-stop Library for Language-Vision Intelligence

deep learning, deep-learning-library, image-captioning, salesforce, vision-and-language, vision-framework, vision-language-pretraining, vision-language-transformer, visual-question-anwsering, multimodal-datasets, multimodal-deep-learning
Jupyter Notebook 10.63k
7 months ago
remyxai / VQASynth

Compose multimodal datasets 🎹

multimodal-datasets, multimodal-deep-learning, synthetic-dataset-generation
Python 403
6 days ago
drmuskangarg / Multimodal-datasets

This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the information...

multimodal-datasets
296
3 years ago
AnkurDeria / MFT

#Computer Science# PyTorch implementation of Multimodal Fusion Transformer for Remote Sensing Image Classification.

deep learning, multimodal-datasets, multimodal-deep-learning, remote-sensing, transformer-models
Jupyter Notebook 212
1 year ago
wisdomikezogwo / quilt1m

[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.

clip-model, histopathology, multimodal-datasets, vlm
Python 160
1 year ago
yuanxiaosc / Multimodal-short-video-dataset-and-baseline-classification-model

A dataset of 500,000 multimodal short videos with baseline classification models (TensorFlow 2.0).

multimodal-datasets, classification-model, Tensorflow
Jupyter Notebook 128
6 years ago
marslanm / Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....

cross-modal, multimodal-datasets, multimodal-deep-learning, multimodal-pre-trained-model, transformer-models, vision-language-pretraining
75
2 years ago
roboflow / rf100-vl

Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"

computer vision, multimodal-datasets, object-detection
Python 62
14 days ago
piresramon / gpt-4-enem

Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian university admission exam.

artificial intelligence, llm-inference, large language models, multimodal-datasets
Python 47
6 months ago
Yuco-Z / Awesome-Multi-Modal-Dialog

#Awesome# [Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics

Awesome Lists, dialogue, multimodal, multimodal-deep-learning, multimodal-datasets, multimodal-learning
39
5 months ago
JunweiLiang / FVTA_MemexQA

Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19

visual-question-answering, vision-and-language, multimodal-deep-learning, multimodal-datasets
Python 32
6 years ago
OlehOnyshchak / pyWikiMM

Collects a multimodal dataset of Wikipedia articles and their images

wikipedia, multimodal, multimodality, multimodal-datasets, multimodal-learning, database, data-cleaning, data-collection, data-processing
Python 16
2 years ago
ddw2AIGROUP2CQUPT / Large-Scale-Multimodal-Face-Datasets

Millions-Level Face/Human-Scene Image-Text Datasets

multimodal-datasets
15
12 days ago
deepmancer / vlm-toolbox

#Computer Science# Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation

clip, deep learning, deep-learning-library, multimodal-datasets, multimodal-deep-learning, multimodal-learning, prompt-tuning, vision-and-language, vision-framework, vision-language-transformer, zero-shot-classification, PyTorch, transformers
Jupyter Notebook 10
4 months ago
lujiaying / MUG-Bench

Data and code of the Findings of EMNLP'23 paper MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields

multimodal-datasets, multimodal-learning
Python 9
1 year ago
NUSTM / EMDRC

#Data Warehouse# Towards Explainable Multimodal Depression Recognition for Clinical Interviews

mental-health, dataset, datasets, affective-computing, multimodal-datasets
7
5 months ago
gcunhase / AnnotatedMV-PreProcessing

Pre-Processing of Annotated Music Video Corpora (COGNIMUSE and DEAP)

multimodal-datasets
Python 5
4 years ago
clp-research / language-models-multimodal-tasks

Official Git repository for "Hakimov, S., and Schlangen, D., (2023). Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks. Findings of the Associati...

language-model, multimodal-datasets, multimodal-learning
Python 3
2 years ago
OlehOnyshchak / WikiImageRecommendation

Image Recommendation for Wikipedia Articles

wikipedia, multimodal-learning, multimodal-deep-learning, multimodal-datasets, text, Image, recommender-systems, data-collection
Jupyter Notebook 3
4 years ago
GeorgeTouros / video-soundtrack-evaluation

#Computer Science# Create a large, well-managed, and clean dataset for the task of music composition for video soundtracks.

deep learning, multimodal-deep-learning, multimodal-datasets, Video, music-composition
Jupyter Notebook 3
2 years ago