GitHub Chinese Community

#vqa

facebookresearch / mmf

#Computer Science# A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

PyTorch, vqa, pretrained-models, multimodal, deep learning, captioning, dialog, textvqa, hateful-memes, multi-tasking
Python 5.57 k
2 months ago
OpenGVLab / InternGPT

#Large Language Models# InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, ...

ChatGPT, foundation-model, gpt, gpt-4, gradio, husky, image-captioning, langchain, large language model, multimodal, vqa, llama, vicuna, video-generation, sam, segment-anything, click, draggan
Python 3.21 k
10 months ago
roboflow / maestro

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

captioning, fine-tuning, florence-2, multimodal, objectdetection, paligemma, phi-3-vision, transformers, vision-and-language, vqa, qwen2-vl
Python 2.57 k
6 days ago
open-compass / VLMEvalKit

#Large Language Models# Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

gpt-4v, large-language-models, llava, multi-modal, openai, vqa, large language model, openai-api, qwen, gpt, machine vision, PyTorch, gpt4, ChatGPT, clip, vit, evaluation, claude, gemini
Python 2.52 k
3 days ago
BDBC-KG-NLP / QA-Survey-CN

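#Natural Language Processing# The natural language processing research team at Beihang University's Advanced Innovation Center for Big Data has compiled a summary of research and applications in intelligent question answering. It covers knowledge-graph-based question answering (KBQA), text-based question answering (TextQA), table-based question answering (TableQA), vision-based question answering (VisualQA), machine reading comprehension (MRC), and more. For each task type, work from both academia and industry is summarized.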

survey, natural language processing, question-answering, kbqa, vqa, qa
1.79 k
2 years ago
peteanderson80 / bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

vqa, visual-question-answering, faster-rcnn, caffe, image-captioning, mscoco
Jupyter Notebook 1.45 k
2 years ago
NVlabs / prismer

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

image-captioning, language-model, multi-modal-learning, multi-task-learning, vision-language-model, vision-and-language, vqa
Python 1.31 k
1 year ago
microsoft / Oscar

Oscar and VinVL

vision-and-language, pre-training, image-captioning, vqa, oscar
Python 1.05 k
2 years ago
hila-chefer / Transformer-MM-Explainability

[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-bas...

transformers, transformer, vqa, detr, visualization, explainability, explainable-ai, interpretability, clip
Jupyter Notebook 853
2 years ago
hengyuan-hu / bottom-up-attention-vqa

An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.

vqa, PyTorch
Python 757
1 year ago
Cadene / vqa.pytorch

#Computer Science# Visual Question Answering in PyTorch

vqa, deep learning, resnet, PyTorch, coco, torch
Python 728
6 years ago
jayleicn / ClipBERT

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

PyTorch, video-question-answering, vqa, vision-and-language, cvpr2021
Python 722
2 years ago
jokieleung / awesome-visual-question-answering

#Awesome# A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

Awesome Lists, vqa, multi-modal, multi-modal-learning
662
2 years ago
OpenGVLab / Multi-Modality-Arena

#Large Language Models# Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP...

chat, chatbot, ChatGPT, gradio, large-language-models, large language model, vqa, multi-modality, vision-language-model
Python 529
1 year ago
stanfordnlp / mac-network

Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018)

attention, TensorFlow, question-answering, vqa
Python 501
4 years ago
chingyaoc / awesome-vqa

Visual Q&A reading list

vqa, papers, arxiv
437
7 years ago
vacancy / NSCL-PyTorch-Release

PyTorch implementation for the Neuro-Symbolic Concept Learner (NS-CL).

vqa
Python 427
5 years ago
davidmascharka / tbd-nets

#Computer Science# PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

machine learning, PyTorch, visualization, deep learning, visual-question-answering, vqa, neural-networks
Jupyter Notebook 348
4 years ago
MILVLG / openvqa

#Computer Science# A lightweight, scalable, and general framework for visual question answering research

visual-question-answering, vqa, PyTorch, deep learning, benchmark
Python 323
4 years ago
abachaa / Existing-Medical-QA-Datasets

#Natural Language Processing# Multimodal Question Answering in the Medical Domain: A Summary of Existing Datasets and Systems

qa, question-answering, natural language processing, medical-informatics, vqa, machine vision, dataset, radiology
296
2 years ago