captioning

facebookresearch / mmf

#Computer Science# A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

PyTorch, vqa, pretrained-models, multimodal, deep learning, captioning, dialog, textvqa, hateful-memes, multi-tasking
Python 5.58 k
3 months ago
roboflow / maestro

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

captioning, fine-tuning, florence-2, multimodal, objectdetection, paligemma, phi-3-vision, transformers, vision-and-language, vqa, qwen2-vl
Python 2.6 k
3 days ago
fpgaminer / joycaption

JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.

captioning, vlm
Python 725
6 days ago
ltguo19 / VSUA-Captioning

#Natural Language Processing# Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019

captioning, language-generation, deep learning, PyTorch, natural language processing
Python 257
6 years ago
DavidHuji / CapDec

CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)

captioning, clip, gpt-2, multimodal-deep-learning, zero-shot-learning
Python 198
2 years ago
Labbeti / aac-datasets

#Data Warehouse# Audio Captioning datasets for PyTorch.

PyTorch, audio, caption, datasets, captioning, dataset, deep learning
Python 121
13 days ago
HaydenFaulkner / Tennis

#Computer Science# A Tennis dataset and models for event detection & commentary generation

machine learning, computer vision, dataset, fine-grained, captioning, Video, mxnet, gluon
Python 100
1 month ago
mitvis / vistext

VisText is a benchmark dataset for semantically rich chart captioning.

captioning, charts, dataset, t5
Jupyter Notebook 94
2 years ago
drethage / fully-convolutional-point-network

#Computer Science# Fully-Convolutional Point Networks for Large-Scale Point Clouds

computer vision, 3D, semantic-segmentation, deep learning, deep neural networks, captioning, Point cloud, meshes
Python 85
6 years ago
Mauville / MedCLIP

#Computer Science# Medical image captioning using OpenAI's CLIP

deep learning, clip, captioning, machine learning, Medical imaging
Jupyter Notebook 83
2 years ago
audio-captioning / clotho-dataset

#Natural Language Processing# Python code for handling the Clotho dataset.

audio, deep learning, natural language processing, captioning
Python 81
5 years ago
wangleihitcs / MedicalReportGeneration

A Base Tensorflow Project for Medical Report Generation

Tensorflow, captioning
Python 70
6 years ago
ParitoshParmar / MTL-AQA

What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]

multitask-learning, video-understanding, video-processing, video-captioning, PyTorch, action-recognition, representation-learning, lstm, captioning
Python 68
3 months ago
aimagelab / pacscore

[CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation

captioning, captioning-videos, computer vision, cvpr, cvpr2023, vision-and-language
Python 62
2 days ago
42lux / CaptainCaption

A Gradio-based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.

captioning, gpt-4-vision, gradio, openai-api, tagging
Python 60
8 months ago
TheShadow29 / VidSitu

#Natural Language Processing# [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

vision, vision-and-language, grounding, natural language processing, Video, srl, captioning-videos, captioning
Python 60
4 years ago
Labbeti / aac-metrics

Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.

audio, captioning, monitoring, text
Python 54
12 days ago
lucidrains / AoA-pytorch

A PyTorch implementation of the Attention on Attention module (both self and guided variants), for Visual Question Answering

attention, attention-mechanism, vqa, visual-question-answering, captioning
Python 43
5 years ago
DavidMChan / caption-by-committee

#Large Language Models# Using LLMs and pre-trained caption models for super-human performance on image captioning.

artificial intelligence, captioning, ChatGPT, deep learning, Image, machine learning, Python
Python 42
2 years ago
audio-captioning / dcase-2020-baseline

#Computer Science# Audio captioning baseline system for DCASE 2020 challenge.

captioning, deep learning, deep neural networks, machine learning, signal-processing
Python 38
2 years ago