This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
#Computer Science# CVNets: A library for training computer vision networks
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
This is an official implementation for "Contextual Transformer Networks for Visual Recognition".
This repository contains the source code of our work on designing efficient CNNs for computer vision
VarifocalNet: An IoU-aware Dense Object Detector
#Computer Science# The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image ...
Official ImageNet Model repository
SWA Object Detection
#Computer Science# Video Platform for Action Recognition and Object Detection in PyTorch
[ECCV 2020] Boundary-preserving Mask R-CNN
High-resolution Networks for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm
#Natural Language Processing# Generate captions for images using a CNN-RNN model trained on the Microsoft Common Objects in COntext (MS COCO) dataset
A TensorFlow implementation of MobileNetV3 CenterNet that can be easily deployed on Android (MNN) and iOS (CoreML).
#Data Warehouse# A repository and interchange format for weed identification annotations
A tool for converting computer vision label formats.
Adds the SPICE metric to the coco-caption evaluation server code
Implementation of the models in our EMNLP 2019 paper: A Logic-Driven Framework for Consistency of Neural Models