vit · GitHub Topics

#Computer Science# pix2tex: Using a ViT to convert images of equations into LaTeX code.

machine-learning transformer im2latex deep-learning image2text LaTeX dataset PyTorch im2markup OCR latex-ocr vit math-ocr vision-transformer image-processing Python im2text

Python 15.08 k

6 months ago

cmhungsteve / Awesome-Transformer-Attention

#Awesome# An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

transformer attention-mechanism vision-transformer deep-learning Awesome Lists transformer-cv transformer-architecture transformer-awesome transformer-with-cv transformer-models visual-transformer computer-vision papers attention-mechanisms self-attention vit detr transformers

4.91 k

1 year ago

towhee-io / towhee

#Large Language Models# Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

machine-learning convolutional-networks embedding-vectors embeddings computer-vision image-processing video-processing feature-extraction image-retrieval unstructured-data feature-vector transformer milvus vision-transformer vit pipeline LLM

Python 3.39 k

9 months ago

open-compass / VLMEvalKit

#Large Language Models# Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

gpt-4v large-language-models llava multi-modal openai vqa LLM openai-api qwen gpt computer-vision PyTorch gpt4 ChatGPT clip vit evaluation claude gemini

Python 2.82 k

1 day ago

thu-ml / SageAttention

#Large Language Models# Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

attention LLM quantization CUDA triton video-generation mlsys vit

Cuda 2.11 k

10 days ago

hila-chefer / Transformer-Explainability

#Computer Science# [CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.

deep-learning vision-transformer bert-model bert explainability vit cvpr2021

Jupyter Notebook 1.91 k

2 years ago

roboflow / inference

#Computer Science# Turn any computer or edge device into a command center for your computer vision projects.

computer-vision inference-api inference-server vit yolov5 yolov8 jetson tensorrt classification instance-segmentation object-detection onnx deployment Docker inference machine-learning Python yolo11 agents

Python 1.82 k

13 hours ago

Yangzhangcst / Transformer-in-Computer-Vision

#Computer Science# A paper list of some recent Transformer-based CV works.

transformer transformer-cv transformer-awesome detr vit Awesome Lists computer-vision deep-learning papers

1.34 k

10 hours ago

BR-IDL / PaddleViT

#Computer Science# PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

cv computer-vision paddlepaddle vit mlp transformer encoder-decoder classification detection segmentation Generative Adversarial Network deep-learning semantic-segmentation object-detection

Python 1.24 k

3 years ago

yitu-opensource / T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

vision-transformer vit

Jupyter Notebook 1.19 k

2 years ago

sail-sg / Adan

#Computer Science# Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

bert-model convnext deep-learning fairseq optimizer resnet timm vit transformer-xl artificial-intelligence diffusion dreamfusion gpt2 PyTorch cuda-programming llm-training LLM moe

Python 797

2 months ago

thu-ml / SpargeAttn

#Large Language Models# SpargeAttention: A training-free sparse attention that can accelerate any model inference.

ai-infra attention LLM mlsys quantization vision-transformer video-generation vit

Cuda 664

1 day ago

v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

PyTorch feature-extraction parallel audio-features i3d resnet raft optical-flow clip timm vit

Python 609

6 months ago

chinhsuanwu / mobilevit-pytorch

A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

vit mobilenetv2 vision-transformer

Python 538

4 years ago

zgcr / SimpleAICV_pytorch_training_examples

SimpleAICV: PyTorch training and testing examples.

PyTorch resnet vit van detr fcos retinanet deeplabv3plus solov2 yolact dbnet sam segment-anything

Jupyter Notebook 433

22 days ago

eeyhsong / EEG-Transformer

#Computer Science# i. A practical application of Transformer (ViT) to 2-D physiological signal (EEG) classification tasks. It could also be tried with EMG, EOG, ECG, etc. ii. Including the attention of spatial dimension (c...

deep-learning attention-mechanism vit transformer attention EEG eeg-classification

Python 303

2 years ago

gupta-abhay / pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

image-recognition transformers image-classification vit vision-transformer

Python 297

4 years ago

vatz88 / FFCSonTheGo

FFCS course registration made hassle free for VITians. Search courses and visualize the timetable on the go!

vit vellore ffcs timetable Hacktoberfest JavaScript

JavaScript 296

1 month ago

PaddlePaddle / PASSL

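#Computer Science# PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.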

deep-learning moco simclr clip self-supervised-learning paddle swin-transformer vision-transformer beit convnext vit deit pvt swav

Python 285

2 years ago

tue-mps / eomt

[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).

image-segmentation instance-segmentation panoptic-segmentation segmentation transformers vision-transformer vit

Jupyter Notebook 280

2 days ago