#Computer Science#OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
#Awesome#A curated list of action recognition and related area resources
#Large Language Models#[CVPR 2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
An open-source toolbox for action understanding based on PyTorch
Built on a modular design, it provides rich video algorithm implementations and industrial-grade video algorithm optimization and applications, including action localization and recognition, behavior analysis, smart thumbnails, video annotation, and video tagging for industries such as security, sports, internet, and media; it covers techniques including action recognition and video classification, action localization, action detection, and multimodal text-video retrieval.
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
SALMONN family: A suite of advanced multi-modal LLMs
#Natural Language Processing#awesome grounding: A curated list of research papers in visual grounding
#Computer Science#Temporal Segment Networks (TSN) in PyTorch
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Temporal action detection with SSN
🔥 A paper list of recent Computer Vision (CV) works
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
#Data Warehouse#A collection of recent video understanding datasets, under construction!