#Computer Science# OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
#Awesome# A curated list of resources on action recognition and related areas
#Large Language Models# [CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
An open-source toolbox for action understanding based on PyTorch
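Built on a modular design, it provides a rich set of video algorithm implementations along with industrial-grade video algorithm optimization and applications, including action localization and recognition, behavior analysis, smart cover selection, video annotation, and video tagging for industries such as security, sports, internet, and media. It covers techniques such as action recognition and video classification, action localization, action detection, and multimodal text-video retrieval.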
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
SALMONN family: A suite of advanced multi-modal LLMs
#Natural Language Processing# awesome grounding: A curated list of research papers in visual grounding
#Computer Science# Temporal Segment Networks (TSN) in PyTorch
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Temporal action detection with SSN
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
🔥 🔥 🔥 A paper list of some recent Computer Vision (CV) works
#Android# A lightning-fast, cross-platform AI chat application built with React Native.
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions