GitHub Chinese Community
#video-language-pretraining

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

large-language-models, video-language-pretraining, vision-language-pretraining, blip2, llama, minigpt4, cross-modal-pretraining, multi-modal-chatgpt
Python 3.05k
1 year ago
bytedance / Shot2Story

A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.

benchmark, dataset, large-language-models, video-language-pretraining, video-question-answering, vision-language, video-captioning, research
Python 147
6 months ago
XLearning-SCU / 2024-ICLR-Norton

Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]

video-language-pretraining
Python 116
1 year ago
bigai-nlco / VideoLLaMB

[ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges

long-context, video-language-pretraining
Python 71
5 months ago
liveseongho / Awesome-Video-Language-Understanding

#Computer Science# A survey on video and language understanding.

deep learning, machine learning, multimodal-deep-learning, video-language-pretraining, dataset, Bukkit
50
2 years ago
SCZwangxiao / RTQ-MM2023

#Computer Science# ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model

deep learning, foundational-models, machine learning, multi-modal, video-language-pretraining, video-understanding, vision-and-language
Python 16
2 years ago
Maddy12 / SSL4VideoSurvey

The official GitHub page for the survey paper "Self-Supervised Learning for Videos: A Survey"

action-recognition, machine vision, pre-training, text-to-video, video-language-pretraining
8
2 years ago