
This repository has been indexed but not yet edited. For the project introduction and usage tutorial, see the README on GitHub.



About

[TMM 2023] VideoXum: Cross-modal Visual and Textural Summarization of Videos

Created
Made in China

  Last modified

2024-04-09T06:47:09Z


Language

  • Python 100.0%
