Video Understanding Survey

CVPR 2025

CVPR 2025 Accepted Papers

CVPR25 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Video-MME

https://github.com/BradyFU/Video-MME?tab=readme-ov-file

1. Awesome-LLMs-for-Video-Understanding

https://github.com/yunlong10/Awesome-LLMs-for-Video-Understanding

https://arxiv.org/pdf/2312.17432v4 (July 2024 revision)

From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding

https://arxiv.org/pdf/2409.18938

https://github.com/Vincent-ZHQ/LV-LLMs

Papers with Code: Video Understanding

https://paperswithcode.com/task/video-understanding/latest

Awesome-Multimodal-Large-Language-Models

https://github.com/yfzhang114/Awesome-Multimodal-Large-Language-Models?tab=readme-ov-file

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

https://arxiv.org/pdf/2411.15296

A 10,000-character long read summarizing the latest progress in multimodal LLM evaluation - article by yearn - Zhihu

https://zhuanlan.zhihu.com/p/16815782175

Another Awesome-Multimodal-Large-Language-Models

https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models?tab=readme-ov-file

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

What are good research directions in multimodal learning? - answer by 梦想成真 - Zhihu

https://www.zhihu.com/question/332876504/answer/130142183129