--- license: apache-2.0 language: - en tags: - long-video - video-understanding - video-qa - agent - qwen2.5 - transformers - longtvqa pipeline_tag: text-generation base_model: Qwen/Qwen2.5-7B-Instruct library_name: transformers --- # LongVideoAgent Qwen2.5-7B This repository hosts the released LLM checkpoint for **LongVideoAgent**, a multi-agent framework for long-video question answering. This model is a **Qwen2.5-7B-based checkpoint** used in the LongVideoAgent project. Project links: - Project page: https://longvideoagent.github.io/ - Code: https://github.com/longvideoagent/LongVideoAgent - Paper: https://arxiv.org/abs/2512.20618 ## Overview LongVideoAgent decomposes long-video reasoning into multiple roles: - `MasterAgent` for planning and answer generation - `GroundingAgent` for subtitle-based temporal grounding - `VisionAgent` for local visual evidence extraction This checkpoint is intended for use with the official LongVideoAgent codebase and evaluation pipeline. ## LongTVQA+ Performance Compared with the Qwen2.5-7B-Instruct baseline, this checkpoint improves LongTVQA+ accuracy by 6.66 percentage points. | Model | LongTVQA+ Acc | Delta | | --- | ---: | ---: | | Qwen2.5-7B-Instruct | 57.33% | - | | LongVideoAgent Qwen2.5-7B | 64.00% | +6.66% | ## Intended Use Use this model for: - Research on long-video question answering - Reproducing LongVideoAgent experiments - Studying agentic reasoning over long videos This checkpoint is **not** a general-purpose video model by itself. For inference and evaluation, please use the official repository: - https://github.com/longvideoagent/LongVideoAgent ## Usage Please follow the setup and inference instructions in the official repository and project documentation: - https://github.com/longvideoagent/LongVideoAgent - https://longvideoagent.github.io/ If you use this checkpoint in your work, please also cite the LongVideoAgent paper below. ## Citation ```bibtex @misc{liu2025longvideoagentmultiagentreasoninglong, title={LongVideoAgent: Multi-Agent Reasoning with Long Videos}, author={Runtao Liu and Ziyi Liu and Jiaqi Tang and Yue Ma and Renjie Pi and Jipeng Zhang and Qifeng Chen}, year={2025}, eprint={2512.20618}, archivePrefix={arXiv}, primaryClass={cs.AI}, url={https://arxiv.org/abs/2512.20618}, } ``` ## Acknowledgement This checkpoint is built for the LongVideoAgent project and is based on **Qwen2.5-7B-Instruct**.