Initialize project; model provided by the ModelHub XC community

Model: Darwin_Project/MUSEG-3B Source: Original Platform
2026-05-19 11:46:36 +08:00
commit 641d0793d9
16 changed files with 152682 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,36 @@
+---
+frameworks:
+- Pytorch
+license: Apache License 2.0
+tasks:
+- video-question-answering
+language:
+  - en
+base_model:
+  - Qwen/Qwen2.5-VL-3B-Instruct
+base_model_relation: finetune
+metrics:
+  - f1
+---
+# MUSEG-3B
+
+[Paper](https://arxiv.org/abs/2505.20715) | [GitHub](https://github.com/THUNLP-MT/MUSEG)
+
+We propose MUSEG 🌟, a novel RL-based method that enhances temporal understanding by introducing timestamp-aware multi-segment grounding. MUSEG enables MLLMs to align queries with multiple relevant video segments, promoting more comprehensive temporal reasoning ⏳. To facilitate effective learning, we design a customized RL training recipe with phased rewards that progressively guides the model toward temporally grounded reasoning. Extensive experiments on temporal grounding and time-sensitive video QA tasks demonstrate that MUSEG significantly outperforms existing methods and generalizes well across diverse temporal understanding scenarios 🚀.
+
+## More Details
+
+Please refer to our [GitHub Repository](https://github.com/THUNLP-MT/MUSEG) for more details about this model.
+
+## Citation
+
+If you find our work helpful for your research, please consider citing it.
+
+```bibtex
+@article{luo2025museg,
+    title={MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding}, 
+    author={Fuwen Luo and Shengfeng Lou and Chi Chen and Ziyue Wang and Chenliang Li and Weizhou Shen and Jiyue Guo and Peng Li and Ming Yan and Ji Zhang and Fei Huang and Yang Liu},
+    journal={arXiv preprint arXiv:2505.20715},
+    year={2025}
+}
+```