Initialize project, with the model provided by the ModelHub XC community
Model: SenseNova/SenseNova-SI-1.1-Qwen2.5-VL-3B Source: Original Platform
99
README_CN.md
Normal file
@@ -0,0 +1,99 @@
[EN](README.md) | **中文**

# SenseNova-SI: Exploring the Scaling Effects of Spatial Intelligence in Multimodal Foundation Models

<a href="https://github.com/OpenSenseNova/SenseNova-SI" target="_blank">
<img alt="Code" src="https://img.shields.io/badge/SenseNova_SI-Code-100000?style=flat-square&logo=github&logoColor=white" height="20" />
</a>
<a href="https://arxiv.org/abs/2511.13719" target="_blank">
<img alt="arXiv" src="https://img.shields.io/badge/arXiv-SenseNova_SI-red?logo=arxiv" height="20" />
</a>
<a href="https://github.com/EvolvingLMMs-Lab/EASI" target="_blank">
<img alt="Code" src="https://img.shields.io/badge/EASI-Code-100000?style=flat-square&logo=github&logoColor=white" height="20" />
</a>
<a href="https://easi.lmms-lab.com/leaderboard" target="_blank">
<img alt="Leaderboard" src="https://img.shields.io/badge/%F0%9F%A4%97%20_EASI-Leaderboard-ffc107?color=ffc107&logoColor=white" height="20" />
</a>

## Overview
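
Despite remarkable progress, multimodal foundation models still show clear deficiencies in spatial intelligence.
Building on established multimodal foundations, including visual understanding models (e.g., Qwen3-VL, InternVL3) and unified understanding-and-generation models (e.g., Bagel), this work constructs the **SenseNova-SI family of models** from a scaling perspective.
We systematically curate SenseNova-SI-8M, a dataset of 8 million samples organized under a rigorous taxonomy of spatial capabilities, to cultivate high-performing and robust spatial intelligence.
The models achieve breakthrough performance across multiple spatial intelligence benchmarks: 68.7% on VSI-Bench, 43.3% on MMSI, 85.6% on MindCube, 54.6% on ViewSpatial, and 50.1% on SITE, while maintaining strong general multimodal understanding (e.g., 84.9% on MMBench-En).
We further analyze the impact of data scale, reveal the emergent generalization that comes from training on diverse data, examine the risks of overfitting and language shortcuts, present a preliminary study of spatial chain-of-thought reasoning, and validate the potential for downstream applications.
SenseNova-SI is an ongoing project, and all newly trained multimodal spatial intelligence foundation models will be open-sourced over time to advance research in spatial intelligence.
*SenseNova-SI will later be integrated with larger-scale in-house models.*

## Release Information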

Currently, we build SenseNova-SI on popular open-source foundation models to maximize compatibility with existing research pipelines.
In this release, we present
[**SenseNova-SI-1.2-InternVL3-8B**](https://huggingface.co/sensenova/SenseNova-SI-1.2-InternVL3-8B),
[**SenseNova-SI-1.1-Qwen2.5-VL-3B**](https://huggingface.co/sensenova/SenseNova-SI-1.1-Qwen2.5-VL-3B),
[**SenseNova-SI-1.1-Qwen2.5-VL-7B**](https://huggingface.co/sensenova/SenseNova-SI-1.1-Qwen2.5-VL-7B),
and [**SenseNova-SI-1.1-Qwen3-VL-8B**](https://huggingface.co/sensenova/SenseNova-SI-1.1-Qwen3-VL-8B).
Among these, **SenseNova-SI-1.2-InternVL3-8B** achieves state-of-the-art performance among open-source models of comparable size
on eight recently released spatial intelligence benchmarks (**VSI**, **MMSI**, **MindCube**, **ViewSpatial**, **SITE**, **BLINK**, **3DSRBench**, **EmbSpatial-Bench**).
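
As a minimal sketch of how one of the released checkpoints could be fetched locally, the snippet below maps each model name to the repository ID taken from the Hugging Face links above and downloads it with `snapshot_download` from the `huggingface_hub` package. The names `RELEASED_CHECKPOINTS` and `download_checkpoint` are hypothetical helpers introduced here for illustration; they are not part of the SenseNova-SI codebase, and loading the weights for inference is not covered.

```python
from typing import Optional

# Repository IDs copied from the Hugging Face links in the release list.
RELEASED_CHECKPOINTS = {
    "SenseNova-SI-1.2-InternVL3-8B": "sensenova/SenseNova-SI-1.2-InternVL3-8B",
    "SenseNova-SI-1.1-Qwen2.5-VL-3B": "sensenova/SenseNova-SI-1.1-Qwen2.5-VL-3B",
    "SenseNova-SI-1.1-Qwen2.5-VL-7B": "sensenova/SenseNova-SI-1.1-Qwen2.5-VL-7B",
    "SenseNova-SI-1.1-Qwen3-VL-8B": "sensenova/SenseNova-SI-1.1-Qwen3-VL-8B",
}


def download_checkpoint(name: str, local_dir: Optional[str] = None) -> str:
    """Download one released checkpoint and return its local path.

    Hypothetical helper: assumes `huggingface_hub` is installed and that
    the repository is publicly accessible.
    """
    # Imported lazily so the mapping above is usable without the package.
    from huggingface_hub import snapshot_download

    repo_id = RELEASED_CHECKPOINTS[name]
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example call (requires network access):
# path = download_checkpoint("SenseNova-SI-1.1-Qwen2.5-VL-3B")
```

The import is deferred into the function so that the name-to-repository mapping can be inspected even where `huggingface_hub` is not installed.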

<table>
<thead>
<tr>
<th>Model</th>
<th>VSI</th>
<th>MMSI</th>
<th>MindCube-Tiny</th>
<th>ViewSpatial</th>
<th>SITE</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="6" align="center"><em>Open-source Models (~2B)</em></td>
</tr>
<tr>
<td>InternVL3-2B</td><td>32.9</td><td>26.5</td><td>37.5</td><td>32.5</td><td>30.0</td>
</tr>
<tr>
<td>Qwen2.5-VL-3B-Instruct</td><td>27.0</td><td>28.6</td><td>37.6</td><td>31.9</td><td>33.1</td>
</tr>
<tr>
<td>Qwen3-VL-2B-Instruct</td><td>50.3</td><td>28.9</td><td>34.5</td><td>36.9</td><td>35.6</td>
</tr>
<tr>
<td>MindCube-3B-RawQA-SFT</td><td>17.2</td><td>1.7</td><td>51.7</td><td>24.1</td><td>6.3</td>
</tr>
<tr>
<td>SpatialLadder-3B</td><td>44.8</td><td>27.4</td><td>43.4</td><td>39.8</td><td>27.9</td>
</tr>
<tr>
<td>SpatialMLLM-4B</td><td>46.3</td><td>26.1</td><td>33.4</td><td>34.6</td><td>18.0</td>
</tr>
<tr>
<td>VST-3B-SFT</td><td><strong>57.9</strong></td><td>30.2</td><td>35.9</td><td><strong>52.8</strong></td><td>35.8</td>
</tr>
<tr>
<td>Cambrian-S-3B</td><td>57.3</td><td>25.2</td><td>32.5</td><td>39.0</td><td>28.3</td>
</tr>
<tr>
<td><strong>SenseNova-SI-1.1-Qwen2.5-VL-3B</strong></td>
<td>54.9</td>
<td><strong>30.8</strong></td>
<td><strong>52.6</strong></td>
<td>43.5</td>
<td><strong>37.8</strong></td>
</tr>
<tr>
<td colspan="6" align="center"><em>Proprietary Models</em></td>
</tr>
<tr>
<td>Gemini-2.5-pro-2025-06</td><td>53.5</td><td>38.0</td><td>57.6</td><td>46.0</td><td>57.0</td>
</tr>
<tr>
<td>Grok-4-2025-07-09</td><td>47.9</td><td>37.8</td><td>63.5</td><td>43.2</td><td>47.0</td>
</tr>
<tr>
<td>GPT-5-2025-08-07</td><td>55.0</td><td>41.8</td><td>56.3</td><td>45.5</td><td>61.8</td>
</tr>
</tbody>
</table>
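
To make the table easier to read, the sketch below computes the per-benchmark difference between SenseNova-SI-1.1-Qwen2.5-VL-3B and Qwen2.5-VL-3B-Instruct, using only the scores listed above. Treating Qwen2.5-VL-3B-Instruct as the base model is an assumption inferred from the model name, and `gains` is a hypothetical helper written for this illustration.

```python
BENCHMARKS = ["VSI", "MMSI", "MindCube-Tiny", "ViewSpatial", "SITE"]

# Scores copied from the table rows above.
BASE = [27.0, 28.6, 37.6, 31.9, 33.1]   # Qwen2.5-VL-3B-Instruct (assumed base)
TUNED = [54.9, 30.8, 52.6, 43.5, 37.8]  # SenseNova-SI-1.1-Qwen2.5-VL-3B


def gains(base, tuned):
    """Return the absolute score difference per benchmark, in points."""
    # Round to one decimal to match the table's precision and to hide
    # floating-point noise from the subtraction.
    return {
        name: round(t - b, 1)
        for name, b, t in zip(BENCHMARKS, base, tuned)
    }


for name, delta in gains(BASE, TUNED).items():
    print(f"{name}: {delta:+.1f}")
```

These are differences in absolute points between two table rows, not relative improvements, and they say nothing about statistical significance.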