s1-m_7b_beta/README.md

---
language:
- en
license: cc-by-nc-4.0
pipeline_tag: image-text-to-text
tags:
  - multimodal
base_model:
  - Qwen/Qwen2-VL-7B-Instruct
---
# S1-M-7B-Beta

[🏠 Homepage](https://github.com/PKU-Alignment/s1-m) | [👍 Our Official Code Repo](https://github.com/PKU-Alignment/s1-m) | [🤗 S1-M Dataset (Beta)](https://huggingface.co/datasets/PKU-Alignment/s1-m_beta) 

S1-M-7B-Beta used for developing the algorithm "Simple Test-time Scaling in Multimodal Reasoning". By fine-tuning the base model `Qwen/Qwen2-VL-7B-Instruct` on data with thinking tags `<think>` and `</think>`, the model acquired the `think first, then response` paradigm, allowing for experiments on "Test-time Scaling".

**Note: The current model is a development version, not the final official version.**
初始化项目，由ModelHub XC社区提供模型 Model: PKU-Alignment/s1-m_7b_beta Source: Original Platform 2026-05-27 07:48:13 +08:00			`---`
			`language:`
			`- en`
			`license: cc-by-nc-4.0`
			`pipeline_tag: image-text-to-text`
			`tags:`
			`- multimodal`
			`base_model:`
			`- Qwen/Qwen2-VL-7B-Instruct`
			`---`
			`# S1-M-7B-Beta`

			`[🏠 Homepage](https://github.com/PKU-Alignment/s1-m) \| [👍 Our Official Code Repo](https://github.com/PKU-Alignment/s1-m) \| [🤗 S1-M Dataset (Beta)](https://huggingface.co/datasets/PKU-Alignment/s1-m_beta)`

			S1-M-7B-Beta used for developing the algorithm "Simple Test-time Scaling in Multimodal Reasoning". By fine-tuning the base model `Qwen/Qwen2-VL-7B-Instruct` on data with thinking tags `<think>` and `</think>`, the model acquired the `think first, then response` paradigm, allowing for experiments on "Test-time Scaling".

			`Note: The current model is a development version, not the final official version.`