Model: PKU-Alignment/s1-m_7b_beta Source: Original Platform
| language | license | pipeline_tag | tags | base_model |
|---|---|---|---|---|
| | cc-by-nc-4.0 | image-text-to-text | | |
S1-M-7B-Beta
🏠 Homepage | 👍 Our Official Code Repo | 🤗 S1-M Dataset (Beta)
S1-M-7B-Beta is used for developing the algorithm "Simple Test-time Scaling in Multimodal Reasoning". It was obtained by fine-tuning the base model Qwen/Qwen2-VL-7B-Instruct on data containing the thinking tags <think> and </think>. Through this fine-tuning the model acquired a "think first, then respond" paradigm, which makes it suitable for experiments on test-time scaling.
Note: The current model is a development version, not the final official version.
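Because the model is trained to reason inside <think> and </think> before answering, downstream code usually needs to separate the reasoning trace from the final response. The following is a minimal sketch of that step, not part of the official code repo. It assumes the decoded output contains at most one <think>...</think> block followed by the answer; the names `ParsedOutput` and `split_thinking` are hypothetical helpers introduced here for illustration.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    """Reasoning trace and final answer extracted from one generation."""
    thinking: Optional[str]  # None when no <think> block is present
    response: str


# Non-greedy match so only the first <think>...</think> block is captured;
# DOTALL lets the reasoning span multiple lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> ParsedOutput:
    """Split decoded model output into (thinking, response).

    Assumption: the output looks like '<think>...</think> answer'.
    If no complete block is found, the whole text is treated as the response.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return ParsedOutput(thinking=None, response=text.strip())
    thinking = match.group(1).strip()
    response = text[match.end():].strip()
    return ParsedOutput(thinking=thinking, response=response)
```

For example, `split_thinking("<think>2+2 is 4</think>The answer is 4.")` yields a `thinking` field of `"2+2 is 4"` and a `response` field of `"The answer is 4."`. Keeping the trace separate makes it straightforward to measure how long the model reasoned, which is the quantity varied in test-time scaling experiments.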
Description