初始化项目,由ModelHub XC社区提供模型
Model: ngxson/MiniThinky-1.7B-SmolLM2 Source: Original Platform
This commit is contained in:
35
README.md
Normal file
35
README.md
Normal file
@@ -0,0 +1,35 @@
|
||||
---
|
||||
library_name: transformers
|
||||
tags:
|
||||
- trl
|
||||
- sft
|
||||
base_model:
|
||||
- HuggingFaceTB/SmolLM2-1.7B-Instruct
|
||||
datasets:
|
||||
- ngxson/MiniThinky-dataset
|
||||
---
|
||||
|
||||
# MiniThinky 1.7B (based on SmolLM2)
|
||||
|
||||
> [!IMPORTANT]
|
||||
> This checkpoint still have a high loss value, so the model will hallucinate the response quite a lot.
|
||||
|
||||
My first trial to fine tune a small model to add reasoning capability.
|
||||
|
||||
Chat template is the same with llama 3, but the response will be as follow:
|
||||
|
||||
```
|
||||
<|thinking|>{thinking_process}
|
||||
<|answer|>
|
||||
{real_answer}
|
||||
```
|
||||
|
||||
## IMPORTANT: System message
|
||||
|
||||
The model is **very sensitive** to system message. Make sure you're using this system message (system role) at the beginning of the conversation:
|
||||
|
||||
`You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.`
|
||||
|
||||
---
|
||||
|
||||
TODO: include more info here + maybe do some benchmarks? (Plz add a discussion if you're interested)
|
||||
Reference in New Issue
Block a user