Initialize the project; model provided by the ModelHub XC community
Model: cs-552-2026-middle-west/math_model Source: Original Platform
44
README.md
Normal file
@@ -0,0 +1,44 @@
---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-1.7B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-1.7B
---

# CS-552 Middle West Math Model

This checkpoint is the math-specialized CS-552 model for
`cs-552-2026-middle-west/math_model`. It starts from `Qwen/Qwen3-1.7B` and keeps the model weights in
safetensors format at the repository root for vLLM compatibility.

## Intended Evaluation

The course CI evaluates this repository on the math benchmark. Prompts are
rendered with the tokenizer chat template via:

```python
tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```

The template injects a math-focused system prompt when no system message is
provided and asks the model to place its final answer in `\boxed{...}`.
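
To illustrate how a final answer in `\boxed{...}` could be read back out of a completion, here is a minimal sketch. The helper name `extract_boxed` is hypothetical, and the course CI's actual answer parser is not described in this README, so the benchmark may extract answers differently.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    r"""Return the contents of the last \boxed{...} in text, or None.

    Hypothetical helper for illustration; not the course CI's parser.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None  # no boxed answer present

    # Walk forward from the opening brace, tracking nesting depth so that
    # answers such as \frac{1}{2} are returned whole.
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never closed (e.g. generation was truncated)
```

Returning `None` for an unclosed brace is a design choice in this sketch: a completion cut off at the token limit is treated as having no final answer rather than a partial one.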

## Generation

- Thinking mode: enabled in the chat template.
- Temperature: 0.6
- Top-p: 0.95
- Top-k: 20
- Repetition penalty: 1.0
- Max new tokens: 3584
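
The settings above can be gathered in one place and translated into the keyword names used by common inference libraries. The following is a sketch under assumptions: the class `CourseSampling` and its two methods are hypothetical names, and the mapping assumes the usual keyword arguments of Transformers `generate` and the fields of vLLM `SamplingParams`. Thinking mode is controlled by the chat template and is not represented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CourseSampling:
    """Sampling defaults listed in this README (hypothetical container)."""

    temperature: float = 0.6
    top_p: float = 0.95
    top_k: int = 20
    repetition_penalty: float = 1.0
    max_new_tokens: int = 3584

    def hf_generate_kwargs(self) -> dict:
        # Keyword arguments in the style of Transformers `model.generate(...)`.
        # `do_sample=True` is added because temperature/top-p/top-k only
        # take effect when sampling is enabled.
        return {
            "do_sample": True,
            "temperature": self.temperature,
            "top_p": self.top_p,
            "top_k": self.top_k,
            "repetition_penalty": self.repetition_penalty,
            "max_new_tokens": self.max_new_tokens,
        }

    def vllm_sampling_kwargs(self) -> dict:
        # Keyword arguments in the style of vLLM `SamplingParams(...)`,
        # where the length limit is called `max_tokens`.
        return {
            "temperature": self.temperature,
            "top_p": self.top_p,
            "top_k": self.top_k,
            "repetition_penalty": self.repetition_penalty,
            "max_tokens": self.max_new_tokens,
        }
```

With the libraries installed, these dictionaries could be unpacked as `model.generate(**inputs, **CourseSampling().hf_generate_kwargs())` or `SamplingParams(**CourseSampling().vllm_sampling_kwargs())`.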

## Files

- `model.safetensors`: model weights
- `config.json`: model configuration
- `generation_config.json`: course sampling defaults
- `tokenizer_config.json`, `tokenizer.json`, `vocab.json`, `merges.txt`:
  tokenizer assets
- `chat_template.jinja`: math prompt and Qwen3 thinking-mode chat template
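
Before pushing or evaluating a local copy, the file list above can be checked with a small script. This is only a sketch: the function name `missing_files` and the idea of a pre-submission check are assumptions, not part of the course tooling.

```python
from pathlib import Path
from typing import List

# File names taken from the list in this README.
EXPECTED_FILES = (
    "model.safetensors",
    "config.json",
    "generation_config.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
    "chat_template.jinja",
)


def missing_files(repo_dir: str) -> List[str]:
    """Return the expected files that are absent from the repository root."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the current directory and report anything that is missing.
    missing = missing_files(".")
    print("missing:", missing if missing else "none")
```

Only the repository root is inspected, matching the statement that the weights are kept at the root for vLLM compatibility.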