---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-1.7B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-1.7B
---
# CS-552 Middle West Math Model
This checkpoint is the math-specialized CS-552 model for
`cs-552-2026-middle-west/math_model`. It starts from `Qwen/Qwen3-1.7B` and keeps the model weights in
safetensors format at the repository root for vLLM compatibility.
## Intended Evaluation
The course CI evaluates this repository on the math benchmark. Prompts are
rendered with the tokenizer's chat template:
```python
tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```
The template injects a math-focused system prompt when no system message is
provided and asks the model to place its final answer in `\boxed{...}`.
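Because the template asks for the final answer in `\boxed{...}`, a downstream script will typically need to pull that answer out of the generated text. The following is a minimal sketch of one way to do so; the function name `extract_boxed` is hypothetical, and the course CI's actual answer extraction is not described here and may differ. It assumes the last `\boxed{...}` in the output is the final answer and matches braces so that nested LaTeX such as `\frac{1}{2}` survives.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None.

    Hypothetical helper for illustration; not part of this repository.
    """
    marker = r"\boxed{"
    start = text.rfind(marker)  # assume the last boxed span is the final answer
    if start == -1:
        return None

    depth = 1  # we are already inside the opening brace of \boxed{
    contents = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(contents)
        contents.append(ch)
    return None  # braces never closed, e.g. output truncated by the token limit
```

Returning `None` for an unclosed brace lets a caller tell a truncated generation apart from a complete one.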
## Generation
- Thinking mode: enabled in the chat template.
- Temperature: 0.6
- Top-p: 0.95
- Top-k: 20
- Repetition penalty: 1.0
- Max new tokens: 3584
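For local runs, the settings above can be collected in one place. The sketch below is illustrative only: the dictionary keys follow the Hugging Face `generate` naming and are an assumption about how these values are stored, and `to_vllm_sampling_kwargs` is a hypothetical helper. It renames `max_new_tokens` to `max_tokens`, the name that vLLM's `SamplingParams` uses for the same limit.

```python
# Values copied from the list above; key names are assumed (Hugging Face style).
GENERATION_SETTINGS = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.0,
    "max_new_tokens": 3584,
}


def to_vllm_sampling_kwargs(settings: dict) -> dict:
    """Return a copy of settings with max_new_tokens renamed to max_tokens."""
    kwargs = dict(settings)  # copy so the original mapping is left untouched
    kwargs["max_tokens"] = kwargs.pop("max_new_tokens")
    return kwargs
```

With vLLM installed, the result can be passed as `SamplingParams(**to_vllm_sampling_kwargs(GENERATION_SETTINGS))`.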
## Files
- `model.safetensors`: model weights
- `config.json`: model configuration
- `generation_config.json`: course sampling defaults
- `tokenizer_config.json`, `tokenizer.json`, `vocab.json`, `merges.txt`:
  tokenizer assets
- `chat_template.jinja`: math prompt and Qwen3 thinking-mode chat template
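
Before pushing a checkpoint, it can help to confirm that every file listed above sits at the repository root. The sketch below uses only the standard library; `missing_files` is a hypothetical helper, and it checks for file presence only, not for valid contents.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "generation_config.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
    "chat_template.jinja",
]


def missing_files(repo_dir: str) -> list:
    """Return the expected file names that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_files(".")`, run at the repository root, means all listed files are present.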