---
library_name: transformers
tags:
- text-generation-inference
- code
- math
- R1
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
pipeline_tag: text-generation
---

# **TESS-QwenRe-1.5B**

> **TESS-QwenRe-1.5B** is a **chain-of-thought reasoning model** fine-tuned from **DeepSeek-R1-Distill-Qwen-1.5B**, a 1.5B-parameter Qwen model distilled from **DeepSeek R1**. It is designed to tackle **mathematical problems** in **English** and **Chinese**, with an emphasis on **long-context reasoning** and **step-by-step explanations**, which makes it suited to tutoring, competitive exam preparation, and STEM education tools.

## **Key Features**

1. **Chain-of-Thought Math Reasoning**
   Trained to generate intermediate reasoning steps, TESS-QwenRe-1.5B produces solutions whose working can be read and checked, which matters for teaching and for verifying answers.

2. **Bilingual Support (English + Chinese)**
   Supports mathematical problem solving and explanation in **both English and Simplified Chinese**, enabling bilingual learning applications.

3. **Long-Context Problem Solving**
   Tuned for multi-step, long-form math problems such as word problems, extended reasoning chains, and competition-style questions.

4. **Distilled from DeepSeek R1**
   Combines the reasoning capabilities of **DeepSeek R1** with the lightweight **Qwen 1.5B** architecture, delivering strong results in a compact footprint.

5. **Step-by-Step Explanations**
   Lays out solutions in clear, structured steps, in the manner of a human expert, so learners can follow along and build understanding.

## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/TESS-QwenRe-1.5B"

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
messages = [
    {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template, ending at the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Reasoning traces can be long; raise max_new_tokens if the answer is cut off.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
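
The decoded `response` usually contains the reasoning followed by the final answer. DeepSeek-R1-Distill models typically close their reasoning with a `</think>` tag; the sketch below assumes this fine-tune keeps that convention, which the card does not state. `split_reasoning` is a hypothetical helper, not part of Transformers.

```python
from typing import Tuple


def split_reasoning(response: str) -> Tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning ends with a closing </think> tag. The opening
    <think> tag is optional, since some chat templates emit it as part
    of the prompt rather than the generated text.
    """
    marker = "</think>"
    if marker not in response:
        # No closing tag found: return the whole text as the answer.
        # (It may also be a reasoning trace truncated by max_new_tokens.)
        return "", response.strip()
    reasoning, _, answer = response.partition(marker)
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


sample = "<think>\n180 km / 3 h = 60 km/h\n</think>\nThe average speed is 60 km/h."
reasoning, answer = split_reasoning(sample)
print(answer)
```

In a tutoring interface, the reasoning could be shown in a collapsible panel while the answer is displayed directly.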

## **Intended Use**

- **Math Tutoring Assistants**: Ideal for school and exam-level math instruction with detailed explanations.
- **Bilingual EdTech Apps**: Useful in Chinese-English math learning platforms.
- **STEM Reasoning Tasks**: Reasoning support for science, engineering, and logical problem domains.
- **Efficient LLM Deployments**: Well-suited for on-device or browser-based reasoning agents.

## **Limitations**

1. **Specialized Domain**:
   Tuned for math and logic; may be less effective on open-ended or creative tasks.

2. **Compact Model Constraints**:
   As a 1.5B-parameter model, it may underperform larger models on extremely complex or abstract problems.

3. **Inherited Bias**:
   May carry over biases from the models it was distilled and fine-tuned from; outputs should be monitored in sensitive contexts.

4. **Prompt Dependency**:
   Output quality depends on the prompt; accurate, clearly structured problem statements give the best results.
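
One way to keep prompts consistent is to build them from a fixed template. The following is a minimal sketch, assuming the same message format as the Quickstart; `build_messages` and the instruction wording in `INSTRUCTIONS` are hypothetical and not taken from the model's training setup.

```python
from typing import Dict, List

# Hypothetical instruction templates; the wording is an assumption.
INSTRUCTIONS = {
    "en": "Solve the problem step by step and state the final answer on the last line.",
    "zh": "请逐步解答下面的问题，并在最后一行写出最终答案。",
}

# System prompt reused from the Quickstart example.
SYSTEM_PROMPT = (
    "You are a helpful tutor skilled in solving math problems "
    "with step-by-step explanations."
)


def build_messages(problem: str, language: str = "en") -> List[Dict[str, str]]:
    """Return chat messages for a math problem in English ("en") or Chinese ("zh")."""
    if language not in INSTRUCTIONS:
        raise ValueError(f"unsupported language: {language!r}")
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must not be empty")
    user_content = f"{INSTRUCTIONS[language]}\n\n{problem}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages("A train travels 180 km in 3 hours. What is its average speed?")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` in the Quickstart.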