Model: xiaodongguaAIGC/xdg-math-step
---
license: mit
datasets:
- xiaodongguaAIGC/step_sft
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- o1
- reasoning
- step
- search
base_model:
- meta-llama/Llama-3.1-8B
---
# 小冬瓜AIGC Step-Wise Math Reasoning
Test: [Colab](https://colab.research.google.com/drive/17-eEER7sV7xJ66pKiDIB9gEuS8phXeX3?usp=sharing)
## Result
For testing, you can run rejection sampling over multiple generations; the final answer is output in `\boxed{}` format.
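As a rough illustration of rejection sampling over several generations, the sketch below draws `n` completions and keeps only those that pass an acceptance check. The README does not state the acceptance criterion, so the default here (the completion contains a `\boxed{}` answer) is an assumption, as are the helper names `extract_boxed`, `rejection_sample`, and `majority_answer`. The `generate` argument is a hypothetical callable standing in for whatever sampling function you use.

```python
import re
from collections import Counter
from typing import Callable, List, Optional

# Matches a boxed answer without nested braces, e.g. \boxed{10}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last boxed answer in text, or None."""
    matches = BOXED.findall(text)
    return matches[-1].strip() if matches else None


def has_boxed(text: str) -> bool:
    """Assumed default acceptance check: the completion has a boxed answer."""
    return extract_boxed(text) is not None


def rejection_sample(
    generate: Callable[[str], str],
    prompt: str,
    n: int = 8,
    accept: Callable[[str], bool] = has_boxed,
) -> List[str]:
    """Draw n completions and keep only those that pass `accept`."""
    kept = []
    for _ in range(n):
        completion = generate(prompt)
        if accept(completion):
            kept.append(completion)
    return kept


def majority_answer(completions: List[str]) -> Optional[str]:
    """Return the most common boxed answer among completions, or None."""
    answers = [extract_boxed(c) for c in completions]
    answers = [a for a in answers if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None
```

Passing a stricter `accept` (for example, comparing the boxed answer against a known reference) turns the same loop into a correctness filter rather than a format filter.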
```python
prompt = 'Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?'
step_generation(prompt, 128)
```
Output:
```text
<|begin_of_text|>###System: You are MA-RLHF Chatbot, you should friendly answer the question
###Question:Solve this math problem using step-by-step reasoning. Require that the output of each step ends with the " [SEP]
" token.
Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?
###Answer: At first, Tom has 12 apples. [SEP]
He gives 3 apples to each of his 4 friends, so he gives him 4 * 3 = 12 apples. [SEP]
After that, Tom has 12 - 12 = 0 apples left. [SEP]
He buys 10 more apples, so he has 0 + 10 = 10 apples now. [SEP]
Tom has 10 apples now. [SEP]
Answer: 10 [SEP]
I agree. [SEP]
A possible answer.
# Answer
10 [SEP]
<|end_of_text|>
```
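The transcript above can be split back into individual reasoning steps by cutting on the `[SEP]` delimiter. The following is a minimal sketch, assuming the format shown in the sample output (`###Answer:` marker, `[SEP]` step delimiter, `<|end_of_text|>` terminator); the helper names `split_steps` and `final_answer` are hypothetical and not part of the model's code.

```python
from typing import List, Optional

SEP = "[SEP]"
ANSWER_MARKER = "###Answer:"
END = "<|end_of_text|>"


def split_steps(transcript: str) -> List[str]:
    """Split the text after the answer marker into non-empty steps."""
    # The prompt itself mentions [SEP], so only look after the answer marker.
    head, marker, tail = transcript.partition(ANSWER_MARKER)
    body = tail if marker else head
    body = body.replace(END, "")
    return [step.strip() for step in body.split(SEP) if step.strip()]


def final_answer(steps: List[str]) -> Optional[str]:
    """Return the value of the last step that starts with 'Answer:'."""
    for step in reversed(steps):
        if step.startswith("Answer:"):
            return step[len("Answer:"):].strip()
    return None
```

On the sample transcript this would treat "Answer: 10" as the answer step and ignore the trailing text the model generated after it.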