Initialize project; model provided by the ModelHub XC community

Model: xiaodongguaAIGC/xdg-math-step Source: Original Platform
2026-05-02 04:34:01 +08:00
commit fcdd573a4a
13 changed files with 2550 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,53 @@
+---
+license: mit
+datasets:
+- xiaodongguaAIGC/step_sft
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- math
+- o1
+- reasoning
+- step
+- search
+base_model:
+- meta-llama/Llama-3.1-8B
+---
+
+
+# 小冬瓜AIGC: Step-Wise Math Reasoning
+
+Test: [Colab](https://colab.research.google.com/drive/17-eEER7sV7xJ66pKiDIB9gEuS8phXeX3?usp=sharing)
+
+## Result
+
+At test time you can run rejection sampling multiple times; the final answer is output in `\boxed{}` format.
+
+```python
+prompt = 'Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?'
+step_generation(prompt, 128)
+```
+
+Result:
+
+```text
+<|begin_of_text|>###System: You are MA-RLHF Chatbot, you should friendly answer the question
+###Question:Solve this math problem using step-by-step reasoning. Require that the output of each step ends with the " [SEP]
+" token.
+Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?
+###Answer: At first, Tom has 12 apples. [SEP]
+He gives 3 apples to each of his 4 friends, so he gives him 4 * 3 = 12 apples. [SEP]
+After that, Tom has 12 - 12 = 0 apples left. [SEP]
+He buys 10 more apples, so he has 0 + 10 = 10 apples now. [SEP]
+Tom has 10 apples now. [SEP]
+Answer: 10 [SEP]
+I agree. [SEP]
+A possible answer.
+
+# Answer
+
+10 [SEP]
+<|end_of_text|>
+```
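
The README describes `[SEP]`-delimited steps, a `\boxed{}` final answer, and repeated rejection sampling. The sketch below shows one way these pieces might fit together. It is an illustration under assumptions, not the author's implementation: `split_steps`, `extract_boxed`, and `rejection_sample` are hypothetical helper names, the `generate` callable stands in for whatever produces model text (the return value of `step_generation` is not documented here), and accepting a sample when its boxed answer equals a known reference is only one possible acceptance rule.

```python
import re
from typing import Callable, List, Optional

SEP = "[SEP]"  # step delimiter shown in the sample output


def split_steps(answer_text: str) -> List[str]:
    # Split generated text into reasoning steps, dropping empty pieces.
    return [s.strip() for s in answer_text.split(SEP) if s.strip()]


def extract_boxed(text: str) -> Optional[str]:
    # Return the content of the last \boxed{...} (no nested braces), or None.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def rejection_sample(
    generate: Callable[[str], str],  # hypothetical: prompt -> generated text
    prompt: str,
    reference: str,
    n: int = 8,
) -> List[str]:
    # Draw n samples; keep those whose boxed answer matches the reference.
    kept = []
    for _ in range(n):
        out = generate(prompt)
        if extract_boxed(out) == reference:
            kept.append(out)
    return kept


# Canned outputs stand in for real model samples (assumption for the demo).
canned = iter([
    r"Tom gives away 4 * 3 = 12 apples. [SEP] He has 0 + 10 = 10 apples. [SEP] \boxed{10} [SEP]",
    r"Tom gives away 3 apples. [SEP] He has 9 + 10 = 19 apples. [SEP] \boxed{19} [SEP]",
])
kept = rejection_sample(lambda p: next(canned), "How many apples?", reference="10", n=2)
for step in split_steps(kept[0]):
    print(step)
```

In practice the stub would be replaced by a sampling call to the model. When no reference answer is available, a different acceptance rule, such as a verifier score, would be needed.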