---
license: apache-2.0
base_model: mistralai/Mistral-7B-v0.1
datasets:
- abacusai/MetaMathFewshot
- shahules786/orca-chat
- anon8231489123/ShareGPT_Vicuna_unfiltered
---

This model was trained on our [MetaMathFewshot](https://huggingface.co/datasets/abacusai/MetaMathFewshot) dataset, as well as the [Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered) and [OrcaChat](https://huggingface.co/datasets/shahules786/orca-chat) datasets.

It has been finetuned from the base [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) model.
# Usage
This model uses a specific prompt format, which is encoded as a [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating). To apply it, use the `tokenizer.apply_chat_template()` method of the attached tokenizer:

```python
messages = [
    {"role": "user", "content": "What is the capital of Spain?"},
    {"role": "assistant", "content": "The capital of Spain is Madrid."}
]
# Render the conversation with the chat template and tokenize it.
# With return_tensors="pt" this returns a tensor of input ids,
# which generate() accepts positionally.
gen_input = tokenizer.apply_chat_template(messages, return_tensors="pt")
model.generate(gen_input)
```
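
For an end-to-end generation pass, a minimal sketch is shown below. The repo id, dtype, and `max_new_tokens` are illustrative assumptions, not values taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this model's Hub repo id>"  # placeholder, not a real id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the capital of Spain?"}]
# add_generation_prompt=True appends the template's assistant prefix,
# so the model writes a reply instead of continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```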
# Evaluation Results
### HuggingFace Leaderboard
| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| 67.33 | 59.64 | 81.82 | 61.69 | 53.23 | 78.45 | 69.14 |

For comparison, the original `metamath/MetaMath-Mistral-7B` scored 68.84 on GSM8K, with an average score of 65.78.
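
As a quick sanity check, the Average column is the arithmetic mean of the six benchmark scores:

```python
# Leaderboard scores from the table above.
scores = {"ARC": 59.64, "HellaSwag": 81.82, "MMLU": 61.69,
          "TruthfulQA": 53.23, "Winogrande": 78.45, "GSM8K": 69.14}
print(round(sum(scores.values()) / len(scores), 2))  # 67.33
```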
### MT-Bench
| Turn 1 | Turn 2 | Average |
| --- | --- | --- |
| 6.90 | 6.52 | 6.71 |

# Training Details
Instruction tuned with the following parameters (a `peft`-based sketch of this configuration follows the list):
- LoRA: rank 8, alpha 16, dropout 0.05, applied to all modules (QKV and MLP)
- 3 epochs
- Micro batch size 32 across 4×H100 GPUs, gradient accumulation steps = 1
- AdamW with learning rate 5e-5
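
A minimal sketch of the configuration above using the `peft` library. The `target_modules` names are an assumption, mapping "all modules (QKV and MLP)" onto the Mistral architecture; the card does not specify the exact training stack:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Rank, alpha, and dropout from the list above. target_modules is an
# assumption covering Mistral's attention and MLP projections; the card
# only says "all modules (QKV and MLP)".
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With a micro batch size of 32 on 4 GPUs and no gradient accumulation, the effective global batch size works out to 32 × 4 = 128 sequences per step.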
# Bias, Risks, and Limitations
The model has not been evaluated for safety and is intended only for research and experimentation.