Initialize project; model provided by the ModelHub XC community

Model: andresnowak/Qwen3-0.6B-instruction-finetuned Source: Original Platform
2026-06-21 07:17:18 +08:00
commit 101f110fc5
13 changed files with 151944 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,176 @@
+---
+base_model: unsloth/Qwen3-0.6B-Base
+library_name: transformers
+model_name: Qwen3-0.6B-instruction-finetuned
+tags:
+- generated_from_trainer
+- unsloth
+- trl
+- sft
+licence: license
+datasets:
+- andresnowak/Instruction-finetuning-mixture-mnlp
+language:
+- en
+---
+
+# Model Card for Qwen3-0.6B-instruction-finetuned
+
+This model is a fine-tuned version of [unsloth/Qwen3-0.6B-Base](https://huggingface.co/unsloth/Qwen3-0.6B-Base).
+It has been trained using [TRL](https://github.com/huggingface/trl).
+
+## Quick start
+
+```python
+from transformers import pipeline
+
+question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+generator = pipeline("text-generation", model="andresnowak/Qwen3-0.6B-instruction-finetuned", device="cuda")
+output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+print(output["generated_text"])
+```
+
+## Training procedure
+
+This model was trained with supervised instruction fine-tuning using a language-modelling objective, meaning the loss is computed on both the prompt and the completion.
+Random prompt templates were also applied during training to make the model more robust to variation in how questions are asked, in addition to the dataset already being high quality and containing many such examples.
+This was done because we were not allowed to use chat templates for the evaluation.
+This model probably had two problems during training. One is that we did not filter the dataset down to examples whose combined prompt and completion fit within 2048 (the maximum sequence length we used)
+and instead truncated longer examples. Also, the tokenizer uses left-side padding, as FlashAttention-2 needs this.
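+
+The difference between truncating and filtering, and the effect of left-side padding, can be illustrated with a minimal sketch on plain token-ID lists. The helper names (`truncate`, `filter_by_length`, `left_pad`, `lm_labels`), the `PAD_ID` value, and the tiny `MAX_LEN` are hypothetical and chosen for readability; the actual run used a maximum length of 2048, and its preprocessing code is not shown in this card.
+
+```python
+# Illustrative only: token IDs are plain ints and MAX_LEN is tiny
+# (the card states that 2048 was used in training).
+MAX_LEN = 8
+PAD_ID = 0  # hypothetical padding token ID
+
+
+def truncate(example, max_len=MAX_LEN):
+    """Keep the first max_len tokens of prompt + completion (truncation)."""
+    return (example["prompt"] + example["completion"])[:max_len]
+
+
+def filter_by_length(examples, max_len=MAX_LEN):
+    """Alternative: drop examples whose prompt + completion exceed max_len."""
+    return [
+        ex for ex in examples
+        if len(ex["prompt"]) + len(ex["completion"]) <= max_len
+    ]
+
+
+def left_pad(ids, max_len=MAX_LEN, pad_id=PAD_ID):
+    """Pad on the left so the real tokens sit at the end of the sequence."""
+    ids = ids[:max_len]
+    return [pad_id] * (max_len - len(ids)) + ids
+
+
+def lm_labels(ids, pad_id=PAD_ID, ignore_index=-100):
+    """Language-modelling labels: prompt and completion both contribute to
+    the loss; only padding is ignored. Matching on pad_id is a simplification
+    that assumes no real token shares that ID."""
+    return [ignore_index if t == pad_id else t for t in ids]
+
+
+examples = [
+    {"prompt": [11, 12, 13], "completion": [21, 22]},  # 5 tokens, fits
+    {"prompt": [11, 12, 13, 14, 15, 16], "completion": [21, 22, 23, 24]},  # 10 tokens
+]
+
+truncated = [truncate(ex) for ex in examples]
+kept = filter_by_length(examples)
+padded = left_pad(truncated[0])
+labels = lm_labels(padded)
+```
+
+In the second example, truncation cuts off the end of the completion, so the model would be trained on an answer that stops partway; filtering avoids this at the cost of dropping the example.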
+
+```yaml
+
+environment:
+  seed: 42
+  use_template: True
+
+model:
+  name: Qwen/Qwen3-0.6B-Base
+  hub_model_id: andresnowak/Qwen3-0.6B-instruction-finetuned
+
+dataset:
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: codeAlpaca
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: noRobots
+    size: 0.8
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: openMathGsm8k
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: codeV2
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: flanV2
+    size: 0.8
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: ifData
+    size: 0.8
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: mathAlgebra
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: mathGrade
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: oasst1
+    size: 0.6
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: sciriff
+    size: 0.8
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: tableGpt
+    size: 0.3
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: tirMath
+    size: 0.4
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: wildChat
+    size: 0.7
+  - name: andresnowak/Instruction-finetuning-mixture-mnlp
+    config: mathV5
+    size: 0.2
+
+dataset_evaluation:
+  - name: cais/mmlu
+    config: validation
+    subjects: ["abstract_algebra", "anatomy", "astronomy", "college_biology", "college_chemistry", "college_computer_science", "college_mathematics", "college_physics", "computer_security", "conceptual_physics", "electrical_engineering", "elementary_mathematics", "high_school_biology", "high_school_chemistry", "high_school_computer_science", "high_school_mathematics", "high_school_physics", "high_school_statistics", "machine_learning"]
+
+training:
+  learning_rate: 1e-5
+  per_device_train_batch_size: 16
+  per_device_eval_batch_size: 16
+  gradient_accumulation_steps: 8
+  num_train_epochs: 2
+  weight_decay: 0.00
+  warmup_ratio: 0.03
+  max_grad_norm: 0.5
+  lr_scheduler: "linear"
+```
+
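+As a rough illustration of what the `training` block implies, the sketch below computes the effective batch size per optimizer step and a linear warmup-then-decay learning-rate curve. It assumes that `lr_scheduler: "linear"` means linear warmup followed by linear decay to zero. `NUM_DEVICES` and `TOTAL_STEPS` are hypothetical values, since the card does not state the number of GPUs or the number of training steps.
+
+```python
+import math
+
+# Values taken from the training config above.
+LEARNING_RATE = 1e-5
+PER_DEVICE_BATCH = 16
+GRAD_ACCUM_STEPS = 8
+WARMUP_RATIO = 0.03
+
+# Assumptions, not stated in the card.
+NUM_DEVICES = 1
+TOTAL_STEPS = 1000
+
+# Sequences contributing to each optimizer update.
+effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
+warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)
+
+
+def linear_lr(step):
+    """Linear warmup from 0 to LEARNING_RATE, then linear decay to 0."""
+    if step < warmup_steps:
+        return LEARNING_RATE * step / max(1, warmup_steps)
+    remaining = max(0, TOTAL_STEPS - step)
+    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)
+
+
+print(effective_batch, warmup_steps, linear_lr(warmup_steps))
+```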
+
+This model was trained with SFT.
+
+## Evaluation results
+
+The performance is as follows:
+
+| Benchmark          | Accuracy (Acc) | Normalized Accuracy (Acc Norm) |
+| :----------------- | :------------- | :----------------------------- |
+| ARC Challenge      | 46.0%          | 45.3%                          |
+| ARC Easy           | 59.3%          | 54.2%                          |
+| GPQA               | 29.9%          | 27.0%                          |
+| Math QA            | 24.0%          | 24.8%                          |
+| MCQA Evals         | 37.9%          | 34.9%                          |
+| MMLU               | 47.2%          | 47.2%                          |
+| MMLU Pro           | 13.2%          | 12.0%                          |
+| MuSR               | 43.5%          | 42.1%                          |
+| NLP4Education      | 38.8%          | 36.5%                          |
+| **Overall**        | **37.8%**      | **36.0%**                      |
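+
+The **Overall** row is consistent with an unweighted mean over the nine benchmark rows. This is an observation from the numbers in the table rather than a stated aggregation method, and it can be checked directly:
+
+```python
+# Per-benchmark scores copied from the table above, in row order.
+acc = [46.0, 59.3, 29.9, 24.0, 37.9, 47.2, 13.2, 43.5, 38.8]
+acc_norm = [45.3, 54.2, 27.0, 24.8, 34.9, 47.2, 12.0, 42.1, 36.5]
+
+
+def mean(values):
+    """Unweighted arithmetic mean."""
+    return sum(values) / len(values)
+
+
+overall_acc = round(mean(acc), 1)
+overall_acc_norm = round(mean(acc_norm), 1)
+print(overall_acc, overall_acc_norm)
+```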
+
+The tests were done with the following prompt (only MuSR used a different one, in which `Question:` and `Narrative:` labels are added):
+```
+This question assesses challenging STEM problems as found on graduate standardized tests. Carefully evaluate the options and select the correct answer.
+
+---
+[Insert Question Here]
+---
+[Insert Choices Here, e.g.:
+A. Option 1
+B. Option 2
+C. Option 3
+D. Option 4]
+---
+
+Your response should include the letter and the exact text of the correct choice.
+Example: B. Entropy increases.
+Answer:
+```
+
+Testing was done on answers in the format `[Letter]. [Text answer]`.
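+
+A minimal sketch of filling the template above and producing the matching answer string follows. The function names `build_prompt` and `format_target`, the sample question, and the exact whitespace between sections are assumptions for illustration, not the original evaluation code.
+
+```python
+INSTRUCTION = (
+    "This question assesses challenging STEM problems as found on graduate "
+    "standardized tests. Carefully evaluate the options and select the correct answer."
+)
+FOOTER = (
+    "Your response should include the letter and the exact text of the correct choice.\n"
+    "Example: B. Entropy increases.\n"
+    "Answer:"
+)
+LETTERS = "ABCDEFGHIJ"  # supports up to ten options
+
+
+def build_prompt(question, choices):
+    """Fill the evaluation template with a question and its lettered choices."""
+    options = "\n".join(f"{LETTERS[i]}. {c}" for i, c in enumerate(choices))
+    return f"{INSTRUCTION}\n\n---\n{question}\n---\n{options}\n---\n\n{FOOTER}"
+
+
+def format_target(choices, answer_index):
+    """Build the answer string in the '[Letter]. [Text answer]' format."""
+    return f"{LETTERS[answer_index]}. {choices[answer_index]}"
+
+
+# Hypothetical sample item, used only to exercise the two helpers.
+choices = [
+    "Entropy decreases.",
+    "Entropy increases.",
+    "Entropy is constant.",
+    "Entropy is undefined.",
+]
+prompt = build_prompt(
+    "What happens to the entropy of an isolated system during an irreversible process?",
+    choices,
+)
+target = format_target(choices, 1)
+print(prompt)
+print(target)
+```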
+
+### Framework versions
+
+- TRL: 0.15.2
+- Transformers: 4.51.3
+- PyTorch: 2.5.1+cu121
+- Datasets: 3.6.0
+- Tokenizers: 0.21.0
+
+## Citations
+
+
+
+Cite TRL as:
+    
+```bibtex
+@misc{vonwerra2022trl,
+	title        = {{TRL: Transformer Reinforcement Learning}},
+	author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
+	year         = 2020,
+	journal      = {GitHub repository},
+	publisher    = {GitHub},
+	howpublished = {\url{https://github.com/huggingface/trl}}
+}
+```