Initialize project; model provided by the ModelHub XC community
Model: andresnowak/Qwen3-0.6B-instruction-finetuned Source: Original Platform
176
README.md
Normal file
@@ -0,0 +1,176 @@
---
base_model: unsloth/Qwen3-0.6B-Base
library_name: transformers
model_name: Qwen3-0.6B-instruction-finetuned
tags:
- generated_from_trainer
- unsloth
- trl
- sft
licence: license
datasets:
- andresnowak/Instruction-finetuning-mixture-mnlp
language:
- en
---

# Model Card for Qwen3-0.6B-instruction-finetuned

This model is a fine-tuned version of [unsloth/Qwen3-0.6B-Base](https://huggingface.co/unsloth/Qwen3-0.6B-Base).
It has been trained using [TRL](https://github.com/huggingface/trl).

## Quick start

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="andresnowak/Qwen3-0.6B-instruction-finetuned", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

## Training procedure

This model was trained with supervised instruction fine-tuning using a language-modelling objective, meaning the loss is computed on both the prompt and the completion. Random prompt templates were also applied during training to make the model more robust to the different ways a question can be asked, in addition to the dataset already being high quality and containing many such examples. This was done because we weren't allowed to use chat templates for the evaluation.

This model probably had two problems during training. One is that we didn't filter the dataset to keep only examples whose combined prompt and completion have a length of at most 2048 (the maximum length we use) and instead truncated them. Also, this model uses left-side padding in the tokenizer, as flash-attention 2 needs this.
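
The difference between filtering and truncating over-long examples, and what left-side padding looks like, can be sketched as follows. This is a minimal illustration and not the training code: the `prompt` and `completion` field names, the whitespace tokenizer stand-in, and all function names are assumptions, and the 2048 limit is assumed to be counted in tokens.

```python
from typing import Callable, Dict, List

MAX_LENGTH = 2048  # limit from the text above; assumed to be a token count

Tokenizer = Callable[[str], List[str]]
Example = Dict[str, str]


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for a real tokenizer: splits on whitespace."""
    return text.split()


def combined_length(example: Example, tokenize: Tokenizer = whitespace_tokenize) -> int:
    """Number of tokens in prompt plus completion."""
    return len(tokenize(example["prompt"])) + len(tokenize(example["completion"]))


def filter_by_length(
    examples: List[Example],
    max_length: int = MAX_LENGTH,
    tokenize: Tokenizer = whitespace_tokenize,
) -> List[Example]:
    """Keep only examples that fit entirely within max_length."""
    return [ex for ex in examples if combined_length(ex, tokenize) <= max_length]


def truncate_to_length(
    example: Example,
    max_length: int = MAX_LENGTH,
    tokenize: Tokenizer = whitespace_tokenize,
) -> List[str]:
    """Cut prompt + completion at max_length; the end of the completion may be lost."""
    tokens = tokenize(example["prompt"]) + tokenize(example["completion"])
    return tokens[:max_length]


def left_pad(tokens: List[str], length: int, pad_token: str = "<pad>") -> List[str]:
    """Pad on the left so the real tokens sit at the end of the sequence."""
    return [pad_token] * max(0, length - len(tokens)) + tokens
```

With truncation, an over-long example stays in the training set but may end mid-answer; with filtering, it is dropped entirely.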

```yaml

environment:
  seed: 42
  use_template: True

model:
  name: Qwen/Qwen3-0.6B-Base
  hub_model_id: andresnowak/Qwen3-0.6B-instruction-finetuned

dataset:
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: codeAlpaca
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: noRobots
    size: 0.8
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: openMathGsm8k
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: codeV2
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: flanV2
    size: 0.8
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: ifData
    size: 0.8
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: mathAlgebra
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: mathGrade
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: oasst1
    size: 0.6
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: sciriff
    size: 0.8
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: tableGpt
    size: 0.3
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: tirMath
    size: 0.4
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: wildChat
    size: 0.7
  - name: andresnowak/Instruction-finetuning-mixture-mnlp
    config: mathV5
    size: 0.2

dataset_evaluation:
  - name: cais/mmlu
    config: validation
    subjects: ["abstract_algebra", "anatomy", "astronomy", "college_biology", "college_chemistry", "college_computer_science", "college_mathematics", "college_physics", "computer_security", "conceptual_physics", "electrical_engineering", "elementary_mathematics", "high_school_biology", "high_school_chemistry", "high_school_computer_science", "high_school_mathematics", "high_school_physics", "high_school_statistics", "machine_learning"]

training:
  learning_rate: 1e-5
  per_device_train_batch_size: 16
  per_device_eval_batch_size: 16
  gradient_accumulation_steps: 8
  num_train_epochs: 2
  weight_decay: 0.00
  warmup_ratio: 0.03
  max_grad_norm: 0.5
  lr_scheduler: "linear"
```
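
The `training` values imply an effective batch size and a learning-rate curve that can be worked out directly. The sketch below assumes a single device (the number of GPUs is not stated in this card) and the common shape of linear warmup followed by linear decay to zero; the function names are hypothetical and this is not the project's training code.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_schedule_lr(
    step: int,
    total_steps: int,
    base_lr: float = 1e-5,
    warmup_ratio: float = 0.03,
) -> float:
    """Learning rate at `step`: linear warmup to base_lr, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


if __name__ == "__main__":
    # 16 examples per device x 8 accumulation steps on one (assumed) device
    print(effective_batch_size(16, 8))
    # total_steps=1000 is an arbitrary illustration, not the real step count
    for step in (0, 15, 30, 500, 1000):
        print(step, linear_schedule_lr(step, total_steps=1000))
```

With `per_device_train_batch_size: 16` and `gradient_accumulation_steps: 8`, each optimizer step sees 128 examples per device.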

This model was trained with SFT.

## Evaluation results

The performance is as follows:

| Benchmark          | Accuracy (Acc) | Normalized Accuracy (Acc Norm) |
| :----------------- | :------------- | :----------------------------- |
| ARC Challenge      | 46.0%          | 45.3%                          |
| ARC Easy           | 59.3%          | 54.2%                          |
| GPQA               | 29.9%          | 27.0%                          |
| Math QA            | 24.0%          | 24.8%                          |
| MCQA Evals         | 37.9%          | 34.9%                          |
| MMLU               | 47.2%          | 47.2%                          |
| MMLU Pro           | 13.2%          | 12.0%                          |
| MuSR               | 43.5%          | 42.1%                          |
| NLP4Education      | 38.8%          | 36.5%                          |
| **Overall**        | **37.8%**      | **36.0%**                      |
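
The **Overall** row is consistent with an unweighted mean over the nine benchmarks, which is an observation from the numbers rather than something the card states. A short check, with the values copied from the table:

```python
# (accuracy, normalized accuracy) in percent, copied from the table above
results = {
    "ARC Challenge": (46.0, 45.3),
    "ARC Easy": (59.3, 54.2),
    "GPQA": (29.9, 27.0),
    "Math QA": (24.0, 24.8),
    "MCQA Evals": (37.9, 34.9),
    "MMLU": (47.2, 47.2),
    "MMLU Pro": (13.2, 12.0),
    "MuSR": (43.5, 42.1),
    "NLP4Education": (38.8, 36.5),
}


def unweighted_mean(values):
    """Plain average: every benchmark counts equally, regardless of its size."""
    values = list(values)
    return sum(values) / len(values)


overall_acc = round(unweighted_mean(acc for acc, _ in results.values()), 1)
overall_acc_norm = round(unweighted_mean(norm for _, norm in results.values()), 1)

print(f"Overall Acc: {overall_acc}%")
print(f"Overall Acc Norm: {overall_acc_norm}%")
```

Because the mean is unweighted, small benchmarks influence the overall score as much as large ones.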

The tests were done with this prompt (only MuSR used a different one, in which `Question:` and `Narrative:` are added):
```
This question assesses challenging STEM problems as found on graduate standardized tests. Carefully evaluate the options and select the correct answer.

---
[Insert Question Here]
---
[Insert Choices Here, e.g.:
A. Option 1
B. Option 2
C. Option 3
D. Option 4]
---

Your response should include the letter and the exact text of the correct choice.
Example: B. Entropy increases.
Answer:
```

The testing was done on the format `[Letter]. [Text answer]`.
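
For reference, the prompt and the `[Letter]. [Text answer]` target can be assembled programmatically. This is a sketch under stated assumptions: `build_prompt` and `format_answer` are hypothetical helper names, and the exact whitespace used in the real evaluation harness is not documented here.

```python
from string import ascii_uppercase
from typing import List

INSTRUCTION = (
    "This question assesses challenging STEM problems as found on graduate "
    "standardized tests. Carefully evaluate the options and select the correct answer."
)
RESPONSE_HINT = (
    "Your response should include the letter and the exact text of the correct choice.\n"
    "Example: B. Entropy increases.\n"
    "Answer:"
)


def build_prompt(question: str, choices: List[str]) -> str:
    """Fill the evaluation template with a question and lettered choices."""
    lettered = "\n".join(
        f"{letter}. {text}" for letter, text in zip(ascii_uppercase, choices)
    )
    return f"{INSTRUCTION}\n\n---\n{question}\n---\n{lettered}\n---\n\n{RESPONSE_HINT}"


def format_answer(correct_index: int, choices: List[str]) -> str:
    """Target string in the '[Letter]. [Text answer]' format."""
    return f"{ascii_uppercase[correct_index]}. {choices[correct_index]}"


if __name__ == "__main__":
    choices = ["3", "4", "5", "6"]
    print(build_prompt("What is 2 + 2?", choices))
    print(format_answer(1, choices))
```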

### Framework versions

- TRL: 0.15.2
- Transformers: 4.51.3
- Pytorch: 2.5.1+cu121
- Datasets: 3.6.0
- Tokenizers: 0.21.0

## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```