---
language:
- en
library_name: transformers
tags:
- qwen
- distillation
- ifeval
- retaining-by-doing
---
# Qwen2.5-0.5B Instruct IFEval Pure KD
This checkpoint is a `Qwen2.5-0.5B-Instruct` student distilled from a `Qwen2.5-1.5B-Instruct` teacher that had been fine-tuned for half an epoch on `IFEvalSFTDataset`.
Distillation setup:
- student: `Qwen2.5-0.5B-Instruct`
- teacher: half-epoch `Qwen2.5-1.5B-Instruct`
- `num_train_datapoints=4064`
- `num_epochs=1`
- `distill_alpha=1.0`
- `distill_temperature=2.0`
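
The training code is not part of this card, so the following is only a minimal sketch of how `distill_alpha` and `distill_temperature` are commonly combined, assuming the conventional blended objective `alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label)`. The function names `log_softmax` and `kd_loss` are hypothetical, and the sketch scores a single token position in pure Python. Under this assumption, `distill_alpha=1.0` removes the hard-label term entirely, which would be consistent with the "Pure KD" name.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def kd_loss(student_logits, teacher_logits, label, alpha=1.0, temperature=2.0):
    """Assumed objective (not confirmed by the card):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label).
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    # Hard-label cross-entropy on the unscaled student logits.
    ce = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Toy logits over a 3-token vocabulary (illustrative values only).
student = [2.0, 0.5, -1.0]
teacher = [1.5, 1.0, -0.5]
loss = kd_loss(student, teacher, label=0, alpha=1.0, temperature=2.0)
```

With `alpha=1.0` the `label` argument has no effect on the result, so the student would learn only from the teacher's softened distribution.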
Observed local IFEval accuracy:
- `0.4050308008`