---
language:
- en
library_name: transformers
tags:
- qwen
- distillation
- ifeval
- retaining-by-doing
---

# Qwen2.5-0.5B Instruct IFEval Mixed KD (alpha=0.5)

This checkpoint is a `Qwen2.5-0.5B-Instruct` student distilled from a `Qwen2.5-1.5B-Instruct` teacher that was trained for half an epoch on `IFEvalSFTDataset`.

Distillation setup:
- student: `Qwen2.5-0.5B-Instruct`
- teacher: half-epoch `Qwen2.5-1.5B-Instruct`
- `num_train_datapoints=4064`
- `num_epochs=1`
- `distill_alpha=0.5`
- `distill_temperature=2.0`
- effective loss mix: `CE 0.5 / KD 0.5`
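
The loss mix above can be sketched for a single token position as follows. This is a minimal illustration in plain Python, not the training code behind this checkpoint: the function names `log_softmax` and `mixed_kd_loss` are hypothetical, and the forward KL direction (teacher to student), the `temperature ** 2` scaling of the KD term, and the choice of which term `alpha` weights are all assumptions. With `distill_alpha=0.5` the last choice makes no difference, since both terms get weight 0.5.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]


def mixed_kd_loss(student_logits, teacher_logits, target_index,
                  alpha=0.5, temperature=2.0):
    """Mix hard-label cross-entropy with a soft-label KD term for one token."""
    # Hard-label cross-entropy, computed at temperature 1.
    ce = -log_softmax(student_logits)[target_index]

    # Soft-label term: KL(teacher || student) at the distillation temperature.
    # ASSUMPTION: forward KL; the card does not state the divergence used.
    student_logp = log_softmax(student_logits, temperature)
    teacher_logp = log_softmax(teacher_logits, temperature)
    kd = sum(math.exp(t) * (t - s) for t, s in zip(teacher_logp, student_logp))

    # ASSUMPTION: scale by T^2 so the KD gradients stay comparable to CE.
    kd *= temperature ** 2

    # ASSUMPTION: alpha weights the KD term and (1 - alpha) the CE term.
    return (1.0 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    # Toy 3-token vocabulary; the logits are made up for illustration.
    print(mixed_kd_loss([1.0, 0.5, -0.2], [2.0, 0.1, -1.0], target_index=0))
```

A real training loop would apply the same computation over batched tensors and average across the non-masked token positions.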

Observed local IFEval accuracy:
- `0.4137577002`

Initial project commit; model provided by the ModelHub XC community. Model: SeongryongJung/qwen2.5-0.5b-ifeval-mixed-kd-alpha05. Source: Original Platform. 2026-04-22 12:15:56 +08:00