This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
In addition to the turns-2k dataset, a synthetic dataset of roughly 140 samples was used. It contains instructions such as "Wait." as well as customer-support-specific data.
Accuracy: 96.22%