---
license: apache-2.0
tags:
- unsloth
- trl
- sft
datasets:
- suayptalha/Treatment-Instructions
language:
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-0.6B-Treatment-Expert

This project performs full fine-tuning on the **Qwen3-0.6B** language model to enhance its clinical treatment planning and reasoning capabilities. The model was optimized using the bfloat16 (bf16) data type.

## Training Procedure

1. **Dataset Preparation**

   * Dataset: Containing paired clinical diagnosis descriptions and corresponding step-by-step treatment plans.

2. **Model Loading and Configuration**

   * Base model: **Qwen3-0.6B**, loaded with the `unsloth` library in bf16 precision.
   * Full fine-tuning (`full_finetuning=True`) applied to all layers to adapt the model for medical treatment tasks.

3. **Supervised Fine-Tuning (SFT)**

   * Utilized the Hugging Face TRL library with the Supervised Fine-Tuning approach.
   * The model was trained to generate both intermediate reasoning steps and final treatment recommendations.
   * Training hyperparameters:

     * Epochs: 2
     * Learning rate: 2e-5
     * Batch size: 8

## Purpose and Outcome

* Significantly improved the model’s ability to interpret clinical diagnoses and propose structured treatment plans.

## Evaluation

* Performance was measured on a held-out validation set with the following metrics:

  * **Plan Fidelity:** 59.69% similarity with DeepSeek V3-0324.
  * **Reasoning Coherence:** Rated high by a panel of medical experts.
## License

This project is licensed under the Apache License 2.0. See the [LICENSE](./LICENSE) file for details.