--- license: apache-2.0 tags: - unsloth - trl - sft datasets: - suayptalha/Treatment-Instructions language: - en base_model: - Qwen/Qwen3-0.6B pipeline_tag: text-generation library_name: transformers --- # Qwen3-0.6B-Treatment-Expert This project performs full fine-tuning on the **Qwen3-0.6B** language model to enhance its clinical treatment planning and reasoning capabilities. The model was optimized using the bfloat16 (bf16) data type. ## Training Procedure 1. **Dataset Preparation** * Dataset: Containing paired clinical diagnosis descriptions and corresponding step-by-step treatment plans. 2. **Model Loading and Configuration** * Base model: **Qwen3-0.6B**, loaded with the `unsloth` library in bf16 precision. * Full fine-tuning (`full_finetuning=True`) applied to all layers to adapt the model for medical treatment tasks. 3. **Supervised Fine-Tuning (SFT)** * Utilized the Hugging Face TRL library with the Supervised Fine-Tuning approach. * The model was trained to generate both intermediate reasoning steps and final treatment recommendations. * Training hyperparameters: * Epochs: 2 * Learning rate: 2e-5 * Batch size: 8 ## Purpose and Outcome * Significantly improved the model’s ability to interpret clinical diagnoses and propose structured treatment plans. ## Evaluation * Performance was measured on a held-out validation set with the following metrics: * **Plan Fidelity:** 59.69% similarity with DeepSeek V3-0324. * **Reasoning Coherence:** Rated high by a panel of medical experts. ## License This project is licensed under the Apache License 2.0. See the [LICENSE](./LICENSE) file for details.