Files
Qwen3-0.6B-Treatment/README.md

54 lines
1.6 KiB
Markdown
Raw Permalink Normal View History

---
license: apache-2.0
tags:
- unsloth
- trl
- sft
datasets:
- suayptalha/Treatment-Instructions
language:
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
library_name: transformers
---
# Qwen3-0.6B-Treatment-Expert
This project performs full fine-tuning on the **Qwen3-0.6B** language model to enhance its clinical treatment planning and reasoning capabilities. The model was optimized using the bfloat16 (bf16) data type.
## Training Procedure
1. **Dataset Preparation**
* Dataset: Containing paired clinical diagnosis descriptions and corresponding step-by-step treatment plans.
2. **Model Loading and Configuration**
* Base model: **Qwen3-0.6B**, loaded with the `unsloth` library in bf16 precision.
* Full fine-tuning (`full_finetuning=True`) applied to all layers to adapt the model for medical treatment tasks.
3. **Supervised Fine-Tuning (SFT)**
* Utilized the Hugging Face TRL library with the Supervised Fine-Tuning approach.
* The model was trained to generate both intermediate reasoning steps and final treatment recommendations.
* Training hyperparameters:
* Epochs: 2
* Learning rate: 2e-5
* Batch size: 8
## Purpose and Outcome
* Significantly improved the models ability to interpret clinical diagnoses and propose structured treatment plans.
## Evaluation
* Performance was measured on a held-out validation set with the following metrics:
* **Plan Fidelity:** 59.69% similarity with DeepSeek V3-0324.
* **Reasoning Coherence:** Rated high by a panel of medical experts.
## License
This project is licensed under the Apache License 2.0. See the [LICENSE](./LICENSE) file for details.