---
license: apache-2.0
datasets:
- lingshu-medical-mllm/ReasonMed
base_model:
- unsloth/Qwen2.5-0.5B-Instruct
---
## Info


![AKmUU](https://cdn-uploads.huggingface.co/production/uploads/66e00ba55e4fd4bfead4a97c/zxUf9YsQPkICX5-n1BVjc.jpeg)

![Demo Screenshot](https://cdn-uploads.huggingface.co/production/uploads/66e00ba55e4fd4bfead4a97c/ZWQpqF2613W9Ty9NBYUk1.png)

# Qwen2.5-0.5B-Medical-ReasonMed370K

A 0.5-billion-parameter medical reasoning model fine-tuned on the complete ReasonMed 370K dataset. It is built on Qwen2.5-0.5B-Instruct and trained to perform structured clinical reasoning, differential diagnosis, and evidence-based medical question answering.

## Model Details

- **Base Model**: unsloth/Qwen2.5-0.5B-Instruct
- **Model Size**: 0.5B parameters
- **Fine-tuning Method**: LoRA via Unsloth
- **Training Dataset**: ReasonMed 370K (full dataset)
- **Training Hardware**: NVIDIA Tesla T4 (Kaggle free tier)
- **License**: Apache 2.0

## Training Details

The model was fine-tuned in two stages, each covering roughly half of the ReasonMed dataset:

**Stage 1**: Fine-tuned on the first 185,000 samples of ReasonMed using LoRA with the following configuration:
- LoRA rank: 8
- LoRA alpha: 16
- Learning rate: 5e-5
- Batch size: 2 with 16 gradient accumulation steps
- Max sequence length: 4096
- Epochs: 1
- Optimizer: AdamW 8-bit

**Stage 2**: Continued fine-tuning on the remaining 184,983 samples with the same configuration, completing one full pass over the entire 370K dataset (369,983 samples in total).
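
The step counts implied by this configuration can be estimated with a little arithmetic. The sketch below assumes a single GPU (so the effective batch size is the per-device batch size times the gradient accumulation steps) and that a final partial batch still triggers an optimizer update. The variable and function names are illustrative and are not taken from the actual training script.

```python
import math

# Values stated in the training details above.
PER_DEVICE_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 16
STAGE_SAMPLES = {"stage_1": 185_000, "stage_2": 184_983}

# Assumption: one GPU, so there is no extra data-parallel multiplier.
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS


def optimizer_steps(num_samples: int, batch_size: int) -> int:
    """Optimizer updates for one epoch, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)


total_samples = sum(STAGE_SAMPLES.values())
steps = {
    name: optimizer_steps(n, effective_batch_size)
    for name, n in STAGE_SAMPLES.items()
}

print(effective_batch_size)  # 32
print(total_samples)         # 369983
print(steps)                 # {'stage_1': 5782, 'stage_2': 5781}
```

The exact step count reported by a trainer may differ slightly depending on how it handles the last incomplete accumulation window.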

Both stages used `packing=False`, so each sample was processed as its own sequence rather than being concatenated with other samples and split at fixed block boundaries.
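
To make the `packing` setting concrete, here is a toy illustration in plain Python of the difference between packed and unpacked batching. It is a simplified sketch, not the trainer's actual implementation: `pack_sequences` and `keep_separate` are hypothetical helpers, and the token IDs are made up.

```python
from typing import List


def pack_sequences(samples: List[List[int]], block_size: int) -> List[List[int]]:
    """Concatenate all samples and cut the stream into fixed-size blocks."""
    stream = [tok for sample in samples for tok in sample]
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def keep_separate(samples: List[List[int]]) -> List[List[int]]:
    """Leave each sample as its own sequence (the packing=False behaviour)."""
    return [list(sample) for sample in samples]


# Three made-up tokenized samples of different lengths.
samples = [[1, 2, 3], [4, 5, 6, 7, 8], [9, 10]]

packed = pack_sequences(samples, block_size=4)
separate = keep_separate(samples)

print(packed)    # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
print(separate)  # [[1, 2, 3], [4, 5, 6, 7, 8], [9, 10]]
```

In the packed output the first block mixes the end of one sample with the start of the next, whereas the unpacked output keeps every sample whole.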

## Dataset

This model was trained on [ReasonMed](https://huggingface.co/datasets/lingshu-medical-mllm/ReasonMed), which its authors describe as the largest open-source medical reasoning dataset. It comprises 370,000 high-quality examples distilled from 1.75 million initial reasoning paths generated by multiple large language models.

ReasonMed is built through a multi-agent verification and refinement pipeline that includes an Error Refiner to correct error-prone reasoning steps. Each example combines detailed chain-of-thought reasoning with a concise answer summary, covering a wide range of medical topics including clinical reasoning, differential diagnosis, pharmacology, and medical question answering.

For more details on the dataset, refer to the official repository: https://github.com/alibaba-damo-academy/ReasonMed

## What the Model Can Do

After training on the full ReasonMed dataset, the model demonstrates the ability to:

- Work through clinical presentations step by step
- Generate differential diagnoses with reasoning for each option
- Rule out unlikely diagnoses with justification
- Provide structured final answers with clinical pearls
- Reason through medical multiple-choice questions with explanations
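
As a rough sketch of how a multiple-choice question could be passed to the model, the helper below formats a question and its options as a single user chat message. The function name `build_mcq_messages` and the prompt layout are assumptions made for illustration; the exact format used in ReasonMed may differ.

```python
from typing import Dict, List


def build_mcq_messages(question: str, options: Dict[str, str]) -> List[Dict[str, str]]:
    """Format a multiple-choice question as a single user chat message."""
    option_lines = [f"{label}. {text}" for label, text in options.items()]
    content = question.strip() + "\n" + "\n".join(option_lines)
    return [{"role": "user", "content": content}]


# Hypothetical example question, in the spirit of the demo scenario.
messages = build_mcq_messages(
    "Which hormone is typically elevated in primary hypothyroidism?",
    {"A": "TSH", "B": "Free T4", "C": "Cortisol", "D": "Insulin"},
)
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the one passed to `tokenizer.apply_chat_template` in the Usage section.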

## Demo

The screenshot above shows the model running through a clinical scenario involving hypothyroidism, demonstrating its ability to identify key symptoms, interpret lab values, and produce a structured response with management guidance.

## Limitations

- This is a 0.5B-parameter model, which places a hard ceiling on its reasoning depth and factual recall
- Small models are prone to inconsistency across similar questions
- The model may occasionally hallucinate clinical details
- This model is intended for research and educational purposes only
- It should not be used for real clinical decision-making or as a substitute for a qualified medical professional
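
One simple way to gauge the inconsistency mentioned above is to sample several generations for the same question and measure how often the final answers agree. The sketch below assumes the final answers have already been extracted as short strings (for example option letters); `majority_agreement` is a hypothetical helper and the sample answers are made up.

```python
from collections import Counter
from typing import List, Tuple


def majority_agreement(answers: List[str]) -> Tuple[str, float]:
    """Return the most common answer and the fraction of samples that match it."""
    if not answers:
        raise ValueError("answers must not be empty")
    # Normalize whitespace and case so "b" and "B " count as the same answer.
    normalized = [a.strip().upper() for a in answers]
    top_answer, count = Counter(normalized).most_common(1)[0]
    return top_answer, count / len(normalized)


# Made-up final answers extracted from five sampled generations.
sampled = ["B", "b", "C", "B ", "A"]
answer, agreement = majority_agreement(sampled)
print(answer, agreement)  # B 0.6
```

A low agreement value suggests the model's answer to that question is unstable and should be treated with extra caution.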

## Usage
```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer in 4-bit precision.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "Rumiii/Qwen2.5-0.5B-Medical-ReasonMed370K",
    max_seq_length = 4096,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

messages = [
    {"role": "user", "content": "Your medical question here"}
]

# Apply the model's chat template and move the token IDs to the GPU.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize              = True,
    add_generation_prompt = True,
    return_tensors        = "pt"
).to("cuda")

# Sample a response; the repetition settings discourage looping output.
outputs = model.generate(
    input_ids            = inputs,
    max_new_tokens       = 1024,
    temperature          = 0.7,
    do_sample            = True,
    repetition_penalty   = 1.3,
    no_repeat_ngram_size = 3,
    top_p                = 0.9,
    top_k                = 50,
)

# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Citation

If you use this model, please cite the ReasonMed dataset:
```bibtex
@misc{sun2025reasonmed370kmultiagentgenerated,
      title={ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning}, 
      author={Yu Sun and Xingyu Qian and Weiwen Xu and Hao Zhang and Chenghao Xiao and Long Li and Yu Rong and Wenbing Huang and Qifeng Bai and Tingyang Xu},
      year={2025},
      eprint={2506.09513},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.09513}, 
}
```

## Acknowledgements

Training was conducted on Kaggle's free-tier infrastructure using Unsloth for efficient fine-tuning. The ReasonMed dataset was created by the team at Alibaba DAMO Academy and Tencent AI Lab.