Initialize project; model provided by the ModelHub XC community

Model: Rumiii/Qwen2.5-0.5B-Medical-ReasonMed370K Source: Original Platform
2026-05-05 01:46:09 +08:00
commit 3112e5151b
7 changed files with 298 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,129 @@
+---
+license: apache-2.0
+datasets:
+- lingshu-medical-mllm/ReasonMed
+base_model:
+- unsloth/Qwen2.5-0.5B-Instruct
+---
+## Info
+
+
+![AKmUU](https://cdn-uploads.huggingface.co/production/uploads/66e00ba55e4fd4bfead4a97c/zxUf9YsQPkICX5-n1BVjc.jpeg)
+
+![Demo Screenshot](https://cdn-uploads.huggingface.co/production/uploads/66e00ba55e4fd4bfead4a97c/ZWQpqF2613W9Ty9NBYUk1.png)
+
+# Qwen2.5-0.5B-Medical-ReasonMed370K
+
+A 0.5-billion-parameter medical reasoning model fine-tuned on the complete ReasonMed 370K dataset. It is built on Qwen2.5-0.5B-Instruct and trained to perform structured clinical reasoning, differential diagnosis, and evidence-based medical question answering.
+
+## Model Details
+
+- **Base Model**: unsloth/Qwen2.5-0.5B-Instruct
+- **Model Size**: 0.5B parameters
+- **Fine-tuning Method**: LoRA via Unsloth
+- **Training Dataset**: ReasonMed 370K (full dataset)
+- **Training Hardware**: NVIDIA Tesla T4 (Kaggle free tier)
+- **License**: Apache 2.0
+
+## Training Details
+
+The model was fine-tuned in two stages, each covering roughly half of the ReasonMed dataset:
+
+**Stage 1**: Fine-tuned on the first 185,000 samples of ReasonMed using LoRA with the following configuration:
+- LoRA rank: 8
+- LoRA alpha: 16
+- Learning rate: 5e-5
+- Batch size: 2 with 16 gradient accumulation steps
+- Max sequence length: 4096
+- Epochs: 1
+- Optimizer: AdamW 8-bit
+
+**Stage 2**: Continued fine-tuning on the remaining 184,983 samples with the identical configuration, completing one full pass over the entire dataset (369,983 samples in total, rounded to 370K).
+
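+Both stages used `packing=False`, so each sample was processed individually rather than concatenated with other samples. Packing does not affect truncation: any sample longer than the 4096-token maximum sequence length would still be cut off at that limit.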
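+
+As a rough check of the numbers above, the following sketch works out the effective batch size, the LoRA scaling factor, and the approximate number of optimizer steps per stage. It assumes a single GPU and standard LoRA scaling (`alpha / rank`); neither is stated explicitly in the configuration, and the exact step count depends on how the trainer handles the final partial batch.
+
+```python
+import math
+
+# Values taken from the Training Details section.
+PER_DEVICE_BATCH_SIZE = 2
+GRAD_ACCUM_STEPS = 16
+LORA_RANK = 8
+LORA_ALPHA = 16
+STAGE_SAMPLES = {"stage_1": 185_000, "stage_2": 184_983}
+
+# Assumption: one GPU, so there is no data-parallel multiplier.
+NUM_DEVICES = 1
+
+# Samples contributing to each optimizer update.
+effective_batch = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
+
+# Assumption: standard LoRA scaling (alpha / rank), not rank-stabilized LoRA.
+lora_scaling = LORA_ALPHA / LORA_RANK
+
+# Total samples seen across both stages (one epoch each).
+total_samples = sum(STAGE_SAMPLES.values())
+
+# Approximate optimizer steps per stage, rounding the last partial batch up.
+steps = {name: math.ceil(n / effective_batch) for name, n in STAGE_SAMPLES.items()}
+
+print(f"effective batch size: {effective_batch}")
+print(f"LoRA scaling factor:  {lora_scaling}")
+print(f"total samples:        {total_samples}")
+print(f"optimizer steps:      {steps}")
+```
+
+Under these assumptions each stage runs for a little under 5,800 optimizer steps at an effective batch size of 32.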
+
+## Dataset
+
+This model was trained on [ReasonMed](https://huggingface.co/datasets/lingshu-medical-mllm/ReasonMed), which its authors describe as the largest open-source medical reasoning dataset. It comprises 370,000 high-quality examples distilled from 1.75 million initial reasoning paths generated by multiple large language models.
+
+ReasonMed is built through a multi-agent verification and refinement pipeline that includes an Error Refiner to correct error-prone reasoning steps. Each example combines detailed chain-of-thought reasoning with a concise answer summary, covering a wide range of medical topics including clinical reasoning, differential diagnosis, pharmacology, and medical question answering.
+
+For more details on the dataset, refer to the official repository: https://github.com/alibaba-damo-academy/ReasonMed
+
+## What the Model Can Do
+
+After training on the full ReasonMed dataset, the model demonstrates the ability to:
+
+- Work through clinical presentations step by step
+- Generate differential diagnoses with reasoning for each option
+- Rule out unlikely diagnoses with justification
+- Provide structured final answers with clinical pearls
+- Reason through medical multiple-choice questions with explanations
+
+## Demo
+
+The screenshot above shows the model running through a clinical scenario involving hypothyroidism, demonstrating its ability to identify key symptoms, interpret lab values, and produce a structured response with management guidance.
+
+## Limitations
+
+- This is a 0.5B parameter model and has a hard ceiling on reasoning depth and factual recall
+- Small models are prone to inconsistency across similar questions
+- The model may occasionally hallucinate clinical details
+- This model is intended for research and educational purposes only
+- It should not be used for real clinical decision-making or as a substitute for a qualified medical professional
+
+## Usage
+```python
+from unsloth import FastLanguageModel
+import torch
+
+# Load the fine-tuned model and tokenizer in 4-bit to reduce GPU memory use.
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name     = "Rumiii/Qwen2.5-0.5B-Medical-ReasonMed370K",
+    max_seq_length = 4096,
+    load_in_4bit   = True,
+)
+# Enable Unsloth's inference mode.
+FastLanguageModel.for_inference(model)
+
+messages = [
+    {"role": "user", "content": "Your medical question here"}
+]
+
+# Apply the chat template, append the assistant prompt, and move the ids to the GPU.
+inputs = tokenizer.apply_chat_template(
+    messages,
+    tokenize              = True,
+    add_generation_prompt = True,
+    return_tensors        = "pt"
+).to("cuda")
+
+# Sample a reply; repetition_penalty and no_repeat_ngram_size discourage loops.
+outputs = model.generate(
+    input_ids            = inputs,
+    max_new_tokens       = 1024,
+    temperature          = 0.7,
+    do_sample            = True,
+    repetition_penalty   = 1.3,
+    no_repeat_ngram_size = 3,
+    top_p                = 0.9,
+    top_k                = 50,
+)
+
+# Decode the full sequence (prompt plus generated reply), dropping special tokens.
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
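+
+For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` prints the question along with the answer. The sketch below shows one way to keep only the reply. `strip_prompt` is a hypothetical helper, not part of Unsloth or Transformers, and the token ids in the demo are made up for illustration.
+
+```python
+def strip_prompt(sequence, prompt_length):
+    """Return only the tokens that follow the first `prompt_length` tokens."""
+    if prompt_length < 0 or prompt_length > len(sequence):
+        raise ValueError("prompt_length must be between 0 and len(sequence)")
+    return sequence[prompt_length:]
+
+
+# Illustrative ids only: three prompt tokens followed by two generated tokens.
+prompt_ids = [101, 7, 8]
+output_ids = [101, 7, 8, 42, 43]
+
+reply_ids = strip_prompt(output_ids, len(prompt_ids))
+print(reply_ids)
+```
+
+With the usage snippet above, and assuming `inputs` is the 2-D tensor of prompt ids, the same idea would be `tokenizer.decode(strip_prompt(outputs[0], inputs.shape[1]), skip_special_tokens=True)`.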
+
+## Citation
+
+If you use this model, please cite the ReasonMed dataset:
+```bibtex
+@misc{sun2025reasonmed370kmultiagentgenerated,
+      title={ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning}, 
+      author={Yu Sun and Xingyu Qian and Weiwen Xu and Hao Zhang and Chenghao Xiao and Long Li and Yu Rong and Wenbing Huang and Qifeng Bai and Tingyang Xu},
+      year={2025},
+      eprint={2506.09513},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2506.09513}, 
+}
+```
+
+## Acknowledgements
+
+Training was conducted on Kaggle free tier infrastructure using Unsloth for efficient fine-tuning. The ReasonMed dataset was created by the team at Alibaba DAMO Academy and Tencent AI Lab.