---
base_model:
- Qwen/Qwen3-8B-Base
tags:
- medical
library_name: transformers
pipeline_tag: text-generation
---

# MedSSR-Qwen3-8B-Base

This repository hosts the model from our ACL 2026 Findings paper, "[Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach](https://huggingface.co/papers/2604.11547)". `MedSSR-Qwen3-8B-Base` is a medical-reasoning LLM built on `Qwen/Qwen3-8B-Base`.

## Model Summary

- **Base model**: `Qwen/Qwen3-8B-Base`
- **Model name**: `MedSSR-Qwen3-8B-Base`
- **Training framework**: [verl](https://github.com/verl-project/verl)
- **Paper link**: [https://arxiv.org/pdf/2604.11547](https://huggingface.co/papers/2604.11547)
- **GitHub repo**: [https://github.com/tdlhl/MedSSR](https://github.com/tdlhl/MedSSR)
- **Hugging Face training dataset**: [tdlhl/MedSSR-Synthetic-43K](https://huggingface.co/datasets/tdlhl/MedSSR-Synthetic-43K)
- **Hugging Face test dataset**: [tdlhl/RareDis-Sub](https://huggingface.co/datasets/tdlhl/RareDis-Sub)

## Quick Start (Transformers)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tdlhl/MedSSR-Qwen3-8B-Base"

# Load the tokenizer and the model in bfloat16, placing weights automatically.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "user",
        "content": (
            "A 67-year-old man develops crushing substernal chest pain for 40 minutes. "
            "ECG shows ST-segment elevation in leads II, III, and aVF. "
            "Which coronary artery is most likely occluded?\n"
            "A. Left anterior descending artery\n"
            "B. Left circumflex artery\n"
            "C. Right coronary artery\n"
            "D. Posterior descending artery"
        ),
    }
]

# Render the chat template to text, then tokenize it.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,  # make sure temperature/top_p are actually applied
        temperature=0.6,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

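The example prompts in this card place the question stem on the first line and each lettered option on its own line. As a convenience, that layout can be produced by a small helper. This is a minimal sketch: `build_mcq_prompt` is a hypothetical name that is not part of the released code, and the layout is only inferred from the examples here, not from the training or evaluation pipeline.

```python
from typing import Sequence

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def build_mcq_prompt(question: str, options: Sequence[str]) -> str:
    """Join a question stem and lettered options with newlines.

    Hypothetical helper: mirrors the prompt layout used in this card
    (stem first, then "A. ...", "B. ...", one option per line).
    """
    if not 1 <= len(options) <= len(LETTERS):
        raise ValueError("expected between 1 and 26 options")
    lines = [question.strip()]
    for letter, option in zip(LETTERS, options):
        lines.append(f"{letter}. {option.strip()}")
    return "\n".join(lines)


prompt = build_mcq_prompt(
    "A 67-year-old man develops crushing substernal chest pain for 40 minutes. "
    "ECG shows ST-segment elevation in leads II, III, and aVF. "
    "Which coronary artery is most likely occluded?",
    [
        "Left anterior descending artery",
        "Left circumflex artery",
        "Right coronary artery",
        "Posterior descending artery",
    ],
)

# Wrap the prompt in the chat-message format used by the Transformers example.
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` exactly as in the Transformers example, and the bare `prompt` string can be passed to vLLM's `llm.generate`.
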
## Quick Start (vLLM)

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="tdlhl/MedSSR-Qwen3-8B-Base",
    trust_remote_code=True,
)

sampling = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    max_tokens=1024,
)

prompt = (
    "A 24-year-old woman presents with fatigue, weight gain, constipation, and cold intolerance. "
    "Which of the following laboratory findings is most consistent with primary hypothyroidism?\n"
    "A. Low TSH, low free T4\n"
    "B. High TSH, low free T4\n"
    "C. High TSH, high free T4\n"
    "D. Low TSH, high free T4"
)

outputs = llm.generate([prompt], sampling_params=sampling)
print(outputs[0].outputs[0].text)
```

## Suggested Decoding Setup

For evaluation settings similar to those in our paper, we follow Qwen's recommended decoding settings:

```text
temperature=0.6
top_p=0.95
top_k=20
max_tokens=2048
```

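The suggested settings map onto slightly different argument names in the two libraries: Transformers' `generate` uses `max_new_tokens` and needs `do_sample=True` for the sampling parameters to take effect, while vLLM's `SamplingParams` uses `max_tokens`. The sketch below keeps the values in one place; the helper names `hf_generate_kwargs` and `vllm_sampling_kwargs` are hypothetical and not part of the released code.

```python
# Values copied from the suggested decoding setup in this card.
DECODING = {"temperature": 0.6, "top_p": 0.95, "top_k": 20}
MAX_TOKENS = 2048


def hf_generate_kwargs() -> dict:
    """Keyword arguments for `model.generate(**inputs, **kwargs)`."""
    # do_sample=True enables sampling so temperature/top_p/top_k are used.
    return {**DECODING, "do_sample": True, "max_new_tokens": MAX_TOKENS}


def vllm_sampling_kwargs() -> dict:
    """Keyword arguments for `vllm.SamplingParams(**kwargs)`."""
    return {**DECODING, "max_tokens": MAX_TOKENS}


print(hf_generate_kwargs())
print(vllm_sampling_kwargs())
```

With these helpers, the calls would look like `model.generate(**inputs, **hf_generate_kwargs())` and `SamplingParams(**vllm_sampling_kwargs())`.
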
## Notes

- This model is intended for research use.
- The model may produce incorrect or unverifiable medical reasoning.
- Outputs should not be used as a substitute for professional medical judgment.
- For benchmark-style evaluation, please follow the released evaluation script in our repository.

## Citation

If you find our model useful, please cite our paper:

```bibtex
@article{li2025eliciting,
  title={Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach},
  author={Haolin Li and Shuyang Jiang and Ruipeng Zhang and Jiangchao Yao and Ya Zhang and Yanfeng Wang},
  journal={arXiv preprint arXiv:2604.11547},
  year={2026}
}
```