---
base_model:
- Qwen/Qwen3-8B-Base
tags:
- medical
library_name: transformers
pipeline_tag: text-generation
---
# MedSSR-Qwen3-8B-Base
This is the model for our ACL 2026 Findings paper, "[Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach](https://huggingface.co/papers/2604.11547)". `MedSSR-Qwen3-8B-Base` is a medical reasoning-focused LLM built from `Qwen/Qwen3-8B-Base`.
## Model Summary
- **Base model**: `Qwen/Qwen3-8B-Base`
- **Model name**: `MedSSR-Qwen3-8B-Base`
- **Training framework**: [verl](https://github.com/verl-project/verl)
- **Paper link**: [https://huggingface.co/papers/2604.11547](https://huggingface.co/papers/2604.11547)
- **GitHub repo**: [https://github.com/tdlhl/MedSSR](https://github.com/tdlhl/MedSSR)
- **Hugging Face training dataset**: [tdlhl/MedSSR-Synthetic-43K](https://huggingface.co/datasets/tdlhl/MedSSR-Synthetic-43K)
- **Hugging Face test dataset**: [tdlhl/RareDis-Sub](https://huggingface.co/datasets/tdlhl/RareDis-Sub)
## Quick Start (Transformers)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tdlhl/MedSSR-Qwen3-8B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# A multiple-choice question; the options are separated by newlines.
messages = [
    {
        "role": "user",
        "content": (
            "A 67-year-old man develops crushing substernal chest pain for 40 minutes. "
            "ECG shows ST-segment elevation in leads II, III, and aVF. "
            "Which coronary artery is most likely occluded?\n"
            "A. Left anterior descending artery\n"
            "B. Left circumflex artery\n"
            "C. Right coronary artery\n"
            "D. Posterior descending artery"
        ),
    }
]

# Render the chat template to a string with the generation prompt appended.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,  # temperature/top_p only take effect when sampling
        temperature=0.6,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
## Quick Start (vLLM)
```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="tdlhl/MedSSR-Qwen3-8B-Base",
    trust_remote_code=True,
)
sampling = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    max_tokens=1024,
)

# A multiple-choice question; the options are separated by newlines.
prompt = (
    "A 24-year-old woman presents with fatigue, weight gain, constipation, and cold intolerance. "
    "Which of the following laboratory findings is most consistent with primary hypothyroidism?\n"
    "A. Low TSH, low free T4\n"
    "B. High TSH, low free T4\n"
    "C. High TSH, high free T4\n"
    "D. Low TSH, high free T4"
)

# generate() takes a list of prompts and returns one result per prompt.
outputs = llm.generate([prompt], sampling_params=sampling)
print(outputs[0].outputs[0].text)
```
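Both quick-start prompts use the same layout: a question stem followed by lettered options, one per line. The helper below is a minimal sketch for building such prompts. `format_mcq` is a hypothetical name, not part of the released code, and the prompt format used for the paper's evaluation may differ.

```python
from string import ascii_uppercase


def format_mcq(question: str, options: list[str]) -> str:
    """Join a question stem and its options into one newline-separated prompt.

    Hypothetical helper (not part of the MedSSR repository); it mirrors the
    layout of the quick-start prompts: stem first, then "A. ...", "B. ...".
    """
    if len(options) > len(ascii_uppercase):
        raise ValueError("at most 26 options are supported")
    lines = [question.strip()]
    # Pair each option with the next uppercase letter, starting at "A".
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}. {option.strip()}")
    return "\n".join(lines)


prompt = format_mcq(
    "Which of the following laboratory findings is most consistent with primary hypothyroidism?",
    [
        "Low TSH, low free T4",
        "High TSH, low free T4",
        "High TSH, high free T4",
        "Low TSH, high free T4",
    ],
)
print(prompt)
```

The resulting string can be passed to `llm.generate` directly or used as the `content` of a user message for `apply_chat_template`.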
## Suggested Decoding Setup
To reproduce evaluation settings similar to those in our paper, use Qwen's recommended decoding settings:
```text
temperature=0.6
top_p=0.95
top_k=20
max_tokens=2048
```
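The settings above use the parameter names of vLLM's `SamplingParams`. As a rough sketch, the helper below keeps them in one dictionary and translates them for `model.generate` in Transformers, which expects `max_new_tokens` and needs `do_sample=True` for the sampling parameters to take effect. The names `DECODING` and `to_hf_generate_kwargs` are illustrative assumptions, not taken from the repository.

```python
# Decoding settings from the section above, using vLLM-style parameter names.
DECODING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "max_tokens": 2048,
}


def to_hf_generate_kwargs(decoding: dict) -> dict:
    """Translate vLLM-style settings into keyword arguments for model.generate.

    Hypothetical helper: renames max_tokens -> max_new_tokens and enables
    sampling, since Transformers ignores temperature/top_p/top_k otherwise.
    """
    kwargs = dict(decoding)  # copy so the shared settings stay unchanged
    kwargs["max_new_tokens"] = kwargs.pop("max_tokens")
    kwargs["do_sample"] = True
    return kwargs


hf_kwargs = to_hf_generate_kwargs(DECODING)
print(hf_kwargs)
```

With vLLM the dictionary can be used as `SamplingParams(**DECODING)`; with Transformers, as `model.generate(**inputs, **hf_kwargs)`.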
## Notes
- This model is intended for research use.
- The model may produce incorrect or unverifiable medical reasoning.
- Outputs should not be used as a substitute for professional medical judgment.
- For benchmark-style evaluation, please follow the released evaluation script in our repository.
## Citation
If you find our model useful, please cite our paper:
```bibtex
@article{li2025eliciting,
  title={Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach},
  author={Haolin Li and Shuyang Jiang and Ruipeng Zhang and Jiangchao Yao and Ya Zhang and Yanfeng Wang},
  journal={arXiv preprint arXiv:2604.11547},
  year={2026}
}
```