---
library_name: transformers
tags:
- sft
- lora
- unsloth
license: apache-2.0
datasets:
- gretelai/symptom_to_diagnosis
language:
- en
base_model:
- unsloth/Qwen2.5-0.5B-Instruct
pipeline_tag: text-generation
---

# Model Card for SympQwen-0.5B

## Model Details

SympQwen-0.5B is a fine-tuned variant of the Qwen2.5-0.5B-Instruct language model, adapted for medical symptom-to-diagnosis mapping.
It is trained to generate plausible diagnoses from patient-like symptom descriptions, using the labeled examples in the gretelai/symptom_to_diagnosis dataset.
This makes it suitable for assisting with clinical symptom interpretation in research or educational settings.

## Uses

### Direct Use

Primary Use Cases:

- Assisting medical students in practicing diagnostic reasoning.
- Providing clinicians or educators with a tool for generating potential diagnostic hypotheses from symptom descriptions.
- Serving as a base model for research in AI-based clinical decision support systems.

### Out-of-Scope Use

- Direct clinical diagnosis or patient self-assessment.
- Medical decision-making without oversight from qualified professionals.
- Use in high-stakes environments where erroneous diagnosis could result in harm.

## Bias, Risks, and Limitations

- Small Dataset: With just over 1,000 examples, the model may not generalize well to rare, atypical, or unseen symptom presentations.
- Imbalanced Labels: Some diagnoses (e.g., jaundice, migraine) are under-represented, which may bias outputs toward more frequent classes.
- Synthetic Language: Symptom descriptions are LLM-generated rather than actual patient narratives, and may lack real-world variability or nuance.
- Not a Diagnostic Tool: This model is intended for research and educational augmentation only. It should not replace professional medical evaluation or diagnostic workflows.
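The label imbalance noted above can be checked directly. The following is a minimal sketch, not part of the original training code: `label_distribution` is a hypothetical helper that counts how often each diagnosis appears and lists the classes whose share of the data falls below a chosen threshold. The sample labels and the 5% threshold are made up for illustration; in practice the list would come from the dataset's diagnosis column.

```python
from collections import Counter


def label_distribution(labels, min_share=0.05):
    """Count labels and list those whose share of the data is below min_share.

    Hypothetical helper for illustration; not part of the model's training code.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    # A label is "rare" when it makes up less than min_share of all examples.
    rare = sorted(
        label for label, n in counts.items() if n / total < min_share
    )
    return counts, rare


# Made-up labels for illustration only, not taken from the dataset.
sample = ["allergy"] * 12 + ["pneumonia"] * 10 + ["migraine"] * 1
counts, rare = label_distribution(sample, min_share=0.05)
print(counts.most_common())
print(rare)
```

Classes that show up in `rare` are the ones for which the model's outputs deserve the most scrutiny.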

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("khazarai/SympQwen-0.5B")
model = AutoModelForCausalLM.from_pretrained(
    "khazarai/SympQwen-0.5B",
    device_map={"": 0},  # place the whole model on GPU 0
)

question = "I have a rash on my skin that is itchy and has a different color than the rest of my skin. I also have some firm pimples or breakouts on my skin."

# Format the question with the tokenizer's chat template.
messages = [
    {"role": "user", "content": question}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stream the generated answer to stdout, skipping the prompt itself.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=512,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
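When the reply is needed as a string, for example to log or score it, streaming to stdout is not enough. The sketch below wraps the same steps in a hypothetical `diagnose` helper (not part of the model repository) that takes an already loaded `model` and `tokenizer`, generates without a streamer, and decodes only the newly generated tokens. The default of 128 new tokens is an arbitrary choice for illustration.

```python
def diagnose(model, tokenizer, question, max_new_tokens=128):
    """Return the model's reply to a symptom description as a plain string.

    Hypothetical helper for illustration; expects a loaded causal LM
    and its tokenizer, as created in the snippet above.
    """
    messages = [{"role": "user", "content": question}]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate() returns prompt + completion; keep only the new tokens.
    prompt_len = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With `model`, `tokenizer`, and `question` defined as in the snippet above, `print(diagnose(model, tokenizer, question))` would print the reply once generation finishes.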