---
library_name: transformers
tags:
- sft
- lora
- unsloth
license: apache-2.0
datasets:
- gretelai/symptom_to_diagnosis
language:
- en
base_model:
- unsloth/Qwen2.5-0.5B-Instruct
pipeline_tag: text-generation
---

# Model Card for SympQwen-0.5B

## Model Details

SympQwen-0.5B is a fine-tuned variant of the Qwen2.5-0.5B-Instruct language model, adapted specifically for medical symptom-to-diagnosis mapping. It is trained to generate plausible diagnoses from patient-like descriptions of symptoms, based on the labeled examples in the gretelai/symptom_to_diagnosis dataset. This makes it suitable for assisting with clinical symptom interpretation in research or educational settings.

## Uses

### Direct Use

Primary Use Cases:

- Assisting medical students in practicing diagnostic reasoning.
- Providing clinicians or educators with a tool for generating potential diagnostic hypotheses from symptom descriptions.
- Serving as a base model for research in AI-based clinical decision support systems.

### Out-of-Scope Use

- Direct clinical diagnosis or patient self-assessment.
- Medical decision-making without oversight from qualified professionals.
- Use in high-stakes environments where an erroneous diagnosis could result in harm.

## Bias, Risks, and Limitations

- Small Dataset: With just over 1,000 examples, the model may not generalize well to rare, atypical, or unseen symptom presentations.
- Imbalanced Labels: Some diagnoses (e.g., jaundice, migraine) are under-represented, which may bias outputs toward more frequent classes.
- Synthetic Language: Symptom descriptions are LLM-generated rather than actual patient narratives, and may lack real-world variability or nuance.
- Not a Diagnostic Tool: This model is intended for research and educational augmentation only. It should not replace professional medical evaluation or diagnostic workflows.

## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("khazarai/SympQwen-0.5B")
model = AutoModelForCausalLM.from_pretrained(
    "khazarai/SympQwen-0.5B",
    device_map={"": 0},  # place the whole model on GPU 0
)

question = "I have a rash on my skin that is itchy and has a different color than the rest of my skin. I also have some firm pimples or breakouts on my skin."

messages = [
    {"role": "user", "content": question}
]

# Render the conversation with the model's chat template and
# append the assistant prompt so generation starts a new reply.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stream generated tokens to stdout, skipping the echoed prompt.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=512,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
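
The example above streams the answer to stdout. When the reply is needed as a string instead (for logging or batch evaluation, say), a variant like the following may help. It is a minimal sketch, not part of the original card: `build_messages` and `diagnose` are hypothetical helper names, and greedy decoding with `max_new_tokens=64` is an assumed default rather than a recommended setting.

```python
def build_messages(question):
    """Wrap a symptom description in the single-turn chat format used above."""
    return [{"role": "user", "content": question}]


def diagnose(model, tokenizer, question, max_new_tokens=64):
    """Return the model's reply as a string instead of streaming it.

    `model` and `tokenizer` are assumed to be loaded as in the example above.
    The name `diagnose` and its defaults are illustrative assumptions.
    """
    text = tokenizer.apply_chat_template(
        build_messages(question),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, so repeated calls give the same text
    )
    # generate() returns prompt + completion; keep only the newly generated tokens.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Usage, assuming `model` and `tokenizer` from the example above:
# print(diagnose(model, tokenizer, "I have a headache and blurred vision."))
```

Slicing off the prompt tokens before decoding plays the same role as `skip_prompt=True` in the streaming version. Any output remains a hypothesis for research or teaching, subject to the limitations listed above.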