---
library_name: transformers
tags:
- unsloth
- trl
- sft
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
metrics:
- accuracy
- bleu
- rouge
---

# Model Card for MediLlama-3.2

A fine-tuned version of Meta's Llama 3.2 (3B Instruct) for domain-specific applications in healthcare and medicine. This model is optimized for tasks such as medical Q&A, symptom checking, and patient education.

## Model Details

### Model Description

This model is a domain-adapted version of Llama 3.2 3B Instruct. It was trained with supervised fine-tuning (SFT) on medical datasets to handle English-language healthcare scenarios, including diagnostic queries, treatment suggestions, and general medical advice.

- **Developed by:** InferenceLab
- **Model type:** Medical Chatbot
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Llama-3.2-3B-Instruct

## Uses

### Direct Use

MediLlama-3.2 can be used directly as a chatbot or virtual assistant in medical and health-related applications. It is suited to educational content, initial symptom triage, and research purposes.

### Downstream Use

The model can be integrated into larger telehealth systems, clinical documentation tools, or diagnostic assistants after further task-specific fine-tuning.

### Out-of-Scope Use

- Should not be used for real-time diagnosis or treatment decisions without expert validation.
- Not suitable for high-risk or life-threatening emergency response.
- Not trained on pediatric or highly specialized medical domains.

## Bias, Risks, and Limitations

Although the model is trained on medical data, it may still exhibit:

- Biases from source data
- Hallucinations or incorrect suggestions
- Outdated or non-region-specific medical advice

### Recommendations

Users should validate outputs with certified medical professionals. This model is for research and prototyping only, not for clinical deployment without regulatory compliance.
## How to Get Started with the Model

```python
import torch
from transformers import pipeline

model_id = "InferenceLab/MediLlama-3.2"

# Load the model in bfloat16; device_map="auto" places it on the available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input: a system prompt followed by the user's message.
messages = [
    {"role": "system", "content": "You are a helpful Medical assistant."},
    {"role": "user", "content": "Hi! How are you?"},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The last entry of generated_text is the assistant's reply message.
print(outputs[0]["generated_text"][-1])
```

## Training Details

### Training Data

The model was trained on cleaned and preprocessed medical QA datasets, synthetic doctor-patient conversations, and publicly available health forums. Protected health information (PHI) was removed.

### Training Procedure

Supervised fine-tuning (SFT) using the TRL and Unsloth libraries.

#### Preprocessing

Inputs were tokenized with the Llama tokenizer using a special medical instruction format.

#### Training Hyperparameters

* **Training regime:** bf16 mixed precision
* **Learning rate:** 1e-5

#### Speeds, Sizes, Times

* **Training time:** ~12 hours on 4×A100 GPUs

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A subset of unseen medical QA pairs, synthetic test cases, and MedQA-derived examples.

#### Factors

* Input prompt complexity
* Use of medical terminology
* Chat length

#### Metrics

* **Accuracy:** 81.3%
* **BLEU:** 34.5
* **ROUGE-L:** 62.2

### Results

#### Summary

The model generalizes well to unseen prompts and performs competitively on general medical dialogue. Further tuning is needed for specialty areas such as oncology or rare diseases.

## Model Examination

Explainability tools such as LLaMA-MedLens (if available) are suggested for interpreting model decisions.
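
One lightweight way to examine individual outputs is to score them against a reference answer with the same kind of overlap metric reported under Metrics. The card does not say which ROUGE implementation or tokenization was used, so the following is only a minimal sketch of the ROUGE-L F-measure. The function names `lcs_length` and `rouge_l_f1`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration, and scores from this sketch are not expected to reproduce the reported figure.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F-measure between a reference and a candidate string.

    Assumption: lowercased, whitespace-split tokens; real evaluations
    typically use a dedicated tokenizer and optional stemming.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of candidate tokens in the LCS
    recall = lcs / len(ref_tokens)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_l_f1(reference_answer, model_reply)` on a handful of test prompts gives a per-example score between 0 and 1. Low-scoring examples are natural candidates for manual review by a medical professional.
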
## Environmental Impact

* **Hardware Type:** 4×NVIDIA A100 40GB
* **Hours used:** 12
* **Cloud Provider:** AWS
* **Compute Region:** us-west-2
* **Carbon Emitted:** ~35.8 kg CO2eq (estimated)

## Technical Specifications

### Model Architecture and Objective

* Based on Meta Llama 3.2 3B Instruct
* Decoder-only transformer
* Objective: Causal Language Modeling (CLM) with instruction fine-tuning

### Compute Infrastructure

#### Hardware

* 4×NVIDIA A100 40GB

#### Software

* Python 3.10
* Transformers (v4.40+)
* TRL
* Unsloth
* PyTorch 2.1

## Glossary

* **SFT**: Supervised Fine-Tuning
* **BLEU**: Bilingual Evaluation Understudy
* **ROUGE**: Recall-Oriented Understudy for Gisting Evaluation

## More Information

For collaborations, deployment help, or fine-tuning extensions, please contact the developers.

## Model Card Authors

* InferenceLab Team