---
license: llama3.2
language:
- en
base_model: meta-llama/Llama-3.2-3B-Instruct
library_name: transformers
tags:
- healthcare
- medical
- fine-tuned
- chatbot
- medical-qa
- domain-specific
- lora
- healthcare-ai
pipeline_tag: text-generation
datasets:
- synthetic-medical-qa
---

MEDFIT-LLM-3B: Fine-tuned Llama-3.2-3B for Medical QA

MEDFIT-LLM-3B is a specialized language model fine-tuned from Meta's Llama-3.2-3B-Instruct for healthcare and medical question-answering applications. This model demonstrates significant improvements in direct answer capabilities and medical domain understanding through domain-focused fine-tuning.

Model Details

Model Description

MEDFIT-LLM-3B is a 3 billion parameter language model specifically optimized for healthcare chatbot applications. The model was fine-tuned using LoRA (Low-Rank Adaptation) techniques on a carefully curated dataset of healthcare-related questions and answers, resulting in enhanced performance for medical information dissemination and patient education.

  • Developed by: Aditya Karnam Gururaj Rao, Arjun Jaggi, Sonam Naidu
  • Model type: Causal Language Model (Fine-tuned)
  • Language(s): English
  • License: Llama 3.2 Community License
  • Finetuned from model: meta-llama/Llama-3.2-3B-Instruct
  • Fine-tuning method: LoRA (Low-Rank Adaptation)
  • Training framework: MLX
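To illustrate why LoRA keeps fine-tuning lightweight, the sketch below counts trainable parameters for a single weight matrix adapted as W + B·A. The dimensions and rank are illustrative assumptions (3072 is Llama-3.2-3B's hidden size; the actual rank used for this model is not stated in the card):

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank)."""
    return rank * d_in + d_out * rank

# Illustrative numbers: one 3072x3072 attention projection adapted at rank 8.
full = 3072 * 3072                                # full fine-tuning updates every entry
lora = lora_trainable_params(3072, 3072, rank=8)  # LoRA updates only A and B

print(full)  # 9437184
print(lora)  # 49152 -- roughly 0.5% of the full matrix
```

At rank 8, the adapter trains about 0.5% of the parameters of the matrix it modifies, which is the efficiency gain the Environmental Impact section refers to.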

Model Sources

Performance Highlights

Based on comprehensive evaluation against the base Llama-3.2-3B-Instruct model:

  • Direct Answer Improvement: 30 percentage point increase (from 6.0% to 36.0%)
  • Response Structure: 18% increase in numbered list usage for better organization
  • Overall Improvement Score: 108.2 (highest among evaluated models)
  • Response Length: Slight increase (+2.84%) with more comprehensive answers

Uses

Direct Use

MEDFIT-LLM-3B is designed for healthcare chatbot applications where accurate, well-structured medical information delivery is crucial. The model excels at:

  • Medical Question Answering: Providing direct, accurate responses to healthcare queries
  • Patient Education: Delivering structured, easy-to-understand medical information
  • Healthcare Information Dissemination: Supporting healthcare providers with reliable AI assistance
  • Medical Chatbot Applications: Serving as the backbone for healthcare conversational agents

Downstream Use

The model can be integrated into:

  • Healthcare mobile applications
  • Medical information systems
  • Patient support platforms
  • Telemedicine chatbots
  • Medical education tools

Out-of-Scope Use

Important: This model is NOT intended for:

  • Medical diagnosis or treatment recommendations
  • Emergency medical situations
  • Replacement of professional medical advice
  • Clinical decision-making without human oversight
  • Prescription or medication recommendations

Training Details

Training Data

The model was trained on a carefully curated dataset comprising:

  • Total samples: 6,444 unique healthcare-related question-answer pairs
  • Training set: 5,155 samples
  • Validation set: 644 samples
  • Test set: 645 samples

The dataset was created using:

  • Synthetic data generation: 10,000 initial samples generated using Phi-4
  • Domain-specific curation: Healthcare-focused questions derived from existing research
  • Deduplication: Filtered to remove duplicates, resulting in 6,444 unique samples
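The 5,155/644/645 split counts are consistent with a standard shuffled 80/10/10 split where the test set absorbs rounding remainders. A minimal sketch of such a split (the authors' exact splitting code is not published here; the seed is an assumption):

```python
import random

def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle and split into train/validation/test.
    The test set receives whatever rounding leaves over."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

data = list(range(6444))  # stand-in for the 6,444 deduplicated QA pairs
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 5155 644 645
```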

Training Procedure

Fine-tuning Method

  • Technique: LoRA (Low-Rank Adaptation)
  • Framework: MLX
  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Focus: Healthcare domain specialization

Training Hyperparameters

  • Fine-tuning approach: Domain-focused LoRA adaptation
  • Dataset split: 80% training, 10% validation, 10% testing
  • Training regime: Optimized for healthcare question-answering performance

Evaluation

Testing Data & Metrics

The model was evaluated on:

  • 50 healthcare-specific validation questions
  • Comparative analysis against base Llama-3.2-3B-Instruct
  • Multi-dimensional assessment including direct answer capability, response structure, and generation efficiency

Key Results

Direct Answer Performance:

  • Base model: 6.0% direct answer rate
  • Fine-tuned model: 36.0% direct answer rate
  • Improvement: +30.0 percentage points
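The card does not publish the exact scoring code for the direct answer rate; the glossary defines it as the percentage of responses that begin with a direct answer rather than a preamble. A plausible heuristic implementation, with hypothetical preamble markers, might look like:

```python
# Hypothetical preamble phrases; the actual marker list used in the
# evaluation is not published in this card.
PREAMBLE_MARKERS = (
    "as an ai", "i'm not a doctor", "it's important to note",
    "before answering", "great question",
)

def direct_answer_rate(responses):
    """Percentage of responses whose opening is substantive rather
    than a hedging preamble (illustrative heuristic only)."""
    def is_direct(text):
        opening = text.strip().lower()
        return not any(opening.startswith(m) for m in PREAMBLE_MARKERS)
    if not responses:
        return 0.0
    return 100.0 * sum(is_direct(r) for r in responses) / len(responses)

sample = ["As an AI, I cannot give medical advice...",
          "It's important to note that symptoms vary...",
          "Common symptoms include increased thirst and fatigue."]
print(round(direct_answer_rate(sample), 1))  # 33.3
```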

Response Quality:

  • Enhanced structure with increased use of numbered lists (+18%)
  • Improved organization and systematic presentation
  • Better alignment with healthcare communication standards

Generation Efficiency:

  • Slight increase in generation time (+1.6%)
  • Trade-off between response quality and speed
  • Overall positive impact on response comprehensiveness

Bias, Risks, and Limitations

Limitations

  • Not a substitute for professional medical advice
  • May generate plausible-sounding but incorrect medical information
  • Limited to English language medical contexts
  • Training data may not cover all medical specialties equally
  • Performance may vary across different healthcare subdomains

Recommendations

  • Always verify medical information with qualified healthcare professionals
  • Use as a supplementary tool rather than primary medical resource
  • Implement human oversight in all healthcare applications
  • Regular updates needed to maintain medical accuracy as knowledge evolves
  • Consider integration with retrieval-augmented generation (RAG) for enhanced factual accuracy
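To make the RAG recommendation concrete, here is a minimal retrieval sketch using bag-of-words cosine similarity to prepend a grounding document to the prompt. This is an illustrative toy (real deployments would use embedding-based retrieval and a vetted medical corpus), and the example corpus is invented:

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = bow(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]

corpus = [
    "Type 2 diabetes symptoms include increased thirst and frequent urination.",
    "Hypertension is often called the silent killer.",
]
question = "What are symptoms of diabetes?"
context = retrieve(question, corpus)[0]
prompt = f"Context: {context}\n\nQuestion: {question}"
print(prompt)
```

The retrieved context is then passed to the model alongside the question, anchoring the generated answer in verifiable source material.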

How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("adityak74/medfit-llm-3B")
model = AutoModelForCausalLM.from_pretrained("adityak74/medfit-llm-3B")

# Example usage
prompt = "What are the common symptoms of diabetes?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # cap on generated tokens, excluding the prompt
    do_sample=True,      # temperature only takes effect when sampling
    temperature=0.7,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

Environmental Impact

The fine-tuning process utilized efficient LoRA techniques to minimize computational requirements while maximizing performance improvements. This approach reduces the environmental impact compared to full model training while achieving significant domain-specific enhancements.

Citation

BibTeX:

@inproceedings{rao2025medfit,
  title={MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models},
  author={Rao, Aditya Karnam Gururaj and Jaggi, Arjun and Naidu, Sonam},
  booktitle={2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE)},
  year={2025},
  organization={IEEE}
}

APA: Rao, A. K. G., Jaggi, A., & Naidu, S. (2025). MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models. In 2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE). IEEE.

Glossary

  • QA: Question Answering
  • EHR: Electronic Health Record
  • LoRA: Low-Rank Adaptation - an efficient fine-tuning technique
  • MLX: Apple's machine learning framework for Apple silicon, used here for fine-tuning
  • Direct Answer Rate: Percentage of responses that begin with direct answers rather than preambles

Model Card Authors

  • Aditya Karnam Gururaj Rao (Zefr Inc, LA, USA)
  • Arjun Jaggi (HCLTech, LA, USA)
  • Sonam Naidu (LexisNexis, USA)

Model Card Contact

Disclaimer

This model is designed for educational and informational purposes in healthcare contexts. It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of qualified healthcare providers with questions regarding medical conditions or treatments.