Model: hmzBen/medgemma-1.5-medical-q4km Source: Original Platform
license, language, tags, library_name, pipeline_tag, base_model
| license | language | tags | library_name | pipeline_tag | base_model | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| other |
|
|
llama.cpp | text-generation | google/medgemma-1.5-4b-it |
MedGemma 1.5 Medical Q4_K_M (GGUF)
This repository hosts a GGUF export of MedGemma-1.5-4B-IT, quantized for efficient local inference.
Summary
- Base model:
google/medgemma-1.5-4b-it - Format: GGUF (for
llama.cpp) - Quantization:
Q4_K_M(mixed precision) - Intended use: local medical assistant workflows, triage support, structured extraction
Files
medgemma-1.5-medical-Q4_K_M.gguf(quantized)medgemma-1.5-4b-it-F16.gguf(optional, full precision)
Usage (llama.cpp)
./llama-server \
-m medgemma-1.5-medical-Q4_K_M.gguf \
--host 0.0.0.0 \
--port 8080 \
--alias medgemma
Usage (Python client)
from medgemma_client import MedGemmaAgent
agent = MedGemmaAgent(base_url="http://localhost:8080")
print(agent.generate_clinical_text("Patient has stiff neck and fever. What is the triage concern?"))
Quantization Notes
This GGUF was produced using an I-Matrix calibrated on a medical mixed-domain dataset:
- Doctor-patient dialogue
- Medical facts
- Diagnostic reasoning
The goal is to preserve clinical reasoning while reducing memory footprint.
Safety and Limitations
This model is not a substitute for professional medical advice. It can make mistakes and must be used with human oversight. Always validate outputs before use in clinical decision-making.
License
MedGemma is distributed under the Health AI Developer Foundations license by Google. Ensure your use and redistribution comply with the model terms:
Acknowledgments
- Google DeepMind for MedGemma
- ggml-org for llama.cpp
Description