ModelHub XC b52904f4f5 初始化项目,由ModelHub XC社区提供模型
Model: hmzBen/medgemma-1.5-medical-q4km
Source: Original Platform
2026-06-09 08:43:15 +08:00

license, language, tags, library_name, pipeline_tag, base_model
license language tags library_name pipeline_tag base_model
other
en
gguf
llama.cpp
medgemma
medical
quantized
llama.cpp text-generation google/medgemma-1.5-4b-it

MedGemma 1.5 Medical Q4_K_M (GGUF)

This repository hosts a GGUF export of MedGemma-1.5-4B-IT, quantized for efficient local inference.

Summary

  • Base model: google/medgemma-1.5-4b-it
  • Format: GGUF (for llama.cpp)
  • Quantization: Q4_K_M (mixed precision)
  • Intended use: local medical assistant workflows, triage support, structured extraction

Files

  • medgemma-1.5-medical-Q4_K_M.gguf (quantized)
  • medgemma-1.5-4b-it-F16.gguf (optional, full precision)

Usage (llama.cpp)

./llama-server \
  -m medgemma-1.5-medical-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --alias medgemma

Usage (Python client)

from medgemma_client import MedGemmaAgent

agent = MedGemmaAgent(base_url="http://localhost:8080")
print(agent.generate_clinical_text("Patient has stiff neck and fever. What is the triage concern?"))

Quantization Notes

This GGUF was produced using an I-Matrix calibrated on a medical mixed-domain dataset:

  • Doctor-patient dialogue
  • Medical facts
  • Diagnostic reasoning

The goal is to preserve clinical reasoning while reducing memory footprint.

Safety and Limitations

This model is not a substitute for professional medical advice. It can make mistakes and must be used with human oversight. Always validate outputs before use in clinical decision-making.

License

MedGemma is distributed under the Health AI Developer Foundations license by Google. Ensure your use and redistribution comply with the model terms:

Acknowledgments

  • Google DeepMind for MedGemma
  • ggml-org for llama.cpp
Description
Model synced from source: hmzBen/medgemma-1.5-medical-q4km
Readme 25 KiB