---
license: mit
language:
- en
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- phi3
- lora
- peft
- clinical-ai
- model-audit
- text-generation
- fine-tuned
- healthcare
- safetensors
---

# 🏥 phi3-auditor-merged

**Phi-3-mini fine-tuned for clinical AI model auditing.**

This model takes a JSON object of ML performance metrics (AUC, ECE, drift, label shift, etc.) and returns a structured health classification label plus a detailed explanation. It helps teams audit deployed clinical models for drift, calibration failure, class imbalance, and other issues.

---

## Model Details

| Property | Value |
|---|---|
| **Base Model** | [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) via PEFT |
| **Training Precision** | 8-bit quantized (BitsAndBytesConfig) |
| **Merged Precision** | FP16 (float16 safetensors) |
| **Parameters** | ~3.8B |
| **Model Size** | 7.65 GB (2 safetensors shards) |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **LoRA Dropout** | 0.05 |
| **Target Modules** | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| **Task Type** | Causal Language Modeling |
| **PEFT Version** | 0.18.0 |
| **Training Epochs** | 3 |
| **Final Loss** | ~0.41 |

---

## Intended Use

### What this model does

Given a JSON report of clinical ML model performance metrics, the model:

1. Assigns a **Category** label (e.g. `Calibration Failure`, `Major Drift`, `Class Imbalance Problem`, `Healthy`)
2. Generates a concise **Explanation** with observations and recommendations

### Intended users

- ML engineers monitoring deployed clinical models
- Healthcare data science teams running periodic model audits
- Researchers studying automated model health assessment

### Out-of-scope use

- Not suitable for direct clinical decision-making or patient diagnosis
- Not a replacement for domain expert review of model performance
- Not designed for non-clinical ML tasks
- Should not be used on data types outside its training distribution (non-tabular metrics, images, etc.)

---

## How to Use

### Requirements

```bash
pip install transformers torch accelerate
```

### Basic inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "PhantomAjusshi/phi3-auditor-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
    trust_remote_code=True,  # Required for custom Phi-3 modeling files
)

report = """{
  "auc": 0.863,
  "accuracy": 0.83,
  "precision": 0.79,
  "recall": 0.69,
  "f1": 0.79,
  "ece": 0.278,
  "brier": 0.263,
  "drift": 0.03,
  "missing_rate": 0.003,
  "label_shift": 0.06,
  "pos_rate": 0.10,
  "data_integrity_issues": 0
}"""

# The system prompt matches the one used during training (see "Training procedure").
prompt = (
    "<|system|>\nYou are an AI auditor analyzing clinical model performance reports.\n"
    f"<|user|>\nInstruction: Analyze the clinical model report and classify its health.\n\nReport:\n{report}\n"
    "<|assistant|>\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
        do_sample=True,
    )

# Decode only the newly generated tokens. Splitting the decoded text on
# "<|assistant|>" is unreliable because skip_special_tokens=True strips that marker.
prompt_len = inputs["input_ids"].shape[-1]
reply = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()
print(reply)
```

### Expected output format

```
Category: Calibration Failure
Explanation: High calibration error (ECE 0.278) despite reasonable discrimination (AUC 0.863). The model's probability outputs are poorly aligned with actual outcomes. Recommend recalibration using Platt scaling or isotonic regression, and threshold review.
```

### Input metrics reference

| Metric | Description |
|---|---|
| `auc` | Area Under the ROC Curve |
| `accuracy` | Overall classification accuracy |
| `precision` | Positive predictive value |
| `recall` | Sensitivity / true positive rate |
| `f1` | Harmonic mean of precision and recall |
| `ece` | Expected Calibration Error |
| `brier` | Brier score (probabilistic accuracy) |
| `drift` | Feature distribution drift score |
| `missing_rate` | Rate of missing input features |
| `label_shift` | Output label distribution shift |
| `pos_rate` | Positive prediction rate |
| `data_integrity_issues` | Count of detected data quality issues |

---

## Training Details

### Dataset

- **Name:** Custom synthetic clinical audit dataset (`audit_dataset_v2_5000.json`)
- **Size:** 5,000 labeled samples
- **Split:** 80% train (4,000) / 20% test (1,000)
- **Format:** JSONL; each record has `instruction`, `input` (metrics JSON), and `output` (category + explanation)
- **Generation date:** November 17, 2025

Each sample pairs a set of synthetic model performance metrics with a human-written audit label and explanation covering categories such as:

- Healthy / Passing
- Calibration Failure
- Major Drift / Potential Drift
- Class Imbalance Problem
- Data Integrity Issue
- Needs Review / Critical Failure

### Training procedure

The base model was loaded in 8-bit using `BitsAndBytesConfig` and adapted with LoRA targeting the attention projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`). After training, the LoRA adapter was merged into the base model weights using `PeftModel.merge_and_unload()` and saved as full FP16 safetensors.

**Prompt format used during training:**

```
<|system|>
You are an AI auditor analyzing clinical model performance reports.
<|user|>
Instruction: Analyze the clinical model report and classify its health.

Report:
{ ...metrics JSON... }
<|assistant|>
Category: