---
license: mit
language:
- en
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- phi3
- lora
- peft
- clinical-ai
- model-audit
- text-generation
- fine-tuned
- healthcare
- safetensors
---

# 🏥 phi3-auditor-merged

**Phi-3-mini fine-tuned for clinical AI model auditing.**

This model takes a JSON object of ML performance metrics (AUC, ECE, drift, label shift, etc.) and returns a health category label plus a concise explanation, helping teams audit deployed clinical models for drift, calibration failure, class imbalance, and other issues.

---

## Model Details

| Property | Value |
|---|---|
| **Base Model** | [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) via PEFT |
| **Training Precision** | 8-bit quantized (BitsAndBytesConfig) |
| **Merged Precision** | FP16 (float16 safetensors) |
| **Parameters** | ~3.8B |
| **Model Size** | 7.65 GB (2 safetensors shards) |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **LoRA Dropout** | 0.05 |
| **Target Modules** | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| **Task Type** | Causal Language Modeling |
| **PEFT Version** | 0.18.0 |
| **Training Epochs** | 3 |
| **Final Loss** | ~0.41 |

---

## Intended Use

### What this model does

Given a JSON report of clinical ML model performance metrics, the model:

1. Assigns a **Category** label (e.g. `Calibration Failure`, `Major Drift`, `Class Imbalance Problem`, `Healthy`)
2. Generates a concise **Explanation** with observations and recommendations

### Intended users

- ML engineers monitoring deployed clinical models
- Healthcare data science teams running periodic model audits
- Researchers studying automated model health assessment

### Out-of-scope use

- Not suitable for direct clinical decision-making or patient diagnosis
- Not a replacement for domain expert review of model performance
- Not designed for non-clinical ML tasks
- Should not be used on data types outside its training distribution (non-tabular metrics, images, etc.)

---

## How to Use

### Requirements

```bash
pip install transformers torch accelerate
```

### Basic inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "PhantomAjusshi/phi3-auditor-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
    trust_remote_code=True,  # Required for custom Phi-3 modeling files
)

report = """{
  "auc": 0.863,
  "accuracy": 0.83,
  "precision": 0.79,
  "recall": 0.69,
  "f1": 0.74,
  "ece": 0.278,
  "brier": 0.263,
  "drift": 0.03,
  "missing_rate": 0.003,
  "label_shift": 0.06,
  "pos_rate": 0.10,
  "data_integrity_issues": 0
}"""

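# The system and instruction text mirror the prompt format used during training
prompt = (
    f"<|system|>\nYou are an AI auditor analyzing clinical model performance reports.\n"
    f"<|user|>\nInstruction: Analyze the clinical model report and classify its health.\n\nReport:\n{report}\n"
    f"<|assistant|>\n"
)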

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
        do_sample=True,
    )

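# Decode only the newly generated tokens. Splitting the full decoded text on
# "<|assistant|>" is unreliable because skip_special_tokens=True can strip that marker.
prompt_len = inputs["input_ids"].shape[1]
reply = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()
print(reply)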
```

### Expected output format

```
Category: Calibration Failure
Explanation: High calibration error (ECE 0.278) despite reasonable discrimination (AUC 0.863).
The model's probability outputs are poorly aligned with actual outcomes. Recommend
recalibration using Platt scaling or isotonic regression, and threshold review.
```
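
For downstream tooling it can help to split a reply in this format into its two fields. The following is a minimal sketch, assuming the reply follows the `Category:` / `Explanation:` layout shown above; `parse_reply` is a hypothetical helper, not part of the released model or repository.

```python
import re
from typing import Optional

# Hypothetical helper: matches "Category: <label>" followed by
# "Explanation: <text>", where the explanation may span several lines.
_REPLY_RE = re.compile(
    r"Category:[ \t]*(?P<category>[^\n]+?)[ \t]*\n+"
    r"[ \t]*Explanation:[ \t]*(?P<explanation>.+)",
    re.DOTALL,
)


def parse_reply(reply: str) -> Optional[dict]:
    """Return {'category', 'explanation'} or None if the format is not found."""
    match = _REPLY_RE.search(reply)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces inside the explanation.
    explanation = " ".join(match.group("explanation").split())
    return {"category": match.group("category"), "explanation": explanation}
```

Returning `None` rather than raising lets a caller decide how to handle replies that drift from the expected layout, which can happen with sampled generation.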

### Input metrics reference

| Metric | Description |
|---|---|
| `auc` | Area Under the ROC Curve |
| `accuracy` | Overall classification accuracy |
| `precision` | Positive predictive value |
| `recall` | Sensitivity / true positive rate |
| `f1` | Harmonic mean of precision and recall |
| `ece` | Expected Calibration Error |
| `brier` | Brier score (probabilistic accuracy) |
| `drift` | Feature distribution drift score |
| `missing_rate` | Rate of missing input features |
| `label_shift` | Output label distribution shift |
| `pos_rate` | Positive prediction rate |
| `data_integrity_issues` | Count of detected data quality issues |
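
Before sending a report to the model, it may be worth checking that it contains exactly the keys listed above. The sketch below is one way to do that; `EXPECTED_KEYS` and `check_report` are hypothetical names, and the check only verifies key presence and numeric types because the card does not state valid ranges for each metric.

```python
import json
from numbers import Real

# Keys taken from the input metrics reference table.
EXPECTED_KEYS = (
    "auc", "accuracy", "precision", "recall", "f1", "ece", "brier",
    "drift", "missing_rate", "label_shift", "pos_rate", "data_integrity_issues",
)


def check_report(report_json: str) -> list:
    """Return a list of problems; an empty list means the report looks well-formed."""
    report = json.loads(report_json)
    if not isinstance(report, dict):
        return ["report must be a JSON object"]
    problems = []
    for key in EXPECTED_KEYS:
        if key not in report:
            problems.append(f"missing key: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(report[key], bool) or not isinstance(report[key], Real):
            problems.append(f"non-numeric value for {key}: {report[key]!r}")
    for key in report:
        if key not in EXPECTED_KEYS:
            problems.append(f"unexpected key: {key}")
    return problems
```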

---

## Training Details

### Dataset

- **Name:** Custom synthetic clinical audit dataset (`audit_dataset_v2_5000.json`)
- **Size:** 5,000 labeled samples
- **Split:** 80% train (4,000) / 20% test (1,000)
- **Format:** JSONL — each record has `instruction`, `input` (metrics JSON), `output` (category + explanation)
- **Generation date:** November 17, 2025

Each sample pairs a set of synthetic model performance metrics with a human-written audit label and explanation covering categories such as:
- Healthy / Passing
- Calibration Failure
- Major Drift / Potential Drift
- Class Imbalance Problem
- Data Integrity Issue
- Needs Review / Critical Failure

### Training procedure

The base model was loaded in 8-bit using `BitsAndBytesConfig` and adapted with LoRA targeting the attention projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`). After training, the LoRA adapter was merged into the base model weights using PEFT's `merge_and_unload()` method and saved as full FP16 safetensors.
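
The merge step described above might look roughly like the sketch below. It is not the project's actual script: `merge_adapter`, `adapter_dir`, and `output_dir` are hypothetical names, and reloading the base model in FP16 before merging is an assumption. The LoRA values are copied from the Model Details table.

```python
# LoRA settings as listed in the Model Details table; the dict is illustrative
# and could be passed to peft.LoraConfig(**LORA_KWARGS) at training time.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def merge_adapter(
    adapter_dir: str,
    output_dir: str,
    base_id: str = "microsoft/Phi-3-mini-4k-instruct",
) -> None:
    """Fold a trained LoRA adapter into FP16 base weights and save safetensors."""
    # Heavy imports stay local so that importing this module remains cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the base model is reloaded in FP16 (not 8-bit) for merging.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_dir)

    # merge_and_unload() returns the base model with adapter weights folded in.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
```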

**Prompt format used during training:**

```
<|system|>
You are an AI auditor analyzing clinical model performance reports.
<|user|>
Instruction: Analyze the clinical model report and classify its health.

Report:
{ ...metrics JSON... }
<|assistant|>
Category: <label>
Explanation: <explanation>
```
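
A small helper can reproduce this layout at inference time so that prompts stay consistent with training. This is a sketch under assumptions: `build_prompt` is a hypothetical function, and serializing the report with two-space indentation is a guess at how the training records were formatted.

```python
import json

SYSTEM_PROMPT = "You are an AI auditor analyzing clinical model performance reports."
INSTRUCTION = "Analyze the clinical model report and classify its health."


def build_prompt(report: dict) -> str:
    """Render a metrics dict into the prompt layout shown above."""
    # Assumption: metrics were serialized as indented JSON during training.
    report_json = json.dumps(report, indent=2)
    return (
        f"<|system|>\n{SYSTEM_PROMPT}\n"
        f"<|user|>\nInstruction: {INSTRUCTION}\n\nReport:\n{report_json}\n"
        f"<|assistant|>\n"
    )
```

The returned string ends at the `<|assistant|>` marker, so generation continues with the `Category:` line.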

### Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch size | 4 |
| Gradient accumulation steps | 4 |
| Effective batch size | 16 |
| Learning rate | 1e-4 |
| Warmup ratio | 0.1 |
| Max sequence length | 512 |
| Optimizer | AdamW (default) |
| Precision | FP16 (mixed) |
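
The effective batch size follows directly from the table. As a quick worked check, the sketch below also estimates the number of warmup steps, assuming single-device training, the 675 total optimizer steps shown in this card's training loss table, and a scheduler that rounds the warmup count up (the rounding rule is an assumption).

```python
import math

per_device_batch_size = 4
gradient_accumulation_steps = 4
warmup_ratio = 0.1
total_optimizer_steps = 675  # final step listed in the training loss table

# One optimizer update sees batch_size * accumulation_steps samples.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Assumption: warmup steps are the ratio times total steps, rounded up.
warmup_steps = math.ceil(total_optimizer_steps * warmup_ratio)

print(effective_batch_size, warmup_steps)
```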

### Training loss

| Step | Epoch | Loss |
|---|---|---|
| 50 | 0.22 | 1.623 |
| 100 | 0.44 | 0.657 |
| 150 | 0.67 | 0.444 |
| 200 | 0.89 | 0.420 |
| 300 | 1.33 | 0.413 |
| 450 | 2.00 | 0.412 |
| 600 | 2.67 | 0.408 |
| 675 | 3.00 | ~0.410 |

Loss converged rapidly after the first 150 steps, stabilizing around 0.41 for the remainder of training.

---

## Evaluation

Evaluation uses the held-out test set of 1,000 samples. The `Category:` field is extracted from each generated output and compared with the ground-truth label to compute accuracy and weighted precision, recall, and F1.

> Formal evaluation metrics will be added here once a full benchmark run is completed.
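
The scoring step described in this section could be sketched as follows, assuming the categories have already been extracted from the generated outputs and that "weighted" means each class is weighted by its share of the ground-truth labels (the same convention as scikit-learn's `average="weighted"`). `weighted_scores` is a hypothetical helper written with the standard library only.

```python
from collections import Counter


def weighted_scores(y_true: list, y_pred: list) -> dict:
    """Accuracy plus support-weighted precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)        # ground-truth count per class
    predicted = Counter(y_pred)      # prediction count per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        # A class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Outputs whose `Category:` field cannot be parsed can be mapped to a placeholder label before scoring so that they count as errors rather than being dropped.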

---

## Limitations & Bias

- **Synthetic training data:** The model was trained entirely on synthetically generated audit reports. Real-world clinical model metrics may follow different distributions or contain edge cases not represented in training.
- **Label sensitivity:** The model may be sensitive to metric combinations near decision boundaries between categories.
- **No temporal reasoning:** The model does not reason about metric trends over time — each inference is based on a single snapshot of metrics.
- **English only:** All training data is in English.
- **Not a substitute for expert review:** Outputs should be treated as decision-support, not a final audit verdict.

---

## Repository & Related Work

- **Training code:** [Hospital-Audit-Trained-Model (GitHub)](https://github.com/PhantomAjusshi/Hospital-Audit-Trained-Model)
- **Web application:** [Hospital-Model-Audit-Website (GitHub)](https://github.com/PhantomAjusshi/Hospital-Model-Audit-Website) — a full-stack Next.js + FastAPI interface that uses this model via llama.cpp

---

## Citation

If you use this model in your work, please cite:

```bibtex
@misc{phi3-auditor-merged,
  author       = {PhantomAjusshi},
  title        = {phi3-auditor-merged: Phi-3-mini fine-tuned for clinical AI model auditing},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/PhantomAjusshi/phi3-auditor-merged}
}
```

---

## License

This model is released under the **MIT License**.

The base model ([microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)) is published by Microsoft under its own license terms (MIT on its model card). Please review them before use in commercial or production settings.