---
license: apache-2.0
language: en
library_name: transformers
tags:
- text-generation
- json
- qol
- hernia
- healthcare
- llama-3
- fine-tuned
- 8k-context
base_model: meta-llama/Meta-Llama-3-8B-Instruct
---

# Llama-3-8B-Hernia-Analyst-600-Patients-8k

This is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`, designed to act as an "AI Research Assistant" for analyzing patient narratives related to Abdominal Wall Hernia (AWH).

Compared with previous versions, this model was fine-tuned on a larger dataset of **600 synthetic patients** and trained with the **full 8192-token context window**. This lets it analyze longer, more complex patient narratives without truncation, for a more accurate and comprehensive analysis.

The model takes unstructured, free-text patient stories as input and transforms them into a structured, multi-level JSON output. This output follows a Quality of Life (QoL) framework derived from clinical research, notably the work published in *Hernia (2022) 26:795–808*.

## Model Description

The core objective of this model is to automate and standardize the qualitative analysis of patient-reported outcomes. It has been trained to identify and structure information across five key domains:
- Body Image
- Mental Health
- Symptoms and Function
- Interpersonal Relationships
- Employment

The model produces a detailed JSON object that includes an executive summary, a ranked list of the most prominent QoL domains, and a deep-dive analysis of each domain that identifies the relevant subthemes and clinical concepts mentioned by the patient.
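
Since the output is meant to be machine-readable, a quick structural check on each parsed result can catch malformed analyses early. The sketch below is illustrative only: this card does not publish the exact schema, so the top-level key names (`executive_summary`, `ranked_domains`, `domain_analysis`) and the helper `check_analysis` are hypothetical and should be adjusted to match the model's real output. Only the five domain names come from the list above.

```python
import json

# The five QoL domains listed in this model card.
QOL_DOMAINS = {
    "Body Image",
    "Mental Health",
    "Symptoms and Function",
    "Interpersonal Relationships",
    "Employment",
}

# ASSUMPTION: these top-level key names are hypothetical placeholders;
# inspect a real model output and rename them accordingly.
ASSUMED_KEYS = ("executive_summary", "ranked_domains", "domain_analysis")


def check_analysis(analysis: dict) -> list:
    """Return a list of problems found in a parsed analysis (empty if none)."""
    problems = [f"missing key: {key}" for key in ASSUMED_KEYS if key not in analysis]
    for name in analysis.get("ranked_domains", []):
        if name not in QOL_DOMAINS:
            problems.append(f"unknown domain: {name}")
    return problems


# A hand-written stand-in for a parsed model output (not real model output).
example = json.loads(
    '{"executive_summary": "placeholder summary", '
    '"ranked_domains": ["Symptoms and Function", "Body Image"], '
    '"domain_analysis": {}}'
)
print(check_analysis(example))  # prints []
```

A check like this is deliberately shallow; it only confirms the overall shape, not the clinical content.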

## Intended Use

This model is intended for **research and prototyping purposes only**. Its primary use case is processing long-form patient narratives (e.g., detailed interview transcripts or comprehensive questionnaires) into a structured, machine-readable analysis. This can support large-scale research and data visualization, or help clinicians quickly understand a patient's key QoL issues.

**Disclaimer: This is not a medical device.** The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification and interpretation by a qualified healthcare professional.

## How to Use

The model expects prompts formatted in the Llama 3 Instruct template. The following Python code, written for a Colab notebook, demonstrates how to load the model and run inference on a new patient narrative.

```python
# --- Cell 1: install dependencies (Colab) ---
# This installs specific, stable versions of the libraries known to work well together
# in the Colab environment.

!pip uninstall -y sentence-transformers
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

# After installation, the runtime must be restarted once for the changes to take effect.
# This is a standard procedure in Colab; killing the process triggers the restart.
import os
os.kill(os.getpid(), 9)

# --- Cell 2: run this in a separate cell after the runtime has restarted ---
import json

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# --- 1. CONFIGURATION ---
# The ID of the 8k-context model on the Hugging Face Hub
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-600-Patients-8k"

# --- 2. LOAD MODEL AND TOKENIZER ---
print(f"Loading fine-tuned model: {model_name}")

# Use 4-bit quantization for efficient inference on consumer GPUs (like in Colab)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the model from the Hub with quantization
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # Automatically use the GPU if available
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()  # Set the model to evaluation mode

print("✅ Model loaded successfully!")

# --- 3. PREPARE YOUR INPUT ---
# This can be a very long and detailed patient narrative
test_narrative = """
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again. It's been really tough mentally. I feel a sense of dread every morning when I wake up, just knowing the discomfort is waiting for me. I've become irritable and I'm not pleasant to be around, which is straining my relationship with my family.
"""

# Format the input using the exact Llama 3 Instruct template the model was trained on
instruction = "Your sole function is to be a structured data generator. Analyze the patient narrative and produce a single, valid JSON object as your only output. Adhere strictly to the required format and terminology from the provided knowledge base."
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# --- 4. GENERATE ANALYSIS ---
print("\n🚀 Generating analysis...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,  # Give the model plenty of space for its JSON output
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens (everything after the prompt).
# generate() returns a batch, so index the first (and only) sequence.
prompt_length = inputs["input_ids"].shape[1]
response_part = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()

# Extract and print the JSON from the model's response
try:
    json_start = response_part.find('{')
    json_end = response_part.rfind('}') + 1
    if json_start == -1 or json_end == 0:
        raise ValueError("no JSON object found in the response")
    json_string = response_part[json_start:json_end]

    parsed_json = json.loads(json_string)
    print("\n--- ✅ MODEL-GENERATED ANALYSIS ---")
    print(json.dumps(parsed_json, indent=2))
except Exception as e:
    print("\n--- 🚨 ERROR: Could not parse the model's response. ---")
    print(f"Error: {e}")
    print("\nFull raw output for debugging:")
    print(response_part)
```
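
For longer narratives it can help to check the token budget before generating, since the prompt and the generated JSON share the same 8192-token window. The sketch below rebuilds the single-turn prompt string used above and checks that the prompt tokens plus `max_new_tokens` fit in that window. The helper names `build_prompt` and `fits_context` are hypothetical (they are not part of this model or of `transformers`), and `count_tokens` is any callable you supply, for example `lambda s: len(tokenizer(s)["input_ids"])`. The whitespace counter in the last lines is only a stand-in so the example runs without downloading the model.

```python
from typing import Callable

CONTEXT_WINDOW = 8192  # context length stated in this model card


def build_prompt(instruction: str, narrative: str) -> str:
    """Return a single-turn Llama 3 Instruct prompt with the layout used above."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{instruction}\n\n**Patient Narrative:**\n{narrative}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def fits_context(
    prompt: str,
    count_tokens: Callable[[str], int],
    max_new_tokens: int = 4096,
    window: int = CONTEXT_WINDOW,
) -> bool:
    """True if the prompt plus the generation budget fits in the context window."""
    return count_tokens(prompt) + max_new_tokens <= window


# Stand-in token counter (whitespace split); use the real tokenizer in practice.
prompt = build_prompt("Produce a single JSON object.", "The pain is the worst part.")
print(fits_context(prompt, count_tokens=lambda s: len(s.split())))
```

With `max_new_tokens=4096`, as in the snippet above, this leaves at most 4096 tokens for the prompt; lowering `max_new_tokens` frees more room for the narrative.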