---
license: apache-2.0
language: en
library_name: transformers
tags:
- text-generation
- json
- qol
- hernia
- healthcare
- llama-3
- fine-tuned
- 8k-context
base_model: meta-llama/Meta-Llama-3-8B-Instruct
---
# Llama-3-8B-Hernia-Analyst-600-Patients-8k

This is a specialized, fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`, designed to act as an "AI Research Assistant" for analyzing patient narratives related to Abdominal Wall Hernia (AWH).

Compared with previous versions, this model was fine-tuned on a larger dataset of **600 synthetic patients** and trained with the **full 8192-token context window**. It can therefore analyze longer, more complex patient narratives without truncation up to that limit, which is intended to yield a more accurate and comprehensive analysis.

The model's primary function is to take unstructured, free-text patient stories as input and transform them into structured, multi-level JSON output. This output adheres to a specific Quality of Life (QoL) framework derived from clinical research, notably the work published in *Hernia (2022) 26:795–808*.

## Model Description

The core objective of this model is to automate and standardize the qualitative analysis of patient-reported outcomes. It has been trained to identify and structure information across five key domains:

- Body Image
- Mental Health
- Symptoms and Function
- Interpersonal Relationships
- Employment

The model produces a detailed JSON object that includes an executive summary, a ranked list of the most prominent QoL domains, and a deep-dive analysis for each domain, identifying relevant subthemes and clinical concepts mentioned by the patient.

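As a rough illustration of how downstream code might sanity-check this output, the sketch below validates a parsed analysis against the five domains listed above. The key names (`executive_summary`, `ranked_domains`, `domain_analysis`) and the helper `check_analysis` are hypothetical placeholders, not the model's documented schema; adjust them to the keys the model actually emits.

```python
# Hypothetical sketch: the key names below are assumptions,
# not the model's documented output schema.
QOL_DOMAINS = {
    "Body Image",
    "Mental Health",
    "Symptoms and Function",
    "Interpersonal Relationships",
    "Employment",
}

# Assumed top-level keys, loosely based on the description above.
ASSUMED_KEYS = ("executive_summary", "ranked_domains", "domain_analysis")


def check_analysis(analysis: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in ASSUMED_KEYS:
        if key not in analysis:
            problems.append(f"missing key: {key}")
    # Every ranked domain should be one of the five known domains.
    for name in analysis.get("ranked_domains", []):
        if name not in QOL_DOMAINS:
            problems.append(f"unknown domain in ranking: {name}")
    # Every per-domain deep dive should be keyed by a known domain.
    for name in analysis.get("domain_analysis", {}):
        if name not in QOL_DOMAINS:
            problems.append(f"unknown domain in analysis: {name}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log every deviation when reviewing many generated analyses.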
## Intended Use

This model is intended for **research and prototyping purposes only**. Its primary use case is to process long-form patient narratives (e.g., detailed interview transcripts or comprehensive questionnaires) and generate a structured, machine-readable analysis. This analysis can support large-scale research and data visualization, or help clinicians quickly understand the key QoL issues for a patient.

**Disclaimer: This is not a medical device.** The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification and interpretation by a qualified healthcare professional.

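For the large-scale research use case described above, one simple aggregation is to count how often each domain is ranked first across many analyses. The sketch below assumes each parsed analysis carries a `ranked_domains` list ordered from most to least prominent; that key name and the helper `tally_top_domains` are hypothetical placeholders, not a documented part of the model's output.

```python
from collections import Counter


def tally_top_domains(analyses: list[dict]) -> Counter:
    """Count how often each domain appears first in the assumed `ranked_domains` list."""
    counts = Counter()
    for analysis in analyses:
        ranked = analysis.get("ranked_domains") or []  # hypothetical key name
        if ranked:
            counts[ranked[0]] += 1
    return counts
```

Analyses with a missing or empty ranking are skipped rather than counted, so the totals reflect only narratives for which a top domain was reported.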
## How to Use

The model expects prompts formatted in the Llama 3 Instruct template. The following Python code, written for a Google Colab notebook, demonstrates how to load the model and run inference on a new patient narrative.

```python
# --- INSTALL CELL (Colab): run this cell on its own ---
# This installs specific, stable versions of the libraries known to work well
# together in the Colab environment.

!pip uninstall -y sentence-transformers
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

# After installation, restart the runtime once so the changes take effect.
# This is a standard procedure in Colab. Killing the process ends this cell,
# so everything below must run in a separate cell after the restart.
import os
os.kill(os.getpid(), 9)

# --- NEW CELL: run after the runtime has restarted ---
import json

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# --- 1. CONFIGURATION ---
# The ID of the 8k-context model on the Hugging Face Hub
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-600-Patients-8k"

# --- 2. LOAD MODEL AND TOKENIZER ---
print(f"Loading fine-tuned model: {model_name}")

# Use 4-bit quantization for efficient inference on consumer GPUs (like in Colab)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the model from the Hub with quantization
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # Automatically use the GPU if available
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()  # Set the model to evaluation mode

print("✅ Model loaded successfully!")

# --- 3. PREPARE YOUR INPUT ---
# This can be a very long and detailed patient narrative
test_narrative = """
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again. It's been really tough mentally. I feel a sense of dread every morning when I wake up, just knowing the discomfort is waiting for me. I've become irritable and I'm not pleasant to be around, which is straining my relationship with my family.
"""

# Format the input using the exact Llama 3 Instruct template the model was trained on
instruction = "Your sole function is to be a structured data generator. Analyze the patient narrative and produce a single, valid JSON object as your only output. Adhere strictly to the required format and terminology from the provided knowledge base."
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# --- 4. GENERATE ANALYSIS ---
print("\n🚀 Generating analysis...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,  # Give the model plenty of space for its JSON output
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Extract and print the JSON from the model's response.
# generate() returns a batch of sequences, so decode the first (and only) one.
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
try:
    # With special tokens stripped, the assistant header is expected to
    # decode to "assistant\n\n"; the response follows it.
    assistant_marker = 'assistant\n\n'
    assistant_response_start = decoded_output.find(assistant_marker)
    response_part = decoded_output[assistant_response_start + len(assistant_marker):].strip()
    # Keep everything from the first "{" to the last "}"
    json_start = response_part.find('{')
    json_end = response_part.rfind('}') + 1
    json_string = response_part[json_start:json_end]

    print("\n--- ✅ MODEL-GENERATED ANALYSIS ---")
    parsed_json = json.loads(json_string)
    print(json.dumps(parsed_json, indent=2))
except Exception as e:
    print("\n--- 🚨 ERROR: Could not parse the model's response. ---")
    print(f"Error: {e}")
    print("\nFull raw output for debugging:")
    print(decoded_output)
```
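The first-brace-to-last-brace slicing in the script above can fail if the model emits extra text containing braces after the JSON object. As one possible alternative (a sketch, not part of the original script), the hypothetical helper `extract_first_json` below uses the standard library's `json.JSONDecoder.raw_decode` to parse the first valid JSON object in a string and ignore anything that follows it.

```python
import json


def extract_first_json(text: str) -> dict:
    """Return the first JSON object found in `text`, ignoring surrounding text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and tolerates trailing characters after it.
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # Not valid JSON from this brace; try the next one.
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in text")
```

In the script above, this helper could be called on `decoded_output` in place of the manual `find`/`rfind` slicing, with the `ValueError` handled by the existing `except` branch.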