ModelHub XC e608a72cb8 Initialize project; model provided by the ModelHub XC community
Model: FinaPolat/phi4_adaptableIE_v2
Source: Original Platform
2026-05-12 19:09:26 +08:00

---
license: apache-2.0
base_model: microsoft/phi-4
tags:
- text-generation-inference
- transformers
- unsloth
- phi-4
- information-extraction
- ner
- relation-extraction
- knowledge-graph
- slm
model_creator: FinaPolat
language:
- en
---
# Phi-4-AdaptableIE: Efficient Adaptive Knowledge Graph Extraction
#### A GGUF version of this model is available: https://huggingface.co/FinaPolat/phi4_adaptableIE_v2-gguf
Phi-4-AdaptableIE is a specialized **14.7B-parameter Small Language Model (SLM)** optimized via **Supervised Fine-Tuning (SFT)** for high-precision **joint Named Entity Recognition (NER) and Relation Extraction (RE)**.
Unlike traditional multi-stage pipelines that are prone to cascading error propagation, this model performs entity identification and relational mapping in a single cohesive pass. It is designed to be **ontology-adaptive**, allowing it to conform to dynamic, unseen schemas at inference time through a specialized **Structured Prompt Architecture**.
## 🚀 Model Highlights
- **Joint Extraction:** Unified NER + RE reducing pipeline complexity.
- **Ontology-Adaptive:** Zero-shot adaptation to diverse domains (Astronomy, Music, Healthcare, etc.) via dynamic schema variables.
- **Local & Private:** Optimized for **local CPU-only inference** via GGUF/Ollama (see `FinaPolat/phi4_adaptableIE_v2-gguf`), ensuring data sovereignty without external API dependencies.
- **Instruction Aligned:** Fine-tuned to follow strict negative constraints, ensuring zero conversational filler in outputs.
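
As a rough illustration of the ontology-adaptive idea, the sketch below checks already-parsed triples against the schema supplied at inference time. It assumes the schema is a flat list of allowed names and that each triple is a `(subject, predicate, object)` tuple; `filter_by_schema` is a hypothetical helper, not part of the model repository.

```python
def filter_by_schema(triples, schema):
    """Split triples into those whose predicate is in the schema and those that are not.

    Hypothetical helper: assumes `schema` is a flat list of allowed names and
    each triple is a (subject, predicate, object) tuple.
    """
    allowed = set(schema)
    kept = [t for t in triples if t[1] in allowed]
    rejected = [t for t in triples if t[1] not in allowed]
    return kept, rejected


# Illustrative input, not real model output.
schema = ["CelestialBody", "apoapsis", "averageSpeed"]
triples = [
    ("(19255) 1994 VK8", "averageSpeed", "4.56 km per second"),
    ("(19255) 1994 VK8", "discoverer", "unknown"),
]
kept, rejected = filter_by_schema(triples, schema)
print("kept:", kept)
print("rejected:", rejected)
```

A check like this can flag extractions that drift outside the requested schema before they are written to a knowledge graph.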
## 🛠 Methodology
The model was fine-tuned using **QLoRA** on the **WebNLG** subset of the **Text2KGBench** benchmark. The training process focused on **Conversational Alignment**, ensuring the model treats extraction as a strict logical mapping:
`Prompt = f(task, schema, example, text)`

---
## 📝 Prompting Strategy
To achieve high-fidelity extraction, the model requires a specific prompt structure.
### 1. System Prompt
```json
{
  "role": "system",
  "content": "You are a helpful AI assistant specializing in Information Extraction tasks such as Named Entity Recognition and Relation Extraction. Follow the instructions given by the user."
}
```
### 2. User Prompt Template
```text
Information Extraction is the process of automatically identifying and extracting structured information from unstructured text data... [Context] ...
Always extract numbers, dates, and currency values regardless of the specific task.
The task at hand is {task}.
Here is an example of task execution:
{example}
Analyze the text and targets carefully, identify relevant information.
Extract the information in the following format: `{output_format}`.
If no matching entities are found, return an empty list: [].
Please provide only the extracted information without any explanations.
Schema: {schema}
Text: {inputs}
```
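
The system prompt and user template can be combined into a chat-style message list with plain string formatting. The following is a minimal sketch: `build_messages` is a hypothetical helper, the elided `[Context]` portion of the template is kept exactly as it appears here, and the `example` value is a placeholder you would replace with one worked example.

```python
SYSTEM_PROMPT = (
    "You are a helpful AI assistant specializing in Information Extraction tasks "
    "such as Named Entity Recognition and Relation Extraction. "
    "Follow the instructions given by the user."
)

# Template text copied from this README; the elided "[Context]" part is left as-is.
USER_TEMPLATE = (
    "Information Extraction is the process of automatically identifying and "
    "extracting structured information from unstructured text data... [Context] ...\n"
    "Always extract numbers, dates, and currency values regardless of the specific task.\n"
    "The task at hand is {task}.\n"
    "Here is an example of task execution:\n"
    "{example}\n"
    "Analyze the text and targets carefully, identify relevant information.\n"
    "Extract the information in the following format: `{output_format}`.\n"
    "If no matching entities are found, return an empty list: [].\n"
    "Please provide only the extracted information without any explanations.\n"
    "Schema: {schema}\n"
    "Text: {inputs}"
)


def build_messages(task, schema, example, inputs, output_format):
    """Fill the user template and pair it with the system prompt (hypothetical helper)."""
    user_prompt = USER_TEMPLATE.format(
        task=task,
        schema=schema,
        example=example,
        inputs=inputs,
        output_format=output_format,
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages(
    task="Joint NER and RE",
    schema="['CelestialBody', 'apoapsis', 'averageSpeed']",
    example="<one worked example in the target output format>",  # placeholder
    inputs="(19255) 1994 VK8 has an average speed of 4.56 km per second.",
    output_format="[('subject', 'predicate', 'object')]",
)
print(messages[1]["content"])
```

A message list of this shape is what chat-oriented APIs generally expect, for example `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` in Transformers.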
### 3. 💻 Usage Examples
Option 1: Transformers (Single GPU)
```python
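from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FinaPolat/phi4_adaptableIE_v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

task = "Joint NER and RE"
schema = "['CelestialBody', 'apoapsis', 'averageSpeed']"
inputs = "(19255) 1994 VK8 has an average speed of 4.56 km per second."
# Fills the {output_format} slot of the full user template; the short prompt below omits it.
output_format = "[('subject', 'predicate', 'object')]"

# Abbreviated prompt for illustration; the full template above is the recommended format.
prompt = f"Task: {task}\nSchema: {schema}\nText: {inputs}\nExtract:"
# Move the tokenized inputs to the device chosen by device_map="auto".
model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding: do_sample=False replaces temperature=0.0, which generate() does not accept for sampling.
outputs = model.generate(**model_inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][model_inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))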
```
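
Because the requested output format is a Python-style list of tuples, the decoded text can be turned into structured triples with `ast.literal_eval`. This is a minimal sketch under that assumption: `parse_triples` is a hypothetical helper, and the sample string is illustrative, not actual model output.

```python
import ast


def parse_triples(raw_output):
    """Parse text like "[('s', 'p', 'o')]" into a list of 3-tuples of strings.

    Hypothetical helper: returns [] when no well-formed list is found.
    """
    text = raw_output.strip()
    # Keep only the outermost bracketed span in case extra text surrounds it.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        parsed = ast.literal_eval(text[start:end + 1])
    except (ValueError, SyntaxError, TypeError):
        return []
    if not isinstance(parsed, list):
        return []
    triples = []
    for item in parsed:
        # Skip anything that is not a 3-element tuple or list.
        if isinstance(item, (tuple, list)) and len(item) == 3:
            triples.append(tuple(str(part) for part in item))
    return triples


# Illustrative string in the expected format, not real model output.
sample = "[('(19255) 1994 VK8', 'averageSpeed', '4.56 km per second')]"
for subject, predicate, obj in parse_triples(sample):
    print(subject, "|", predicate, "|", obj)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text produced by a model.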
Option 2: High-Throughput Batch Inference (vLLM)
```python
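from vllm import LLM, SamplingParams

llm = LLM(
    model="FinaPolat/phi4_adaptableIE_v2",
    dtype="bfloat16",
    trust_remote_code=True,
    gpu_memory_utilization=0.9,
    max_model_len=3000,
    enforce_eager=True,
    distributed_executor_backend="uni",
)
sampling_params = SamplingParams(temperature=0.0, max_tokens=256)

# batch_prompts holds one conversation (a list of chat messages) per input text.
# Replace the placeholders with the system prompt and a filled-in user template.
batch_prompts = [
    [
        {"role": "system", "content": "<system prompt>"},
        {"role": "user", "content": "<filled-in user prompt>"},
    ],
]
outputs = llm.chat(batch_prompts, sampling_params=sampling_params, use_tqdm=True)
for output in outputs:
    print(output.outputs[0].text)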
```
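
The `batch_prompts` argument passed to `llm.chat` is a list of conversations, one per input. A minimal sketch of assembling it from already-filled user prompts follows; `build_batch_prompts` is a hypothetical helper and the prompt strings are placeholders.

```python
def build_batch_prompts(system_prompt, user_prompts):
    """Wrap each user prompt in its own conversation (hypothetical helper)."""
    return [
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]
        for user_prompt in user_prompts
    ]


# Placeholder strings; in practice these are the system prompt and
# user templates filled in for each input text.
system_prompt = "<system prompt>"
user_prompts = ["<filled-in user prompt 1>", "<filled-in user prompt 2>"]

batch_prompts = build_batch_prompts(system_prompt, user_prompts)
print(len(batch_prompts), "conversations prepared")
```

Each conversation repeats the system prompt so that every request is self-contained.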
### 4. 📦 Deployment & Hardware Requirements
| Deployment Mode | Quantization | Hardware Requirement | Target Latency |
|-----------------|--------------|------------------------------------------|----------------|
| Server-side | BF16 | 1× NVIDIA A100 / RTX 4090 (24GB+) | Ultra-Low |
| Local Consumer | 4-bit GGUF | 16GB RAM (Apple Silicon / PC CPU) | Moderate |

For CPU-only local execution, use the GGUF version: https://huggingface.co/FinaPolat/phi4_adaptableIE_v2-gguf
### 5. 📜 Citation & Credits
If you use this model in your research, please cite the Text2KGBench framework, the Microsoft Phi-4 technical report, and our work:
https://github.com/FinaPolat/ENEXA_adaptable_extraction
Video: https://www.youtube.com/watch?v=your-video-