---
license: gemma
language:
- en
pipeline_tag: text-generation
tags:
- fact-checking
- hallucination-detection
- rag
- compliance
- guardrails
- nli
- gemma
base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
---
# FlashCheck-1B: The Enterprise Logic Engine
## Model Description
**FlashCheck-1B** is a Gemma 3 (1B) fine-tune specialized for **Contextual Policy Adherence** and **Hallucination Detection**.
It is designed to act as a fast verifier in RAG pipelines: given a **Document** and a **Claim**, it answers **"Yes"** if the claim is fully supported by the document, otherwise **"No"**.
- **Developer:** Nehme AI Labs
- **Training Base:** `unsloth/gemma-3-1b-it-unsloth-bnb-4bit` (Gemma family)
- **License/Terms:** Gemma (see Gemma terms associated with the base model)
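
As a rough illustration of the verifier role described above, the sketch below splits candidate claims into supported and unsupported sets. The names `split_claims`, `Verifier`, and `keyword_stub` are hypothetical and not part of this repo; `keyword_stub` is only a stand-in so the example runs without the model, and in practice the callable would wrap a FlashCheck-1B call.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type: any callable that returns True when the claim is supported.
Verifier = Callable[[str, str], bool]


def split_claims(
    document: str, claims: Iterable[str], verify: Verifier
) -> Tuple[List[str], List[str]]:
    """Partition claims into (supported, unsupported) using the given verifier."""
    supported: List[str] = []
    unsupported: List[str] = []
    for claim in claims:
        if verify(document, claim):
            supported.append(claim)
        else:
            unsupported.append(claim)
    return supported, unsupported


def keyword_stub(document: str, claim: str) -> bool:
    """Stand-in verifier (NOT FlashCheck): true if the claim appears verbatim."""
    return claim.lower() in document.lower()


if __name__ == "__main__":
    doc = "The user must not share API keys."
    ok, bad = split_claims(
        doc, ["must not share API keys", "may share passwords"], keyword_stub
    )
    print("supported:", ok)
    print("unsupported:", bad)
```

Passing the verifier as a callable keeps the pipeline logic independent of how the model is served (Transformers or GGUF).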
## What's in this repo
- **Transformers (standalone):** `config.json` + `model.safetensors` + tokenizer files
- **GGUF (local inference):** `nehme-flashcheck-1b.Q8_0.gguf` (or in `gguf/` if you placed it there)
## Intended behavior
- Input: **Document** (premise) + **Claim** (hypothesis)
- Output: **"Yes"** or **"No"** (short, deterministic; use greedy decoding)
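
Because downstream code usually needs a boolean rather than a string, a small parser can normalize the output. This is a minimal sketch: the helper name `parse_verdict` and the choice to return `None` for anything other than "Yes" or "No" are assumptions, not behavior defined by the model.

```python
from typing import Optional


def parse_verdict(raw: str) -> Optional[bool]:
    """Map raw model output to True ("Yes"), False ("No"), or None (anything else)."""
    # Drop surrounding whitespace, quotes, and trailing sentence punctuation.
    cleaned = raw.strip().strip("\"'").rstrip(".!").strip().lower()
    if cleaned == "yes":
        return True
    if cleaned == "no":
        return False
    # Unexpected output: let the caller decide (for example, treat as unsupported).
    return None


if __name__ == "__main__":
    for sample in [" Yes\n", '"No."', "Maybe"]:
        print(repr(sample), "->", parse_verdict(sample))
```

Treating `None` as "not supported" is the conservative choice for a guardrail, since an unparseable answer should not let a claim through.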
## Usage
### 1) Python (Transformers)
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "nehmeailabs-org/nehme-flashcheck-1b"
SYSTEM_MESSAGE = (
    "You are a fact checking model developed by NehmeAILabs. Determine whether the provided claim is consistent with "
    "the corresponding document. Consistency in this context implies that all information presented in the claim is "
    "substantiated by the document. If not, it should be considered inconsistent. Please assess the claim's consistency "
    "with the document by responding with either \"Yes\" or \"No\"."
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    torch_dtype="auto",
)
model.eval()

document = "The user must not share API keys."
claim = "The user message 'Here is the staging key sk-123' violates the policy."
user_prompt = f"Document: {document}\n\nClaim: {claim}"
messages = [
    {"role": "system", "content": SYSTEM_MESSAGE},
    {"role": "user", "content": user_prompt},
]

try:
    # Preferred path: format the conversation with the tokenizer's chat template.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
except Exception:
    # Fallback when no chat template is available: plain concatenated prompt.
    plain = f"{SYSTEM_MESSAGE}\n\n{user_prompt}"
    inputs = tokenizer(plain, return_tensors="pt")
inputs = inputs.to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=8,
        do_sample=False,  # greedy decoding; temperature/top_p do not apply
    )

# Keep only the newly generated tokens (everything after the prompt).
gen_ids = out[0, inputs["input_ids"].shape[-1]:]
verdict = tokenizer.decode(gen_ids, skip_special_tokens=True).strip()
print(verdict)  # Expected: "Yes" or "No"
```
### 2) Local (GGUF / llama.cpp)
If the GGUF file is at the repo root (recent llama.cpp builds name this binary `llama-cli` instead of `main`):
```bash
./main -m nehme-flashcheck-1b.Q8_0.gguf -p "Document: ...\n\nClaim: ..."
```
If you placed it in a `gguf/` folder:
```bash
./main -m gguf/nehme-flashcheck-1b.Q8_0.gguf -p "Document: ...\n\nClaim: ..."
```
## Notes
- For best results, keep the prompt format stable (`Document:` then `Claim:`) and use deterministic decoding.
- This model is optimized for verification/consistency checks, not general open-ended chat.
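
One way to keep the prompt format stable is to build it in a single place. The sketch below mirrors the `Document:` / `Claim:` layout from the Usage section; the helper names `build_user_prompt` and `build_messages` are hypothetical, and trimming surrounding whitespace is an assumption rather than a documented requirement.

```python
from typing import Dict, List


def build_user_prompt(document: str, claim: str) -> str:
    """Return the user turn in the fixed "Document: ... / Claim: ..." layout."""
    # Assumption: stray leading/trailing whitespace is trimmed for consistency.
    return f"Document: {document.strip()}\n\nClaim: {claim.strip()}"


def build_messages(system_message: str, document: str, claim: str) -> List[Dict[str, str]]:
    """Return chat messages suitable for tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": build_user_prompt(document, claim)},
    ]


if __name__ == "__main__":
    msgs = build_messages(
        "You are a fact checking model.",  # placeholder; use the full system message
        "The user must not share API keys.",
        "Sharing API keys is allowed.",
    )
    print(msgs[1]["content"])
```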
## Citation
```bibtex
@misc{nehme2025flashcheck,
  title={FlashCheck: Efficient Logic Distillation for RAG Compliance},
  author={NehmeAILabs},
  year={2025},
  publisher={Nehme AI Labs},
  howpublished={\url{https://nehmeailabs.com}}
}
```