Initialize project; model provided by the ModelHub XC community
Model: avinashkongara4/llama3-ragnarok-merged Source: Original Platform
---
library_name: transformers
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- llama
- rag
- question-answering
- natural-questions
- peft
- lora
language:
- en
license: llama3
---

# Llama3 RAGnarok — NQ Fine-tuned

A fine-tuned version of **Meta Llama 3.1 8B Instruct** for **Retrieval-Augmented Generation (RAG)**,
trained on the [Natural Questions (NQ)](https://ai.google.com/research/NaturalQuestions) dataset.

This is the **merged model** (base weights with the LoRA adapter merged in), so it loads with `transformers` alone and does not need `peft` at inference time.

## Model Details

- **Base model:** meta-llama/Llama-3.1-8B-Instruct
- **Fine-tuning method:** LoRA (PEFT)
- **Training dataset:** Google Natural Questions (NQ)
- **Task:** Extractive QA / RAG
- **Developer:** Avinash Kongara

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "avinashkongara4/llama3-ragnarok-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"  # requires the `accelerate` package
)

def ask(question, context):
    # One user turn holding the context and the question, followed by an
    # open assistant header for the model to complete.
    prompt = f"""<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Context: {context}
Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # do_sample=True ensures the low temperature is actually applied
        out = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
    # Decode only the newly generated tokens, not the prompt
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Example
context = "The Eiffel Tower is located in Paris, France. It was built in 1889."
question = "Where is the Eiffel Tower located?"
print(ask(question, context))
```

## Training Details

- **LoRA rank:** 16
- **LoRA alpha:** 32
- **Target modules:** q_proj, v_proj, k_proj, o_proj
- **Training data:** Natural Questions (NQ), answerable subset
- **Framework:** HuggingFace Transformers + PEFT + TRL

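To put the LoRA hyperparameters listed under Training Details in perspective, the sketch below estimates how many trainable parameters a rank-16 adapter on those four projections adds, along with the standard LoRA scaling factor alpha / r. The layer dimensions are not stated in this README; they are assumed from the published Llama 3.1 8B configuration (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). The names `PROJ_SHAPES` and `lora_param_count` are illustrative only.

```python
# Assumed Llama 3.1 8B attention shapes as (in_features, out_features).
# These come from the base model's published config, not from this README.
HIDDEN = 4096
KV_DIM = 8 * 128  # 8 key/value heads x head_dim 128 (grouped-query attention)
NUM_LAYERS = 32

PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}

def lora_param_count(rank, shapes=PROJ_SHAPES, num_layers=NUM_LAYERS):
    """Each adapted weight W (out x in) gains A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers

rank, alpha = 16, 32
trainable = lora_param_count(rank)
scaling = alpha / rank  # the low-rank update is applied as (alpha / r) * B @ A

print(f"trainable LoRA parameters: {trainable:,}")
print(f"scaling factor alpha / r: {scaling}")
```

Under these assumptions the adapter adds roughly 13.6 million trainable parameters, a small fraction of the roughly 8 billion parameters in the base model.
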
## Intended Use

This model is designed for RAG pipelines in which a context passage is retrieved
and the model answers questions grounded in that context.

## Limitations

- Answers are grounded in the provided context; do not expect general-knowledge answers when no context is supplied
- Best used as part of a full RAG pipeline with a retriever (e.g., FAISS, Pinecone)
- Trained on English data only

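As a rough illustration of the retrieval step in such a pipeline, the sketch below ranks candidate passages by word overlap with the question and uses the best match as the context. It is a toy stand-in for a real retriever such as FAISS or Pinecone and is not part of this model; `tokenize`, `retrieve`, and the sample passages are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, k=1):
    """Return the k passages that share the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_tokens & tokenize(p)), reverse=True)
    return ranked[:k]

# Hypothetical passage store; a real pipeline would query a vector index instead.
passages = [
    "The Great Wall of China is thousands of kilometres long.",
    "The Eiffel Tower is located in Paris, France. It was built in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]

question = "Where is the Eiffel Tower located?"
context = " ".join(retrieve(question, passages, k=1))
print(context)
# The retrieved context would then be passed to the model together with the question.
```

Word overlap is used here only to keep the example dependency-free; an embedding-based retriever would normally give better passages for paraphrased questions.
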
## Adapter-only version

The original LoRA adapter (before merging) is available at:
👉 [avinashkongara4/llama3-ragnarok-nq-adapter](https://huggingface.co/avinashkongara4/llama3-ragnarok-nq-adapter)
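
For readers who prefer to start from the adapter, the following is a minimal sketch of attaching it to the base model with PEFT and merging it. It assumes the adapter repository is a standard PEFT LoRA checkpoint for the base model named in this README, that the tokenizer can be taken from the base model, and that you have access to the gated base weights. The function name `load_merged_from_adapter` is hypothetical.

```python
def load_merged_from_adapter(
    base_id="meta-llama/Llama-3.1-8B-Instruct",
    adapter_id="avinashkongara4/llama3-ragnarok-nq-adapter",
):
    """Load the base model, attach the LoRA adapter, and merge it into the weights."""
    # Heavy libraries are imported here so that defining the function stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Assumption: the tokenizer is unchanged from the base model.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)  # wrap the base with the adapter
    model = model.merge_and_unload()  # fold the LoRA update into the base weights
    return tokenizer, model
```

Calling `tokenizer, model = load_merged_from_adapter()` downloads both repositories; the result should behave much like the merged checkpoint, though this has not been verified here.
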