ModelHub XC d715d84264 Initialize project; model provided by the ModelHub XC community

Model: avinashkongara4/llama3-ragnarok-merged
Source: Original Platform

2026-05-22 10:36:15 +08:00

Files (each added in the initial commit, 2026-05-22 10:36:15 +08:00): .gitattributes, chat_template.jinja, config.json, generation_config.json, model.safetensors, README.md, tokenizer_config.json, tokenizer.json

Llama3 RAGnarok — NQ Fine-tuned

A fine-tuned version of Meta Llama 3.1 8B Instruct for Retrieval-Augmented Generation (RAG), trained on the Natural Questions (NQ) dataset.

This is the merged model: the LoRA adapter has been folded into the base weights, so it loads with plain Transformers and does not need PEFT at inference time.

Model Details

Base model: meta-llama/Llama-3.1-8B-Instruct
Fine-tuning method: LoRA (PEFT)
Training dataset: Google Natural Questions (NQ)
Task: Extractive QA / RAG
Developer: Avinash Kongara

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "avinashkongara4/llama3-ragnarok-merged"

# Load the tokenizer and the merged weights in half precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

def ask(question, context):
    """Answer `question` using only the retrieved `context` passage."""
    prompt = f"""<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Context: {context}
Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

    # The prompt already starts with <|begin_of_text|>, so skip the tokenizer's own BOS token.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
    with torch.no_grad():
        # temperature only takes effect when sampling is on, so enable it explicitly.
        out = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()

# Example
context = "The Eiffel Tower is located in Paris, France. It was built in 1889."
question = "Where is the Eiffel Tower located?"
print(ask(question, context))

Training Details

LoRA rank: 16
LoRA alpha: 32
Target modules: q_proj, v_proj, k_proj, o_proj
Training data: Natural Questions (NQ) — answerable subset
Framework: HuggingFace Transformers + PEFT + TRL
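
To give a sense of how small the adapter is, the LoRA settings listed here can be turned into a trainable-parameter count. This is a back-of-the-envelope sketch, not a figure reported from the training run. The layer shapes (hidden size 4096, 32 decoder layers, 8 key/value heads of dimension 128) are assumed from the publicly documented Llama 3.1 8B architecture, and `lora_param_count` is a hypothetical helper name.

```python
# Assumed Llama 3.1 8B shapes (from the public model config, not from this README).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128 (grouped-query attention)

# Values stated in the Training Details list.
LORA_RANK = 16
LORA_ALPHA = 32

# (in_features, out_features) of each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, KV_DIM),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
    "o_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
}

def lora_param_count(shapes, rank, num_layers):
    """Count LoRA weights: each module adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers

trainable = lora_param_count(TARGET_SHAPES, LORA_RANK, NUM_LAYERS)
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapter holds roughly 13.6 million trainable weights, a small fraction of the 8B-parameter base model, and the low-rank update is scaled by 2.0.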

Intended Use

This model is designed for RAG pipelines where a context passage is retrieved and the model answers questions grounded in that context.
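
The retrieve-then-answer flow can be sketched without loading the model. The example below is only an illustration: `tokenize`, `retrieve`, and `build_prompt` are hypothetical helpers, the word-overlap scoring is a toy stand-in for a real retriever, and the second passage is made-up sample data. The prompt layout mirrors the `ask` function in How to Use.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages):
    """Toy retriever: return the passage sharing the most words with the question."""
    q_words = tokenize(question)
    return max(passages, key=lambda p: len(q_words & tokenize(p)))

def build_prompt(question, context):
    """Format the context and question in the same layout as the `ask` example."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n"
        f"Context: {context}\n"
        f"Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"
    )

# Illustrative passages; a real pipeline would pull these from a document index.
passages = [
    "The Eiffel Tower is located in Paris, France. It was built in 1889.",
    "The Colosseum is an ancient amphitheatre in the centre of Rome, Italy.",
]
question = "Where is the Eiffel Tower located?"

context = retrieve(question, passages)
prompt = build_prompt(question, context)
print(prompt)
```

In a real deployment the `retrieve` step would be replaced by a vector-store lookup, and the selected passage would be passed to `ask(question, context)`.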

Limitations

Answers are grounded in the provided context; do not expect general-knowledge answers when no context is supplied
Best used as part of a full RAG pipeline with a retriever (e.g., FAISS, Pinecone)
Trained on English data only

Adapter-only version

The original LoRA adapter (before merging) is available at:
👉 avinashkongara4/llama3-ragnarok-nq-adapter
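
For readers who prefer the adapter route, a minimal loading sketch follows. It assumes the adapter repository is in standard PEFT format, that you have access to the gated base model, and that the tokenizer should come from the base model; none of this is confirmed by the adapter's own documentation. `load_with_adapter` is a hypothetical helper name.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER_ID = "avinashkongara4/llama3-ragnarok-nq-adapter"

def load_with_adapter(merge=False):
    """Load the base model and attach the LoRA adapter (downloads both repos)."""
    # Imported here so the heavy dependencies are only needed when this is called.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    # Wrap the base model with the LoRA weights from the adapter repo.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    if merge:
        # Fold the LoRA weights into the base weights and drop the PEFT wrapper.
        model = model.merge_and_unload()
    return tokenizer, model
```

Calling `load_with_adapter(merge=True)` should yield a model comparable to the merged checkpoint, assuming the adapter was merged without further changes.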