---
library_name: transformers
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- llama
- rag
- question-answering
- natural-questions
- peft
- lora
language:
- en
license: llama3
---

# Llama3 RAGnarok: NQ Fine-tuned

A fine-tuned version of **Meta Llama 3.1 8B Instruct** for **Retrieval-Augmented Generation (RAG)**, trained on the [Natural Questions (NQ)](https://ai.google.com/research/NaturalQuestions) dataset.

This is the **merged model**: the LoRA adapter weights are baked into the base model, so it loads with `transformers` alone and does not require `peft`.

## Model Details

- **Base model:** meta-llama/Llama-3.1-8B-Instruct
- **Fine-tuning method:** LoRA (PEFT)
- **Training dataset:** Google Natural Questions (NQ)
- **Task:** Extractive QA / RAG
- **Developer:** Avinash Kongara

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "avinashkongara4/llama3-ragnarok-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # place weights on the available device(s)
)

def ask(question, context):
    """Answer `question` using only the retrieved `context` passage."""
    # Llama 3 chat format: one user turn, then an open assistant turn
    prompt = f"""<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Context: {context}
Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=200,
            do_sample=True,    # temperature only takes effect when sampling is enabled
            temperature=0.1,   # low temperature keeps answers close to the context
        )
    # Decode only the newly generated tokens, not the prompt
    new_tokens = out[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example
context = "The Eiffel Tower is located in Paris, France. It was built in 1889."
question = "Where is the Eiffel Tower located?"
print(ask(question, context))
```

## Training Details

- **LoRA rank:** 16
- **LoRA alpha:** 32
- **Target modules:** q_proj, v_proj, k_proj, o_proj
- **Training data:** Natural Questions (NQ), answerable subset
- **Framework:** HuggingFace Transformers + PEFT + TRL

## Intended Use

This model is designed for RAG pipelines in which a retriever supplies a context passage and the model answers the question using that passage.

## Limitations

- Answers are grounded in the provided context; do not expect general-knowledge answers when no context is supplied
- Best used as part of a full RAG pipeline with a retriever (e.g., FAISS, Pinecone)
- Trained on English only

## Adapter-only version

The original LoRA adapter (before merging) is available at:
👉 [avinashkongara4/llama3-ragnarok-nq-adapter](https://huggingface.co/avinashkongara4/llama3-ragnarok-nq-adapter)
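
## Pipeline Sketch

The retrieve-then-answer flow described under Intended Use might look roughly like the sketch below. It is an illustration only: the keyword-overlap scoring and the helper names `tokenize`, `retrieve`, and `build_context` are hypothetical stand-ins for a real retriever such as FAISS or Pinecone, and the toy corpus is made up for the example. It runs without the model and only prepares the `context` string.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, top_k=2):
    """Rank passages by word overlap with the question.

    Toy stand-in for a vector search; passages with no overlap are dropped.
    """
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(p)), i, p) for i, p in enumerate(passages)]
    # Highest overlap first; ties keep the original corpus order
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [p for score, _, p in scored[:top_k] if score > 0]

def build_context(passages):
    """Join the retrieved passages into a single context string."""
    return "\n".join(passages)

# Hypothetical three-passage corpus
corpus = [
    "The Eiffel Tower is located in Paris, France. It was built in 1889.",
    "The Colosseum is an ancient amphitheatre in the centre of Rome, Italy.",
    "Mount Everest is the highest mountain above sea level.",
]

question = "Where is the Eiffel Tower located?"
context = build_context(retrieve(question, corpus, top_k=1))
print(context)
```

The resulting `context` would then be passed to the model together with the question, for example through the `ask(question, context)` helper from How to Use. In a real pipeline the overlap score would be replaced by embedding similarity from the chosen retriever.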