pauper-llama3-8b/README.md

---
language:
- en
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
- llama3
- pauper
- mtg
- magic-the-gathering
- fine-tuned
- lora
- gguf
library_name: transformers
---

# Pauper Llama 3 8B

Fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
specialized for Magic: The Gathering's Pauper format using LoRA fine-tuning.

## 📦 Available Formats

This repository contains both the full HuggingFace model and GGUF quantizations for various use cases.

### HuggingFace Transformers (Full Precision)
Perfect for:
- Further fine-tuning
- Maximum quality inference
- Integration with transformers library

### GGUF Quantized Models (llama.cpp compatible)
Perfect for:
- LM Studio, Ollama, llama.cpp
- Local inference on consumer hardware
- Faster inference with minimal quality loss

| File | Size | Description | Best For |
|------|------|-------------|----------|
| `gguf/pauper_llama3_q4km.gguf` | ~5GB | 4-bit quantized | **Recommended** - Best balance |
| `gguf/pauper_llama3_q5km.gguf` | ~6GB | 5-bit quantized | Better quality |
| `gguf/pauper_llama3_q8.gguf` | ~8GB | 8-bit quantized | Near-original quality |
| `gguf/pauper_llama3_fp16.gguf` | ~15GB | Full precision | Maximum quality |

## 🚀 Usage

### Option 1: HuggingFace Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "nmalinowski/pauper-llama3-8b",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("nmalinowski/pauper-llama3-8b")

prompt = "What are the best cards in Pauper?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Option 2: LM Studio (GGUF - Easiest!)

1. Download `gguf/pauper_llama3_q4km.gguf` from Files tab
2. Open LM Studio → Load Model
3. Select the downloaded GGUF file
4. Start chatting about Pauper!

### Option 3: llama.cpp

```bash
# Download the quantized model
huggingface-cli download nmalinowski/pauper-llama3-8b gguf/pauper_llama3_q4km.gguf --local-dir ./

# Run inference
./llama-cli -m pauper_llama3_q4km.gguf \
    -p "What are the top Pauper decks in the current meta?" \
    -n 256 \
    --temp 0.7
```

### Option 4: Ollama

```bash
# Create Modelfile
cat > Modelfile <<EOF
FROM ./gguf/pauper_llama3_q4km.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
SYSTEM "You are an expert on Magic: The Gathering's Pauper format."
EOF

# Create and run
ollama create pauper-llama3 -f Modelfile
ollama run pauper-llama3 "Explain the current Pauper meta"
```

## 🎯 Training Details

- **Base Model:** Llama 3 8B Instruct
- **Training Method:** LoRA (Low-Rank Adaptation)
- **Domain:** Magic: The Gathering - Pauper format
- **LoRA Configuration:**
  - Rank: 16
  - Alpha: 32
  - Target modules: q_proj, v_proj
  - Dropout: 0.05

## 💡 Recommendations

- **For most users:** Download `gguf/pauper_llama3_q4km.gguf` and use with LM Studio
- **For best quality:** Use the full HuggingFace model with transformers
- **For low VRAM:** Use Q4_K_M quantization (~5GB)
- **For high VRAM:** Use Q8_0 or FP16 for better quality

## 📊 Performance

The Q4_K_M quantization offers:
- ✅ ~95% of full precision quality
- ✅ 70% smaller file size
- ✅ Faster inference on CPU and GPU
- ✅ Runs on consumer hardware (16GB RAM recommended)

## 🎮 Example Prompts

```
"What are the best removal spells in Pauper?"
"Build me a Pauper deck around Monastery Swiftspear"
"Explain the differences between Affinity and Elves in Pauper"
"What are the current tier 1 Pauper decks?"
```

## ⚠️ Limitations

- Specialized for Pauper format - may not perform well on other MTG formats
- May occasionally hallucinate card names or abilities
- Knowledge cutoff: January 2025
- Not suitable for medical, legal, or financial advice

## 📄 License

This model inherits the Llama 3 Community License from Meta. See [LICENSE](https://llama.meta.com/llama3/license/) for details.

## 🙏 Acknowledgments

- Base model: Meta's Llama 3 8B Instruct
- Training framework: HuggingFace Transformers + PEFT
- Quantization: llama.cpp

## 📞 Issues & Feedback

If you encounter issues or have suggestions, please open an issue on the [Community tab](https://huggingface.co/nmalinowski/pauper-llama3-8b/discussions).