初始化项目,由ModelHub XC社区提供模型
Model: nmalinowski/pauper-llama3-8b Source: Original Platform
This commit is contained in:
156
README.md
Normal file
156
README.md
Normal file
@@ -0,0 +1,156 @@
|
||||
---
|
||||
language:
|
||||
- en
|
||||
license: llama3
|
||||
base_model: meta-llama/Meta-Llama-3-8B-Instruct
|
||||
tags:
|
||||
- llama3
|
||||
- pauper
|
||||
- mtg
|
||||
- magic-the-gathering
|
||||
- fine-tuned
|
||||
- lora
|
||||
- gguf
|
||||
library_name: transformers
|
||||
---
|
||||
|
||||
# Pauper Llama 3 8B
|
||||
|
||||
Fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
|
||||
specialized for Magic: The Gathering's Pauper format using LoRA fine-tuning.
|
||||
|
||||
## 📦 Available Formats
|
||||
|
||||
This repository contains both the full HuggingFace model and GGUF quantizations for various use cases.
|
||||
|
||||
### HuggingFace Transformers (Full Precision)
|
||||
Perfect for:
|
||||
- Further fine-tuning
|
||||
- Maximum quality inference
|
||||
- Integration with transformers library
|
||||
|
||||
### GGUF Quantized Models (llama.cpp compatible)
|
||||
Perfect for:
|
||||
- LM Studio, Ollama, llama.cpp
|
||||
- Local inference on consumer hardware
|
||||
- Faster inference with minimal quality loss
|
||||
|
||||
| File | Size | Description | Best For |
|
||||
|------|------|-------------|----------|
|
||||
| `gguf/pauper_llama3_q4km.gguf` | ~5GB | 4-bit quantized | **Recommended** - Best balance |
|
||||
| `gguf/pauper_llama3_q5km.gguf` | ~6GB | 5-bit quantized | Better quality |
|
||||
| `gguf/pauper_llama3_q8.gguf` | ~8GB | 8-bit quantized | Near-original quality |
|
||||
| `gguf/pauper_llama3_fp16.gguf` | ~15GB | Full precision | Maximum quality |
|
||||
|
||||
## 🚀 Usage
|
||||
|
||||
### Option 1: HuggingFace Transformers
|
||||
|
||||
```python
|
||||
from transformers import AutoModelForCausalLM, AutoTokenizer
|
||||
import torch
|
||||
|
||||
model = AutoModelForCausalLM.from_pretrained(
|
||||
"nmalinowski/pauper-llama3-8b",
|
||||
torch_dtype=torch.float16,
|
||||
device_map="auto"
|
||||
)
|
||||
tokenizer = AutoTokenizer.from_pretrained("nmalinowski/pauper-llama3-8b")
|
||||
|
||||
prompt = "What are the best cards in Pauper?"
|
||||
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
|
||||
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7)
|
||||
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
||||
```
|
||||
|
||||
### Option 2: LM Studio (GGUF - Easiest!)
|
||||
|
||||
1. Download `gguf/pauper_llama3_q4km.gguf` from Files tab
|
||||
2. Open LM Studio → Load Model
|
||||
3. Select the downloaded GGUF file
|
||||
4. Start chatting about Pauper!
|
||||
|
||||
### Option 3: llama.cpp
|
||||
|
||||
```bash
|
||||
# Download the quantized model
|
||||
huggingface-cli download nmalinowski/pauper-llama3-8b gguf/pauper_llama3_q4km.gguf --local-dir ./
|
||||
|
||||
# Run inference
|
||||
./llama-cli -m pauper_llama3_q4km.gguf \
|
||||
-p "What are the top Pauper decks in the current meta?" \
|
||||
-n 256 \
|
||||
--temp 0.7
|
||||
```
|
||||
|
||||
### Option 4: Ollama
|
||||
|
||||
```bash
|
||||
# Create Modelfile
|
||||
cat > Modelfile <<EOF
|
||||
FROM ./gguf/pauper_llama3_q4km.gguf
|
||||
PARAMETER temperature 0.7
|
||||
PARAMETER top_p 0.9
|
||||
SYSTEM "You are an expert on Magic: The Gathering's Pauper format."
|
||||
EOF
|
||||
|
||||
# Create and run
|
||||
ollama create pauper-llama3 -f Modelfile
|
||||
ollama run pauper-llama3 "Explain the current Pauper meta"
|
||||
```
|
||||
|
||||
## 🎯 Training Details
|
||||
|
||||
- **Base Model:** Llama 3 8B Instruct
|
||||
- **Training Method:** LoRA (Low-Rank Adaptation)
|
||||
- **Domain:** Magic: The Gathering - Pauper format
|
||||
- **LoRA Configuration:**
|
||||
- Rank: 16
|
||||
- Alpha: 32
|
||||
- Target modules: q_proj, v_proj
|
||||
- Dropout: 0.05
|
||||
|
||||
## 💡 Recommendations
|
||||
|
||||
- **For most users:** Download `gguf/pauper_llama3_q4km.gguf` and use with LM Studio
|
||||
- **For best quality:** Use the full HuggingFace model with transformers
|
||||
- **For low VRAM:** Use Q4_K_M quantization (~5GB)
|
||||
- **For high VRAM:** Use Q8_0 or FP16 for better quality
|
||||
|
||||
## 📊 Performance
|
||||
|
||||
The Q4_K_M quantization offers:
|
||||
- ✅ ~95% of full precision quality
|
||||
- ✅ 70% smaller file size
|
||||
- ✅ Faster inference on CPU and GPU
|
||||
- ✅ Runs on consumer hardware (16GB RAM recommended)
|
||||
|
||||
## 🎮 Example Prompts
|
||||
|
||||
```
|
||||
"What are the best removal spells in Pauper?"
|
||||
"Build me a Pauper deck around Monastery Swiftspear"
|
||||
"Explain the differences between Affinity and Elves in Pauper"
|
||||
"What are the current tier 1 Pauper decks?"
|
||||
```
|
||||
|
||||
## ⚠️ Limitations
|
||||
|
||||
- Specialized for Pauper format - may not perform well on other MTG formats
|
||||
- May occasionally hallucinate card names or abilities
|
||||
- Knowledge cutoff: January 2025
|
||||
- Not suitable for medical, legal, or financial advice
|
||||
|
||||
## 📄 License
|
||||
|
||||
This model inherits the Llama 3 Community License from Meta. See [LICENSE](https://llama.meta.com/llama3/license/) for details.
|
||||
|
||||
## 🙏 Acknowledgments
|
||||
|
||||
- Base model: Meta's Llama 3 8B Instruct
|
||||
- Training framework: HuggingFace Transformers + PEFT
|
||||
- Quantization: llama.cpp
|
||||
|
||||
## 📞 Issues & Feedback
|
||||
|
||||
If you encounter issues or have suggestions, please open an issue on the [Community tab](https://huggingface.co/nmalinowski/pauper-llama3-8b/discussions).
|
||||
Reference in New Issue
Block a user