Project initialization; model provided by the ModelHub XC community.
Model: YiPz/llama3-8b-pokerbench-sft Source: Original Platform
---
license: llama3
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
- poker
- game-theory
- fine-tuned
- sft
datasets:
- RZ412/PokerBench
language:
- en
pipeline_tag: text-generation
---

# Llama 3 8B - PokerBench SFT

A LoRA fine-tune of Llama 3.1 8B Instruct for poker decision-making, trained on the PokerBench dataset.

## Training Details

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Training Data**: PokerBench (RZ412/PokerBench)
- **Method**: LoRA fine-tuning (adapters merged into the base weights)
- **Training Steps**: 5,000
- **Batch Size**: 128
- **Learning Rate**: 1e-6

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YiPz/llama3-8b-pokerbench-sft")
tokenizer = AutoTokenizer.from_pretrained("YiPz/llama3-8b-pokerbench-sft")

messages = [
    {"role": "system", "content": "You are an expert poker player. Respond with your action in <action></action> tags."},
    {"role": "user", "content": "Your poker scenario..."},
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
# Sampling must be enabled for `temperature` to take effect; without
# `do_sample=True`, generation is greedy and temperature is ignored.
outputs = model.generate(inputs, max_new_tokens=32, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Output Format

Actions are returned in `<action></action>` tags:

- `<action>fold</action>`
- `<action>call</action>`
- `<action>check</action>`
- `<action>raise 15</action>`
- `<action>bet 10</action>`
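Downstream code usually needs the action as structured data rather than raw text. A small helper along these lines (`parse_action` is illustrative, not part of this repo) can extract the verb and optional amount from the tags:

```python
import re

# Matches the five actions listed above; raise/bet may carry an integer amount.
_ACTION_RE = re.compile(r"<action>\s*(fold|call|check|raise|bet)(?:\s+(\d+))?\s*</action>")

def parse_action(text):
    """Return (verb, amount) from a response containing <action></action> tags.

    amount is None for fold/call/check; returns None if no tag is found.
    """
    m = _ACTION_RE.search(text)
    if m is None:
        return None
    verb, amount = m.group(1), m.group(2)
    return (verb, int(amount) if amount is not None else None)

print(parse_action("<action>raise 15</action>"))  # ('raise', 15)
print(parse_action("<action>fold</action>"))      # ('fold', None)
```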

## GGUF Versions

Quantized GGUF versions for llama.cpp/Ollama: [YiPz/llama3-8b-pokerbench-sft-gguf](https://huggingface.co/YiPz/llama3-8b-pokerbench-sft-gguf)
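For Ollama, a minimal Modelfile along these lines can wrap a downloaded quant; the GGUF filename below is an assumption, so substitute the actual file from the GGUF repo:

```
# Modelfile (illustrative; adjust the FROM path to the quant you downloaded)
FROM ./llama3-8b-pokerbench-sft.Q4_K_M.gguf
PARAMETER temperature 0.1
SYSTEM "You are an expert poker player. Respond with your action in <action></action> tags."
```

Then build and run it with `ollama create pokerbench -f Modelfile` followed by `ollama run pokerbench`.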

## License

Use of this model is subject to the Llama 3 license.