commit c99338ad141b6e7d30a815fac26e422435fc4a37
Author: ModelHub XC
Date:   Wed Apr 22 13:19:58 2026 +0800

    Initialize project; model provided by the ModelHub XC community

    Model: daksh-neo/grpo-tax-qwen-3b-gguf
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..f480801
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+grpo-tax-qwen-3b-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+grpo-tax-qwen-3b-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..e5bc294
--- /dev/null
+++ b/README.md
@@ -0,0 +1,122 @@
+---
+license: apache-2.0
+language:
+- en
+base_model: Qwen/Qwen2.5-3B-Instruct
+tags:
+- gguf
+- qwen2
+- grpo
+- tax
+- finance
+- fine-tuned
+pipeline_tag: text-generation
+---
+
+# grpo-tax-qwen-3b-GGUF
+
+> Built with [NEO — Your Autonomous AI Agent](https://heyneo.com)
+
+GGUF quantized versions of **Qwen2.5-3B-Instruct** fine-tuned with **GRPO (Group Relative Policy Optimization)** on tax and financial reasoning tasks.
+
+## Model Details
+
+| Property | Value |
+|----------|-------|
+| Base Model | Qwen/Qwen2.5-3B-Instruct |
+| Fine-tuning Method | GRPO (Group Relative Policy Optimization) |
+| Domain | Tax & Financial Reasoning |
+| Architecture | Qwen2 |
+| Context Length | 32,768 tokens |
+| Format | GGUF |
+
+## Available Quantizations
+
+| File | Quantization | Size | Use Case |
+|------|-------------|------|----------|
+| `grpo-tax-qwen-3b-Q4_K_M.gguf` | Q4_K_M | ~1.9 GB | Best balance of speed and quality |
+| `grpo-tax-qwen-3b-Q8_0.gguf` | Q8_0 | ~3.3 GB | Higher quality, more RAM required |
+
+## Usage
+
+### With llama.cpp
+
+```bash
+# Download the model into the current directory
+huggingface-cli download daksh-neo/grpo-tax-qwen-3b-gguf grpo-tax-qwen-3b-Q4_K_M.gguf --local-dir .
+
+# Run inference
+./llama-cli -m grpo-tax-qwen-3b-Q4_K_M.gguf \
+  -p "<|im_start|>system\nYou are a tax expert assistant.<|im_end|>\n<|im_start|>user\nWhat is the standard deduction for 2024?<|im_end|>\n<|im_start|>assistant\n" \
+  -n 512 --temp 0.7
+```
+
+### With Ollama
+
+```bash
+# Create a Modelfile
+cat > Modelfile << 'EOF'
+FROM ./grpo-tax-qwen-3b-Q4_K_M.gguf
+TEMPLATE """<|im_start|>system
+{{ .System }}<|im_end|>
+<|im_start|>user
+{{ .Prompt }}<|im_end|>
+<|im_start|>assistant
+"""
+SYSTEM "You are a helpful tax and financial assistant."
+EOF
+
+ollama create grpo-tax-qwen-3b -f Modelfile
+ollama run grpo-tax-qwen-3b
+```
+
+### With Python (llama-cpp-python)
+
+```python
+from llama_cpp import Llama
+
+llm = Llama.from_pretrained(
+    repo_id="daksh-neo/grpo-tax-qwen-3b-gguf",
+    filename="grpo-tax-qwen-3b-Q4_K_M.gguf",
+    n_ctx=4096,
+)
+
+response = llm.create_chat_completion(
+    messages=[
+        {"role": "system", "content": "You are a helpful tax assistant."},
+        {"role": "user", "content": "Explain what a W-2 form is."}
+    ]
+)
+print(response["choices"][0]["message"]["content"])
+```
+
+## Training Details
+
+This model was fine-tuned using GRPO (Group Relative Policy Optimization), a reinforcement learning method derived from PPO that optimizes the model's responses on tax and financial reasoning tasks without requiring a separate value (critic) model. For each prompt, GRPO samples a group of responses, scores them with a reward signal, and uses each response's reward relative to the rest of the group as its advantage, reinforcing higher-scoring answers.
+
+**Training focus areas:**
+- Federal and state tax regulations
+- Tax form interpretation (W-2, 1099, Schedule C, etc.)
+- Deductions and credits
+- Tax planning strategies
+- Financial compliance questions
+
+## Limitations
+
+- This model was fine-tuned on tax knowledge available up to its training cutoff and may not reflect the latest tax law changes.
+- Always consult a qualified tax professional for official tax advice.
+- The model is not a substitute for professional legal or financial guidance.
+
+## Related Models
+
+- [daksh-neo/grpo-tax-qwen-1.5b-gguf](https://huggingface.co/daksh-neo/grpo-tax-qwen-1.5b-gguf): a smaller 1.5B version for resource-constrained environments
+
+## License
+
+Apache 2.0. See the [Qwen2.5 license](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE) for base model terms.
+
+---
+
+
+Built with NEO — Your Autonomous AI Agent
+
diff --git a/grpo-tax-qwen-3b-Q4_K_M.gguf b/grpo-tax-qwen-3b-Q4_K_M.gguf
new file mode 100644
index 0000000..584aaa4
--- /dev/null
+++ b/grpo-tax-qwen-3b-Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c26043201b285c67a06774bdc28bd222687c96dc3f5d3f9ad4e142fdc82a059
+size 1929902560
diff --git a/grpo-tax-qwen-3b-Q8_0.gguf b/grpo-tax-qwen-3b-Q8_0.gguf
new file mode 100644
index 0000000..1325c55
--- /dev/null
+++ b/grpo-tax-qwen-3b-Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:87a40ceea2c1c04e66ea8440d79cc5a5f0230647cfe9b0c25c2b83f9b8975c62
+size 3285475808
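
Note on the two `.gguf` entries above: the commit stores Git LFS pointer files, not the model weights. Each pointer records the SHA-256 digest and byte size of the real file. The following is a minimal sketch of how a downloaded file could be checked against such a pointer. The helper names `parse_lfs_pointer` and `verify_file` are hypothetical and not part of this repository, and the model file is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's byte size and hash both match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so multi-gigabyte model files are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer content copied from the Q4_K_M entry in this commit.
Q4_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3c26043201b285c67a06774bdc28bd222687c96dc3f5d3f9ad4e142fdc82a059
size 1929902560
"""

pointer = parse_lfs_pointer(Q4_POINTER)
print(f"{pointer['size'] / 1e9:.2f} GB")  # 1.93 GB (decimal gigabytes)

# Assumption: the model was downloaded next to this script.
model_path = Path("grpo-tax-qwen-3b-Q4_K_M.gguf")
if model_path.exists():
    print("verified" if verify_file(model_path, pointer) else "mismatch")
```

The size check runs first because it is cheap and catches truncated downloads without hashing the whole file.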