Initialize project; model provided by the ModelHub XC community
Model: daksh-neo/grpo-tax-qwen-3b-gguf Source: Original Platform
37
.gitattributes
vendored
Normal file
@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
grpo-tax-qwen-3b-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
grpo-tax-qwen-3b-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
122
README.md
Normal file
@@ -0,0 +1,122 @@
---
license: apache-2.0
language:
- en
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- gguf
- qwen2
- grpo
- tax
- finance
- fine-tuned
pipeline_tag: text-generation
---

# grpo-tax-qwen-3b-GGUF

> Built with [NEO — Your Autonomous AI Agent](https://heyneo.com)

GGUF quantized versions of **Qwen2.5-3B-Instruct** fine-tuned with **GRPO (Group Relative Policy Optimization)** on tax and financial reasoning tasks.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | Qwen/Qwen2.5-3B-Instruct |
| Fine-tuning Method | GRPO (Group Relative Policy Optimization) |
| Domain | Tax & Financial Reasoning |
| Architecture | Qwen2 |
| Context Length | 32,768 tokens |
| Format | GGUF |

## Available Quantizations

| File | Quantization | Size | Use Case |
|------|-------------|------|----------|
| `grpo-tax-qwen-3b-Q4_K_M.gguf` | Q4_K_M | ~1.9 GB | Best balance of speed and quality |
| `grpo-tax-qwen-3b-Q8_0.gguf` | Q8_0 | ~3.3 GB | Higher quality, more RAM required |
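
As a cross-check on the Size column, the Git LFS pointers in this commit record the exact byte count of each file. The short sketch below converts those counts to decimal gigabytes and binary gibibytes; the helper names `to_gb` and `to_gib` are illustrative only.

```python
# Exact byte sizes, copied from the Git LFS pointer files in this commit.
SIZES_BYTES = {
    "grpo-tax-qwen-3b-Q4_K_M.gguf": 1_929_902_560,
    "grpo-tax-qwen-3b-Q8_0.gguf": 3_285_475_808,
}


def to_gb(num_bytes: int) -> float:
    """Decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 10**9


def to_gib(num_bytes: int) -> float:
    """Binary gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


for name, size in SIZES_BYTES.items():
    print(f"{name}: {to_gb(size):.2f} GB ({to_gib(size):.2f} GiB)")
```

The file size is only a lower bound on memory use at run time: the KV cache grows with the configured context length, so some headroom beyond these figures is needed.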

## Usage

### With llama.cpp

```bash
# Download the model into the current directory
huggingface-cli download daksh-neo/grpo-tax-qwen-3b-gguf grpo-tax-qwen-3b-Q4_K_M.gguf --local-dir .

# Run inference (-e makes llama-cli interpret the \n escapes in the prompt)
./llama-cli -m grpo-tax-qwen-3b-Q4_K_M.gguf \
  -e \
  -p "<|im_start|>system\nYou are a tax expert assistant.<|im_end|>\n<|im_start|>user\nWhat is the standard deduction for 2024?<|im_end|>\n<|im_start|>assistant\n" \
  -n 512 --temp 0.7
```
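
The `-p` string above follows the ChatML layout: each turn is wrapped in `<|im_start|>role` and `<|im_end|>`, and the prompt ends with an open assistant turn for the model to complete. Below is a minimal Python sketch of how such a prompt can be assembled; `build_chatml_prompt` is a hypothetical helper, not part of llama.cpp.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "You are a tax expert assistant.",
    "What is the standard deduction for 2024?",
)
print(prompt)
```

Because the string contains real newlines, it can be passed to a runner directly, without relying on escape processing.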

### With Ollama

```bash
# Create a Modelfile
cat > Modelfile << 'EOF'
FROM ./grpo-tax-qwen-3b-Q4_K_M.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM "You are a helpful tax and financial assistant."
EOF

ollama create grpo-tax-qwen-3b -f Modelfile
ollama run grpo-tax-qwen-3b
```
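
Once the model has been created, it can also be queried from code. The sketch below assumes Ollama is serving on its default local address (`http://localhost:11434`) and calls its `/api/chat` endpoint with streaming disabled; `build_chat_request` and `ask` are illustrative names, and error handling is omitted.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local address


def build_chat_request(model: str, question: str) -> urllib.request.Request:
    """Build a non-streaming chat request for a locally running Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, question: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(model, question)) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With the server running, `ask("grpo-tax-qwen-3b", "Explain what a W-2 form is.")` should return the reply as a string.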

### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

# from_pretrained fetches the file from the Hugging Face Hub
# (requires the huggingface-hub package to be installed).
llm = Llama.from_pretrained(
    repo_id="daksh-neo/grpo-tax-qwen-3b-gguf",
    filename="grpo-tax-qwen-3b-Q4_K_M.gguf",
    n_ctx=4096,  # context window for this session
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful tax assistant."},
        {"role": "user", "content": "Explain what a W-2 form is."},
    ]
)
print(response["choices"][0]["message"]["content"])
```

## Training Details

This model was fine-tuned using GRPO (Group Relative Policy Optimization), a reinforcement learning method derived from PPO that does not need a separate value (critic) model. For each prompt, GRPO samples a group of responses, scores each one with a reward signal, and reinforces the responses that score above the group's average. Here it was applied to tax and financial reasoning tasks.
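
To make the group comparison concrete, here is a minimal sketch of the group-relative advantage at the heart of GRPO: each sampled response's reward is normalized by the mean and standard deviation of its own group. The reward values are made up for illustration, and the choice of population standard deviation and the `eps` term are assumptions; the training code for this model is not part of this repository.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see the note above
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled for the same prompt
# (for example 1.0 for a correct tax computation, 0.0 otherwise).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses with a positive advantage are made more likely and those with a negative advantage less likely. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.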

**Training focus areas:**
- Federal and state tax regulations
- Tax form interpretation (W-2, 1099, Schedule C, etc.)
- Deductions and credits
- Tax planning strategies
- Financial compliance questions

## Limitations

- The model's tax knowledge stops at its training cutoff, so it may not reflect the latest tax law changes.
- Always consult a qualified tax professional for official tax advice.
- The model is not a substitute for professional legal or financial guidance.

## Related Models

- [daksh-neo/grpo-tax-qwen-1.5b-gguf](https://huggingface.co/daksh-neo/grpo-tax-qwen-1.5b-gguf): a smaller 1.5B version for resource-constrained environments

## License

Apache 2.0, as declared in this repository's metadata. See the [Qwen2.5 license](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE) for the base model's terms.

---

<div align="center">
Built with <a href="https://heyneo.com">NEO</a> — Your Autonomous AI Agent
</div>
3
grpo-tax-qwen-3b-Q4_K_M.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c26043201b285c67a06774bdc28bd222687c96dc3f5d3f9ad4e142fdc82a059
size 1929902560
3
grpo-tax-qwen-3b-Q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87a40ceea2c1c04e66ea8440d79cc5a5f0230647cfe9b0c25c2b83f9b8975c62
size 3285475808
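
The two `.gguf` entries in this commit are Git LFS pointers, not the model weights themselves: each records the SHA-256 digest and byte size of the real file. Below is a minimal sketch of how a downloaded file could be checked against its pointer. `parse_lfs_pointer` and `verify_file` are hypothetical helpers, and the sketch handles only `sha256` object IDs.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer content for the Q8_0 file, copied from this commit.
POINTER_Q8 = """\
version https://git-lfs.github.com/spec/v1
oid sha256:87a40ceea2c1c04e66ea8440d79cc5a5f0230647cfe9b0c25c2b83f9b8975c62
size 3285475808
"""

pointer = parse_lfs_pointer(POINTER_Q8)
print(pointer["algo"], pointer["size"])
```

After downloading, `verify_file("grpo-tax-qwen-3b-Q8_0.gguf", pointer)` would report whether the local copy matches the committed pointer.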