ModelHub XC c743d1a865 initial commit; model provided by the ModelHub XC community
Model: daksh-neo/grpo-tax-qwen-1.5b-gguf
Source: Original Platform
2026-04-19 15:48:29 +08:00

license: apache-2.0
language: en
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags: gguf, qwen2, grpo, tax, finance, fine-tuned
pipeline_tag: text-generation

grpo-tax-qwen-1.5b-GGUF

Built with NEO — Your Autonomous AI Agent

GGUF quantized versions of Qwen2.5-1.5B-Instruct fine-tuned with GRPO (Group Relative Policy Optimization) on tax and financial reasoning tasks.

Model Details

Property Value
Base Model Qwen/Qwen2.5-1.5B-Instruct
Fine-tuning Method GRPO (Group Relative Policy Optimization)
Domain Tax & Financial Reasoning
Architecture Qwen2
Context Length 32,768 tokens
Format GGUF

Available Quantizations

File Quantization Size Use Case
grpo-tax-qwen-1.5b-Q4_K_M.gguf Q4_K_M ~1.0 GB Best balance of speed and quality
grpo-tax-qwen-1.5b-Q8_0.gguf Q8_0 ~1.6 GB Higher quality, more RAM required
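The sizes above follow a simple rule of thumb: file size is roughly parameters × bits-per-weight / 8, plus a small overhead for metadata and non-quantized tensors. A quick sketch (the bits-per-weight figures are approximations I am assuming, not official values):

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8 bytes.
# The bits-per-weight values below are approximations, not official figures.
BPW = {"Q4_K_M": 4.85, "Q8_0": 8.5}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Approximate quantized file size in decimal GB."""
    return n_params * BPW[quant] / 8 / 1e9

n_params = 1.54e9  # Qwen2.5-1.5B has roughly 1.54B parameters
for quant in BPW:
    print(f"{quant}: ~{approx_size_gb(n_params, quant):.2f} GB")
```

This lands close to the ~1.0 GB and ~1.6 GB figures in the table; actual files are slightly larger because some tensors (e.g. embeddings) may be kept at higher precision.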

Usage

With llama.cpp

# Download the model
huggingface-cli download daksh-neo/grpo-tax-qwen-1.5b-gguf grpo-tax-qwen-1.5b-Q4_K_M.gguf

# Run inference
# -e makes llama-cli interpret the \n escapes in the prompt string
./llama-cli -m grpo-tax-qwen-1.5b-Q4_K_M.gguf \
  -e -p "<|im_start|>system\nYou are a tax expert assistant.<|im_end|>\n<|im_start|>user\nWhat is the standard deduction for 2024?<|im_end|>\n<|im_start|>assistant\n" \
  -n 512 --temp 0.7
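The prompt string above follows Qwen2's ChatML template: each message is wrapped in `<|im_start|>role ... <|im_end|>` markers, and the prompt ends with an open assistant turn for the model to complete. A small illustrative helper (not part of any library; llama-cpp-python and Ollama apply this template for you in their chat APIs) that assembles such a prompt from a message list:

```python
# Build a ChatML-format prompt as used by Qwen2 models (illustrative helper).
def build_chatml_prompt(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # left open for the model to complete
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a tax expert assistant."},
    {"role": "user", "content": "What is the standard deduction for 2024?"},
])
print(prompt)
```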

With Ollama

# Create a Modelfile
cat > Modelfile << 'EOF'
FROM ./grpo-tax-qwen-1.5b-Q4_K_M.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM "You are a helpful tax and financial assistant."
EOF

ollama create grpo-tax-qwen-1.5b -f Modelfile
ollama run grpo-tax-qwen-1.5b

With Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="daksh-neo/grpo-tax-qwen-1.5b-gguf",
    filename="grpo-tax-qwen-1.5b-Q4_K_M.gguf",
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful tax assistant."},
        {"role": "user", "content": "Explain what a W-2 form is."}
    ]
)
print(response["choices"][0]["message"]["content"])

Training Details

This model was fine-tuned using GRPO (Group Relative Policy Optimization), a reinforcement learning method that optimizes the model's responses on tax and financial reasoning tasks without requiring a separate value (critic) model. For each prompt, GRPO samples a group of responses, scores them, and uses each response's reward relative to the group average as its advantage, reinforcing above-average answers.
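The group-relative idea can be sketched in a few lines. The reward values below are made up for illustration, and a real GRPO step also includes a clipped policy-gradient loss and a KL penalty, omitted here for brevity:

```python
# Sketch of GRPO's group-relative advantage (simplified, illustrative).
# For each prompt, several responses are sampled and scored; each response's
# advantage is its reward normalized against the group's mean and std.
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-8):
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical reward scores for 4 sampled answers to one tax question:
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_advantages(rewards)
# Above-average answers get positive advantages and are reinforced;
# below-average answers get negative advantages and are discouraged.
```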

Training focus areas:

  • Federal and state tax regulations
  • Tax form interpretation (W-2, 1099, Schedule C, etc.)
  • Deductions and credits
  • Tax planning strategies
  • Financial compliance questions

Limitations

  • This model is fine-tuned on tax knowledge up to its training cutoff and may not reflect the latest tax law changes.
  • Always consult a qualified tax professional for official tax advice.
  • The model is not a substitute for professional legal or financial guidance.

License

Apache 2.0 — see Qwen2.5 license for base model terms.

