Files

ModelHub XC 2feb2eab28 初始化项目，由ModelHub XC社区提供模型

Model: g023/qwen3-tiny-v2-finetuned
Source: Original Platform

2026-05-06 13:43:43 +08:00

1.6 KiB

Raw Permalink Blame History

license, language, base_model, tags, pipeline_tag, library_name, quantized_by

license

language

base_model

Qwen3-g023-tiny-v2-FT-Q8_0 - GRPO Finetuned Q8_0 GGUF Export

https://huggingface.co/g023/qwen3-tiny-v2-finetuned/

Q8_0 GGUF export of a GRPO finetuned Qwen3 model to achieve improved reasoning and reduced repetition. Original SRC Model: https://huggingface.co/g023/qwen3-tiny-v2

THIS IS A WIP (WORK IN PROGRESS)

Files

Qwen3-g023-tiny-v2-FT-Q8_0.gguf: Q8_0 GGUF model (~1.81 GB)
Modelfile: Ollama template + tested default sampling settings
params_best.json: Best sampled parameters from automated sweep
sweep_results.json: Full sweep results and per-test outcomes

Tested Best Parameters (Default in Modelfile)

temperature: 0.65
top_p: 0.9
top_k: 20
min_p: 0.0
repeat_penalty: 1.05
presence_penalty: 0.1
frequency_penalty: 0.1
num_ctx: 40000

Usage (Ollama)

ollama create qwen3-g023-tiny-v2-FT-Q8_0 -f Modelfile
ollama run qwen3-g023-tiny-v2-FT-Q8_0

# thinking on
ollama run qwen3-g023-tiny-v2-FT-Q8_0 --think "Explain why the sky is blue"

# thinking off
ollama run qwen3-g023-tiny-v2-FT-Q8_0 --think=false "Explain why the sky is blue"

or pull from huggingface directly to ollama:

ollama run hf.co/g023/qwen3-tiny-v2-finetuned:Q8_0

Notes

Template is the Qwen3-compatible template with think/no_think handling.
If you want stricter non-thinking behavior, compare alternatives in sweep_results.json.

1.6 KiB Raw Permalink Blame History