Finetuned from model : unsloth/qwen3-0.6b-unsloth-bnb-4bit
Recommended Settings
> Temperature = 0.1
> top_k = 10
> top_p = 0.95
> min_p = 0.05
> repeat_penalty = 1.0
> Prompt format (for chat) = {input transcript}
> Prompt format (for use in Handy) = ${output}
Note
No System Prompt required.
You need to disable thinking for the model by adding {%- set enable_thinking = false %} in the Jinja Prompt Template.
LMStudio: Go to model gallery, click the model entry, then in inference settings scroll to the bottom to Prompt Template and paste at top.
Available Model files:
Qwen3.5-0.8B.F16.gguf
Qwen3.5-0.8B.Q8_0.gguf
Qwen3.5-0.8B.Q5_K_M.ggu
Qwen3.5-0.8B.Q4_K_M.gguf
Lora merged safetensor
This qwen3 model was trained 2x faster with Unsloth and Huggingface's TRL library.