40 lines
1.6 KiB
Markdown
40 lines
1.6 KiB
Markdown
---
|
|
base_model: unsloth/Qwen3-4B-Thinking-2507
|
|
tags:
|
|
- text-generation-inference
|
|
- transformers
|
|
- unsloth
|
|
- qwen3
|
|
language:
|
|
- en
|
|
datasets:
|
|
- TeichAI/MiniMax-M2.1-Code-SFT
|
|
---
|
|
|
|
# Qwen3 4B Thinking x MiniMax M2.1 Code SFT
|
|
|
|
This model was trained on over 1,300 agentic "vibe coding" examples generated by MiniMax M2.1 with a large majority focused on extracting UI/UX design capabilities across different tech stacks.
|
|
|
|
For more info on how and what the model was trained on, please view [the dataset card](https://huggingface.co/datasets/TeichAI/MiniMax-M2.1-Code-SFT)
|
|
|
|
## How to run
|
|
|
|
Personally I use vllm on windows via wsl. Here is my command:
|
|
|
|
```
|
|
vllm serve TeichAI/Qwen3-4B-Thinking-MiniMax-M2.1-Code-Distill --reasoning-parser deepseek_r1 --enable-auto-tool-choice --tool-call-parser hermes --max-model-len 65536 --quantization bitsandbytes --override-generation-config '{"temperature": 0.6, "top_p": 0.95, "top_k": 20}'
|
|
```
|
|
|
|
## Demo
|
|
|
|

|
|
|
|
Prompt: `Make me a landing page for my bakery. we make cakes, cookies, brownies and everything else a normal bakery makes. I want it to look really nice`
|
|
|
|
Needless to say I was impressed what this 4B model was capable of (especially at 4bit quant)
|
|
|
|
The site is overall very choppy and needs polishing, but please feel free to download the html file [(in the repo)](https://huggingface.co/TeichAI/Qwen3-4B-Thinking-MiniMax-M2.1-Code-Distill/tree/main/demo) and give it a look with all it's wonky animations :)
|
|
|
|
|
|
---
|
|
This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library. |