Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction following, agent capabilities, and multilingual support.
Model Files
| File Name | Size | Quantization | Format | Description |
|---|---|---|---|---|
| Qwen3_8B.F32.gguf | 32.8 GB | FP32 | GGUF | Full precision (float32) version |
| Qwen3_8B.BF16.gguf | 16.4 GB | BF16 | GGUF | BFloat16 precision version |
| Qwen3_8B.F16.gguf | 16.4 GB | FP16 | GGUF | Float16 precision version |
| Qwen3_8B.Q2_K.gguf | 3.28 GB | Q2_K | GGUF | 2-bit quantized (K variant) |
| Qwen3_8B.Q3_K_M.gguf | 4.12 GB | Q3_K_M | GGUF | 3-bit quantized (K M variant) |
| Qwen3_8B.Q3_K_S.gguf | 3.77 GB | Q3_K_S | GGUF | 3-bit quantized (K S variant) |
| Qwen3_8B.Q4_K_M.gguf | 5.03 GB | Q4_K_M | GGUF | 4-bit quantized (K M variant) |
| Qwen3_8B.Q4_K_S.gguf | 4.8 GB | Q4_K_S | GGUF | 4-bit quantized (K S variant) |
| Qwen3_8B.Q5_K_M.gguf | 5.85 GB | Q5_K_M | GGUF | 5-bit quantized (K M variant) |
| Qwen3_8B.Q8_0.gguf | 8.71 GB | Q8_0 | GGUF | 8-bit quantized |
| .gitattributes | 2.08 kB | — | — | Git LFS tracking file |
| config.json | 31 B | — | — | Configuration placeholder |
| README.md | 31 B | — | — | Model documentation |
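
The file sizes listed above give a rough sense of how many bits each quantization spends per weight. The following is a minimal sketch, assuming the F32 file stores every weight at 32 bits and that file size is dominated by weights; the function name `effective_bits_per_weight` is illustrative, and the results are estimates rather than exact format specifications.

```python
# File sizes in GB, copied from the Model Files listing.
SIZES_GB = {
    "F32": 32.8,
    "BF16": 16.4,
    "F16": 16.4,
    "Q2_K": 3.28,
    "Q3_K_S": 3.77,
    "Q3_K_M": 4.12,
    "Q4_K_S": 4.8,
    "Q4_K_M": 5.03,
    "Q5_K_M": 5.85,
    "Q8_0": 8.71,
}


def effective_bits_per_weight(quant: str) -> float:
    """Estimate average bits per weight relative to the F32 file.

    Assumption: the F32 file is exactly 32 bits per weight and metadata
    overhead is negligible, so size ratios approximate bit-width ratios.
    """
    return 32.0 * SIZES_GB[quant] / SIZES_GB["F32"]


if __name__ == "__main__":
    # Print the files from smallest to largest with their estimated bit width.
    for name in sorted(SIZES_GB, key=SIZES_GB.get):
        bits = effective_bits_per_weight(name)
        print(f"{name:7s} {SIZES_GB[name]:6.2f} GB  ~{bits:.2f} bits/weight")
```

The estimates come out somewhat above the nominal bit width in each name (for example, Q4_K_M lands near 4.9 bits), which is expected if the K-quant files mix tensor precisions and store per-block scales.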
Quants Usage
(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants.)
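
Since size is the practical constraint when choosing a quant, one way to pick a file is to take the largest one that fits the available memory. The sketch below assumes the sizes from the Model Files listing; `largest_fitting` and its `headroom_gb` parameter (a reserve for context cache and runtime overhead) are hypothetical names, and the 1 GB default is an arbitrary placeholder, not a measured requirement.

```python
from typing import Optional

# Quantized files and their sizes in GB, from the Model Files listing.
QUANTS = [
    ("Q2_K", 3.28),
    ("Q3_K_S", 3.77),
    ("Q3_K_M", 4.12),
    ("Q4_K_S", 4.8),
    ("Q4_K_M", 5.03),
    ("Q5_K_M", 5.85),
    ("Q8_0", 8.71),
]


def largest_fitting(budget_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest quant whose file fits in budget_gb minus headroom_gb.

    Returns None when even the smallest file does not fit.
    Larger is used as a proxy for quality, which is only a heuristic.
    """
    usable = budget_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANTS if size <= usable]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for budget in (4.0, 6.0, 8.0, 12.0):
        print(f"{budget:5.1f} GB budget -> {largest_fitting(budget)}")
```

Actual memory use depends on context length and the runtime, so the headroom value should be adjusted to the deployment at hand.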