Quantized with llama.cpp b7437, which is compatible with Ollama v0.17.4.
Available Quantizations
| File | Size | Description |
|------|------|-------------|
| SiliconMind-V1-Qwen3-8B-F16.gguf | 25 GB | Full precision (F16) |
| SiliconMind-V1-Qwen3-8B-Q8_0.gguf | 13 GB | 8-bit, highest quality |
| SiliconMind-V1-Qwen3-8B-Q6_K.gguf | 10 GB | 6-bit |
| SiliconMind-V1-Qwen3-8B-Q5_K_M.gguf | 8.8 GB | 5-bit medium |
| SiliconMind-V1-Qwen3-8B-Q4_K_M.gguf | 7.6 GB | 4-bit medium (recommended) |
| SiliconMind-V1-Qwen3-8B-Q3_K_L.gguf | 6.7 GB | 3-bit large |
| SiliconMind-V1-Qwen3-8B-Q3_K_M.gguf | 6.3 GB | 3-bit medium |
| SiliconMind-V1-Qwen3-8B-Q3_K_S.gguf | 5.7 GB | 3-bit small |
| SiliconMind-V1-Qwen3-8B-Q2_K.gguf | 5.0 GB | 2-bit, smallest |
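The file sizes above can serve as a rough guide when choosing a quantization for a given machine. The following is a minimal sketch, not an official selection rule: the function name `pick_quant` and the 2 GB `headroom_gb` default (a guess at space needed for context and runtime overhead) are assumptions made for illustration only.

```python
# File sizes in GB, copied from the list of quantizations above.
QUANT_SIZES_GB = {
    "F16": 25.0,
    "Q8_0": 13.0,
    "Q6_K": 10.0,
    "Q5_K_M": 8.8,
    "Q4_K_M": 7.6,
    "Q3_K_L": 6.7,
    "Q3_K_M": 6.3,
    "Q3_K_S": 5.7,
    "Q2_K": 5.0,
}


def pick_quant(available_gb, headroom_gb=2.0):
    """Return the largest quantization whose file fits in the memory budget.

    headroom_gb is a hypothetical allowance for context and runtime
    overhead; adjust it for your own setup. Returns None if nothing fits.
    """
    budget = available_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Larger files generally preserve more of the original model quality.
    return max(fitting, key=fitting.get)
```

For example, `pick_quant(16)` selects `Q8_0` under these assumptions, while a machine with only 6 GB available gets `None`.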
Usage
ollama run hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF
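To run a specific quantization, Ollama accepts a tag after the repository name, for example `hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF:Q8_0`. The sketch below shows one way to build that reference and query the model from Python through Ollama's local REST API. It assumes Ollama is listening on its default address (`localhost:11434`) and that the model has already been pulled, for instance by the command above; the helper names `model_ref`, `build_request`, and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

REPO = "hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF"  # from the command above
OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def model_ref(quant=None):
    """Return the model reference, optionally pinned to a quantization tag."""
    return f"{REPO}:{quant}" if quant else REPO


def build_request(prompt, quant=None):
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model_ref(quant), "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, quant=None):
    """Send the prompt to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, quant)) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `generate("...", quant="Q4_K_M")` would return the model's reply as a string; nothing is sent until `generate` is called.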
Example prompt:
I would like you to implement a module named TopModule with the following
interface. All input and output ports are one bit unless otherwise
specified.
- input in (3 bits)
- output out (2 bits)
The module should implement a "population count" circuit that counts the
number of '1's in the input vector.
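When checking the Verilog the model produces for this prompt, a small reference model of the expected behaviour can help. The sketch below is a Python stand-in for the specified circuit (3-bit input, 2-bit output), not output from the model; the names `popcount3` and `truth_table` are chosen here for illustration.

```python
def popcount3(value):
    """Count the '1' bits in a 3-bit input, as the prompt specifies."""
    if not 0 <= value <= 0b111:
        raise ValueError("input must fit in 3 bits")
    return bin(value).count("1")


def truth_table():
    """Return (in, out) pairs as binary strings for all 8 input values."""
    return [(format(i, "03b"), format(popcount3(i), "02b")) for i in range(8)]


if __name__ == "__main__":
    # Print the expected truth table for comparison with a simulation run.
    for bits_in, bits_out in truth_table():
        print(f"in={bits_in} out={bits_out}")
```

The largest possible count is 3, which is why a 2-bit output is sufficient. The printed table can be compared line by line against a testbench that sweeps all eight inputs of the generated `TopModule`.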