---
license: apache-2.0
license_link: https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen3-8B/blob/main/LICENSE
language:
- en
base_model:
- AS-SiliconMind/SiliconMind-V1-Qwen3-8B
pipeline_tag: text-generation
tags:
- verilog
- reasoning
- multi-agent
- gguf
- quantized
- llama.cpp
- ollama
---

# SiliconMind-V1-Qwen3-8B GGUF

GGUF quantizations of [AS-SiliconMind/SiliconMind-V1-Qwen3-8B](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen3-8B), an 8B-parameter model specialized for Verilog code generation, testing, and debugging.

Quantized with [llama.cpp](https://github.com/ggml-org/llama.cpp) build b7437, which is compatible with Ollama v0.17.4.

## Available Quantizations

| File | Size | Description |
|------|------|-------------|
| SiliconMind-V1-Qwen3-8B-F16.gguf | 25 GB | Unquantized 16-bit float (F16) |
| SiliconMind-V1-Qwen3-8B-Q8_0.gguf | 13 GB | 8-bit, highest quality |
| SiliconMind-V1-Qwen3-8B-Q6_K.gguf | 10 GB | 6-bit |
| SiliconMind-V1-Qwen3-8B-Q5_K_M.gguf | 8.8 GB | 5-bit medium |
| SiliconMind-V1-Qwen3-8B-Q4_K_M.gguf | 7.6 GB | 4-bit medium **(recommended)** |
| SiliconMind-V1-Qwen3-8B-Q3_K_L.gguf | 6.7 GB | 3-bit large |
| SiliconMind-V1-Qwen3-8B-Q3_K_M.gguf | 6.3 GB | 3-bit medium |
| SiliconMind-V1-Qwen3-8B-Q3_K_S.gguf | 5.7 GB | 3-bit small |
| SiliconMind-V1-Qwen3-8B-Q2_K.gguf | 5.0 GB | 2-bit, smallest |

## Usage

```bash
ollama run hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF
```
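
The command above does not name a quantization. Ollama's Hugging Face integration also accepts a quantization tag after the repository name, so a specific file from the table can be requested. A minimal sketch, assuming the tag names match the suffixes in the file names above; `model_ref` is a hypothetical helper for illustration, not part of Ollama:

```bash
# Repository path taken from the usage command above.
REPO="hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF"

# model_ref is a hypothetical helper: it appends a quantization tag
# (for example Q4_K_M or Q8_0) to the repository path.
model_ref() {
  printf '%s:%s\n' "$REPO" "$1"
}

# Print the reference for the recommended quantization.
model_ref Q4_K_M
```

To start a session with that quantization, pass the result to Ollama: `ollama run "$(model_ref Q4_K_M)"`.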

Example prompt:
```
I would like you to implement a module named TopModule with the following
interface. All input and output ports are one bit unless otherwise
specified.

- input in (3 bits)
- output out (2 bits)

The module should implement a "population count" circuit that counts the
number of '1's in the input vector.
```
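
For scripted use, the prompt can be passed to `ollama run` as an argument instead of being typed into the interactive session. The sketch below saves the example prompt to a file and defines a helper that sends it to the model. The file name `prompt.txt` and the helper name `ask_model` are assumptions chosen for illustration.

```bash
# Save the example prompt to a file (prompt.txt is an arbitrary name).
cat > prompt.txt <<'EOF'
I would like you to implement a module named TopModule with the following
interface. All input and output ports are one bit unless otherwise
specified.

- input in (3 bits)
- output out (2 bits)

The module should implement a "population count" circuit that counts the
number of '1's in the input vector.
EOF

# Repository path taken from the usage command above.
MODEL="hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF"

# ask_model is a hypothetical helper: it sends the contents of a prompt
# file to the model as a single non-interactive request.
ask_model() {
  ollama run "$MODEL" "$(cat "$1")"
}
```

Calling `ask_model prompt.txt` prints the model's reply to standard output, which can be redirected to a file for review.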