SiliconMind-V1-Qwen2.5-C-7B-I/README.md

---
license: apache-2.0
license_link: https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen2.5-C-7B-I/blob/main/LICENSE
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
pipeline_tag: text-generation
tags:
- verilog
- reasoning
- multi-agent
---

<p align="center">
  <img alt="SiliconMind Logo" src="https://raw.githubusercontent.com/AS-SiliconMind/SiliconMind-V1/refs/heads/gh-pages/images/logo.webp"/>
</p>

# SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation

## Model Overview
**SiliconMind-V1** is a family of open-source Large Language Models (LLMs) specialized for Verilog code generation, testing, and debugging. Unlike previous approaches that rely heavily on commercial models or external EDA tools, SiliconMind-V1 is locally fine-tuned to iteratively **generate**, **test**, and **debug** RTL designs through test-time scaling.

The **SiliconMind-V1** models are enabled by a unified multi-agent framework for reasoning-oriented training data generation with integrated testbench-driven verification to achieve state-of-the-art functional correctness on major benchmarks.

**Key Features:**
* **Reasoning-Oriented:** Trained to "think" before coding, producing reasoning traces that guide functional correctness.
* **Self-Testing & Debugging:** Capable of generating its own test report to fix bugs without tool-calling.
* **Tool-Free Verification:** Reduces reliance on expensive, proprietary EDA software during the generation loop.
* **Multi-Strategy Inference:** Supports Regular, Deep Thinking, and Agentic inference modes for scalable performance.

## Model Variants
We provide SiliconMind-V1 variants fine-tuned from the following base models:

| Model Name | Base Model | Size |
|:---|:---|:---|
| [**SiliconMind-V1-Qwen2.5-C-7B-I**](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen2.5-C-7B-I) | Qwen2.5-Coder-7B-Instruct | 7B |
| [**SiliconMind-V1-Qwen3-4B-T-2507**](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Olmo-3-7B-Think) | Qwen3-4B-Thinking-2507 | 4B |
| [**SiliconMind-V1-Qwen3-8B**](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen3-8B) | Qwen3-8B | 8B |
| [**SiliconMind-V1-Olmo-3-7B-Think**](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen3-4B-T-2507) | Olmo-3-7B-Think | 7B |

### Model Sources

- **Project Page:** https://AS-SiliconMind.github.io/SiliconMind-V1
- **Repositories:**
    - Inference Engine: https://github.com/AS-SiliconMind/SiliconMind-V1
- **Paper:** arxiv

## Usage & Inference Strategies

SiliconMind-V1 is designed to work with three distinct inference strategies, allowing users to trade off between latency/cost and accuracy. Please refer to our [inference engine](https://github.com/AS-SiliconMind/SiliconMind-V1) for more details on how to get started with **SiliconMind-V1**.

### 1. Regular Strategy
The model acts as a standard code generator but is prompted to produce a reasoning trace before the final code.
* **Best for:** Quick prototyping and simple modules.

### 2. Deep Thinking Strategy
Explicit instructions are given to the model to solve the problem by:
1.  Drafting an initial solution.
2.  Mentally "testing" it against scenarios.
3.  Self-debugging within the reasoning trace.
* **Best for:** Complex logic where single-pass generation often fails.

### 3. Agentic Strategy (Recommended for SOTA Results)
A multi-turn workflow where the model plays different "Agent" roles sequentially:
1.  **Solution Agent:** Generates initial code + reasoning.
2.  **Test Agent:** Generates a test report for the code.
3.  **Debug Agent:** Reviews the test report and fixes errors.
* **Performance:** Achieves the highest pass rates (Pass@1) by allowing iterative refinement (up to 3 interactions recommended).

## Training
The models were trained on a Multi-Faceted Dataset constructed via a custom two-phase pipeline:

* **Code Generation Phase:** A multi-agent system (Revision, Solution, Testbench, Verification Agents) synthesized 36k functionally verified (problem, reasoning, code, testbench) tuples from public sources.

* **Self-Correction Phase:** The model was stress-tested against these problems. Hard samples (where the model failed) were augmented with "Test" and "Debug" curriculum, teaching the model how to write test reports and fix its own errors.

## Evaluation: Pass@1 Performance (%) Across Major Verilog Benchmarks

| Model Name | Base Model | RTLLM-v2 | VerilogEval-v2 | VerilogEval-v2-NTU | CVDP-cid02&03 |
| :--- | :--- | :---: | :---: | :---: | :---: |
| *Foundation Models:* | | | | | |
| DeepSeek-R1-0528 | -- | 68.7 | 80.9 | 86.4 | 25.6 |
| gpt-oss-120b (high) | -- | 70.0 | 83.2 | 87.9 | 27.6 |
| Qwen3-32B | -- | 55.4 | 70.3 | 76.3 | 12.8 |
| Qwen3-14B | -- | 50.0 | 64.2 | 69.5 | 12.9 |
| | | | | | |
| Qwen2.5-C-7B-I | -- | 29.3 | 31.5 | 33.6 | 7.3 |
| Qwen3-4B-T-2507 | -- | 36.4 | 48.2 | 52.5 | 12.4 |
| Qwen3-8B | -- | 40.2 | 53.7 | 57.4 | 11.9 |
| Olmo-3-7B-Think | -- | 10.4 | 7.8 | 8.9 | 1.2 |
| *Fine-tuned Models:* | | | | | |
| CodeV-R1-7B-Distill | Qwen2.5-C-7B-I | 58.5 | 66.4 | 69.6 | 19.0 |
| CodeV-R1-7B | Qwen2.5-C-7B-I | 🥉 **66.1** | **69.7** | 73.2 | 21.3 |
| **SiliconMind-V1** | Qwen2.5-C-7B-I | 63.8 | **69.7** | **73.9** | 🥉 **22.3** |
| **SiliconMind-V1** | Qwen3-4B-T-2507 | 🥇 67.9 | 🥈 76.4 | 🥇 82.0 | 🥈 23.5 |
| **SiliconMind-V1** | Qwen3-8B | 🥈 66.6 | 🥇 76.5 | 🥈 81.0 | 🥇 24.0 |
| **SiliconMind-V1** | Olmo-3-7B-Think | 63.3 | 🥉 73.5 | 🥉 79.5 | 21.2 |

<br>

**Note:** - **Bold** values denote the better-performing model between CodeV-R1 and ours using the same base model.
- Rankings among specialized models: 🥇 First, 🥈 Second, 🥉 Third.
- For brevity, we refer to *Qwen2.5-Coder-7b-Instruct* as *Qwen2.5-C-7B-I* and *Qwen3-4B-Thinking-2507* as *Qwen3-4B-T-2507*.
- **SiliconMind-V1** models' results were obtained using the Agentic Strategy, and we allow up to 3 Test/Debug Agent interactions.

## License
**SiliconMind-V1** is licensed under [Apache 2.0](https://huggingface.co/AS-SiliconMind/SiliconMind-V1-Qwen2.5-C-7B-I/blob/main/LICENSE).
<br>
The base models' licenses:
[Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/LICENSE),
[Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE),
[Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE),
[Olmo-3-7B-Think](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) ([Responsible Use Guidelines](https://allenai.org/responsible-use)).

## Acknowledgements
We acknowledge the financial support from Academia Sinica's SiliconMind Project (AS-IAIA-114-M11). We also thank the National Center for High-Performance Computing (NCHC) for providing computational and storage resources, and Taipei-1 for providing H100 computing resources. In addition, we acknowledge financial support from the National Science and Technology Council.

## Citation

**BibTeX:**
```
@misc{Chen2026SiliconMindV1,
  title  = {{SiliconMind-V1}: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation},
  author = {Mu-Chi Chen and Yu-Hung Kao and Po-Hsuan Huang and Shao-Chun Ho
            and Hsiang-Yu Tsou and I-Ting Wu and En-Ming Huang
            and Yu-Kai Hung and Wei-Po Hsin and Cheng Liang
            and Chia-Heng Tu and Shih-Hao Hung and H.T. Kung},
  year   = {2026},
  url    = {https://AS-SiliconMind.github.io/SiliconMind-V1}
}
```