62 lines
1.7 KiB
Markdown
62 lines
1.7 KiB
Markdown
|
|
---
|
||
|
|
language: en
|
||
|
|
license: apache-2.0
|
||
|
|
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
||
|
|
tags:
|
||
|
|
- spiking-neural-networks
|
||
|
|
- neuromorphic-computing
|
||
|
|
- code-generation
|
||
|
|
- lora
|
||
|
|
- unsloth
|
||
|
|
- nuro-sdk
|
||
|
|
pipeline_tag: text-generation
|
||
|
|
---
|
||
|
|
|
||
|
|
# NeuroCopilot-7B
|
||
|
|
|
||
|
|
**NeuroCopilot is the first AI coding assistant fine-tuned specifically for neuromorphic computing — turning natural language into deployable spiking neural network code for Intel Loihi 2, SpiNNaker2, and Vantar Cloud.**
|
||
|
|
Built by [Vantar AI](https://vantar.xyz) on top of Qwen2.5-Coder-7B, it bridges the gap between traditional deep learning and the next generation of brain-inspired hardware.
|
||
|
|
|
||
|
|
## Model Details
|
||
|
|
|
||
|
|
- **Base model:** Qwen/Qwen2.5-Coder-7B-Instruct
|
||
|
|
- **Fine-tuning method:** QLoRA (r=64, alpha=128) via Unsloth
|
||
|
|
- **Training data:** ~416 (instruction, Nuro SDK code) pairs generated via OSS-Instruct from 9,654 snippets across SpikingJelly, Intel Lava, snnTorch, Norse, BindsNET, Rockpool, Nengo, and NIR
|
||
|
|
- **Hardware:** RTX 4090 (RunPod)
|
||
|
|
- **Quantization:** 4-bit QLoRA during training; merged to bf16 safetensors for inference
|
||
|
|
|
||
|
|
## What is the Nuro SDK?
|
||
|
|
|
||
|
|
Nuro is a Python SDK for building, training, and deploying spiking neural networks (SNNs) to neuromorphic hardware:
|
||
|
|
|
||
|
|
|
||
|
|
|
||
|
|
## Supported Hardware Targets
|
||
|
|
|
||
|
|
| Target | Description |
|
||
|
|
|--------|-------------|
|
||
|
|
| | CUDA GPU (simulation) |
|
||
|
|
| | Intel Loihi 2 neuromorphic chip |
|
||
|
|
| | SpiNNaker 2 (Manchester) |
|
||
|
|
| | Vantar Cloud (managed neuromorphic) |
|
||
|
|
|
||
|
|
## Usage
|
||
|
|
|
||
|
|
|
||
|
|
|
||
|
|
## Training Details
|
||
|
|
|
||
|
|
- **Epochs:** 3
|
||
|
|
- **Batch size:** 2 (effective 16 with gradient accumulation)
|
||
|
|
- **Learning rate:** 2e-4 (cosine schedule)
|
||
|
|
- **Final train loss:** 0.4349
|
||
|
|
- **Training time:** ~5.5 minutes on RTX 4090
|
||
|
|
|
||
|
|
## License
|
||
|
|
|
||
|
|
Apache 2.0 — same as the base Qwen2.5-Coder model.
|
||
|
|
|
||
|
|
## Citation
|
||
|
|
|
||
|
|
|