初始化项目,由ModelHub XC社区提供模型
Model: VANTAR-AI/nuro-copilot-7b Source: Original Platform
This commit is contained in:
61
README.md
Normal file
61
README.md
Normal file
@@ -0,0 +1,61 @@
|
||||
---
|
||||
language: en
|
||||
license: apache-2.0
|
||||
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
||||
tags:
|
||||
- spiking-neural-networks
|
||||
- neuromorphic-computing
|
||||
- code-generation
|
||||
- lora
|
||||
- unsloth
|
||||
- nuro-sdk
|
||||
pipeline_tag: text-generation
|
||||
---
|
||||
|
||||
# NeuroCopilot-7B
|
||||
|
||||
**NeuroCopilot is the first AI coding assistant fine-tuned specifically for neuromorphic computing — turning natural language into deployable spiking neural network code for Intel Loihi 2, SpiNNaker2, and Vantar Cloud.**
|
||||
Built by [Vantar AI](https://vantar.xyz) on top of Qwen2.5-Coder-7B, it bridges the gap between traditional deep learning and the next generation of brain-inspired hardware.
|
||||
|
||||
## Model Details
|
||||
|
||||
- **Base model:** Qwen/Qwen2.5-Coder-7B-Instruct
|
||||
- **Fine-tuning method:** QLoRA (r=64, alpha=128) via Unsloth
|
||||
- **Training data:** ~416 (instruction, Nuro SDK code) pairs generated via OSS-Instruct from 9,654 snippets across SpikingJelly, Intel Lava, snnTorch, Norse, BindsNET, Rockpool, Nengo, and NIR
|
||||
- **Hardware:** RTX 4090 (RunPod)
|
||||
- **Quantization:** 4-bit QLoRA during training; merged to bf16 safetensors for inference
|
||||
|
||||
## What is the Nuro SDK?
|
||||
|
||||
Nuro is a Python SDK for building, training, and deploying spiking neural networks (SNNs) to neuromorphic hardware:
|
||||
|
||||
|
||||
|
||||
## Supported Hardware Targets
|
||||
|
||||
| Target | Description |
|
||||
|--------|-------------|
|
||||
| | CUDA GPU (simulation) |
|
||||
| | Intel Loihi 2 neuromorphic chip |
|
||||
| | SpiNNaker 2 (Manchester) |
|
||||
| | Vantar Cloud (managed neuromorphic) |
|
||||
|
||||
## Usage
|
||||
|
||||
|
||||
|
||||
## Training Details
|
||||
|
||||
- **Epochs:** 3
|
||||
- **Batch size:** 2 (effective 16 with gradient accumulation)
|
||||
- **Learning rate:** 2e-4 (cosine schedule)
|
||||
- **Final train loss:** 0.4349
|
||||
- **Training time:** ~5.5 minutes on RTX 4090
|
||||
|
||||
## License
|
||||
|
||||
Apache 2.0 — same as the base Qwen2.5-Coder model.
|
||||
|
||||
## Citation
|
||||
|
||||
|
||||
Reference in New Issue
Block a user