---
base_model: deepseek-ai/deepseek-coder-6.7b-base
language:
- en
library_name: transformers
tags:
- deepseek
- code
- finetuned
- cpp
- parallel-computing
dtype: float16
pipeline_tag: text-generation
license: other
---

# Model Card for deepseek-parlay-6.7b

This model is part of the **ParEVO** framework, introduced in the paper [ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution](https://huggingface.co/papers/2603.02510).

- **Project Website:** [https://quanquancliu.com/ParEVO/index.html](https://quanquancliu.com/ParEVO/index.html)
- **GitHub Repository:** [https://github.com/WildAlg/ParEVO](https://github.com/WildAlg/ParEVO)

## Model Details
- **Base Model:** `deepseek-ai/deepseek-coder-6.7b-base`
- **Model Type:** C++ Parallel Code Generation Model
- **Language:** C++
- **Parameters:** 6.7B

## Intended Use
The model is fine-tuned to generate high-performance parallel algorithms for irregular data structures in C++. It uses the composable primitives of the **ParlayLib** parallel data structures library (e.g., `filter`, `pack`, `scan`, `sort`, `reduce`) to produce scalable and safe parallel code.
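
The following is a minimal inference sketch with `transformers`, not an official usage recipe. It assumes the fine-tuned weights are published as a merged checkpoint on the Hub; `MODEL_ID` is a placeholder rather than a confirmed repository path. The card does not document a prompt template, so `build_prompt` is a hypothetical helper that frames the task as C++ comments followed by a ParlayLib include.

```python
MODEL_ID = "your-org/deepseek-parlay-6.7b"  # placeholder: replace with the actual Hub repo ID


def build_prompt(task: str) -> str:
    """Hypothetical prompt format; the model card does not document a template."""
    return (
        "// Task: " + task.strip() + "\n"
        "// Use ParlayLib primitives (e.g. filter, pack, scan, sort, reduce).\n"
        "#include <parlay/primitives.h>\n"
    )


def generate(task: str, max_new_tokens: int = 256) -> str:
    """Load the model in FP16 and greedily decode a completion for `task`."""
    # Imported lazily so the prompt helper works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the `accelerate` package.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(task), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and return only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Sum the even elements of a sequence in parallel")` would then return the model's C++ completion. Generated code should still be compiled and tested before use.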

## Training Data
The model was trained on the **Parlay-Instruct Corpus**, a dataset of 13,820 verified tasks synthesized via an evolutionary "Teacher-Student-Critic" pipeline. The training dataset includes:
- Ground-truth samples covering ParlayLib's core primitives.
- DMOJ "slow-fast" code comparison pairs, constructed to identify optimal performance transformations rather than just functional correctness.
- Code validated with execution-based verification against a ground-truth C++ compiler oracle.

The training data is available in the GitHub repository: https://github.com/WildAlg/ParEVO
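
To illustrate what execution-based verification can look like, here is a rough sketch; it is not the ParEVO pipeline. It assumes `g++` is on the `PATH`, that test cases are given as (stdin, expected stdout) pairs, and that `include_dir` (a hypothetical parameter) points at the ParlayLib headers when the candidate needs them.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple


def outputs_match(actual: str, expected: str) -> bool:
    """Compare program output token by token, ignoring whitespace differences."""
    return actual.split() == expected.split()


def verify(source: str, cases: List[Tuple[str, str]],
           include_dir: Optional[str] = None, timeout_s: float = 10.0) -> bool:
    """Compile `source` with g++ and check it against (stdin, expected stdout) pairs."""
    with tempfile.TemporaryDirectory() as tmp:
        src, exe = Path(tmp) / "candidate.cpp", Path(tmp) / "candidate"
        src.write_text(source)
        cmd = ["g++", "-std=c++17", "-O2", "-pthread", str(src), "-o", str(exe)]
        if include_dir:  # assumed location of the ParlayLib headers
            cmd += ["-I", include_dir]
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            return False  # a compilation failure rejects the sample
        for stdin_text, expected in cases:
            try:
                run = subprocess.run([str(exe)], input=stdin_text, text=True,
                                     capture_output=True, timeout=timeout_s)
            except subprocess.TimeoutExpired:
                return False  # a hang or slow run rejects the sample
            if run.returncode != 0 or not outputs_match(run.stdout, expected):
                return False
    return True
```

A sample is accepted only if it compiles and reproduces every expected output within the time limit. Comparing whitespace-separated tokens is a design choice that tolerates trailing newlines but would be too loose for whitespace-sensitive tasks.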

## Training Procedure
- **Algorithm:** Single-stage Supervised Fine-Tuning (SFT)
- **Method:** LoRA ($r=8$, $\alpha=16$) targeting the query and value projections
- **Learning Rate:** $2 \times 10^{-4}$
- **Precision:** FP16
- **Hardware:** NVIDIA RTX 5000 Ada
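
The hyperparameters above can be written down as a small sketch. Several details are assumptions rather than facts from this card: the module names `q_proj` and `v_proj` (the Llama-style names for the query and value projections), the 4096-wide square projections and 32 layers used for the parameter count, and the use of `peft`'s `LoraConfig`, which is one common way to express such a setup and not necessarily the authors' training code.

```python
# Values taken from the list above; module names are an assumption.
LORA = {"r": 8, "lora_alpha": 16, "target_modules": ["q_proj", "v_proj"]}
LEARNING_RATE = 2e-4


def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scaling: the low-rank update B @ A is multiplied by alpha / r."""
    return alpha / r


def lora_trainable_params(r: int, hidden_size: int, num_layers: int, num_targets: int) -> int:
    """Count adapter weights, assuming each target is a square hidden_size x hidden_size projection.

    Each adapted projection gains A (r x hidden_size) and B (hidden_size x r).
    """
    return num_layers * num_targets * r * (hidden_size + hidden_size)


def to_peft_config():
    """Build an equivalent peft configuration (requires the `peft` package)."""
    from peft import LoraConfig  # imported lazily; only needed for actual training

    return LoraConfig(
        r=LORA["r"],
        lora_alpha=LORA["lora_alpha"],
        target_modules=LORA["target_modules"],
        task_type="CAUSAL_LM",
    )


if __name__ == "__main__":
    print(lora_scaling(LORA["r"], LORA["lora_alpha"]))
    # Assumed base-model shape: hidden size 4096, 32 transformer layers.
    print(lora_trainable_params(LORA["r"], 4096, 32, len(LORA["target_modules"])))
```

Under the assumed shape, the adapters add 4,194,304 trainable weights, roughly 0.06% of the 6.7B base parameters, with the low-rank update scaled by 2.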

## License

The ParEVO framework and datasets use a modular licensing structure to maximize open-source adoption, while the fine-tuned model weights inherit the license of their base model.

### 1. Model Weights License
The fine-tuned **`deepseek-parlay-6.7b`** model weights are a derivative work of `deepseek-ai/deepseek-coder-6.7b-base`. As such, the model weights and inference outputs are governed by the [DeepSeek License](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL). Users must comply with the original use-case restrictions and terms set by DeepSeek when using this model.

### 2. Software License (MIT License)
All software, scripts, the Evolutionary Coding Agent (ECA), and analysis code located in the [ParEVO repository](https://github.com/WildAlg/ParEVO) are licensed under the MIT License. Copyright (c) 2026 ParEVO Authors.

### 3. Dataset License (CC BY 4.0)
The Parlay-Instruct Corpus, ParEval evaluation trajectories, and DMOJ problem-solution datasets are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

## Citation

If you use this model or the ParEVO framework in your research, please cite:

```bibtex
@inproceedings{yang2026parevo,
  title={ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution},
  author={Yang, Liu and Nie, Zeyu and Liu, Andrew and Zou, Felix and Altinb{\"u}ken, Deniz and Yazdanbakhsh, Amir and Liu, Quanquan C.},
  booktitle={arXiv Preprint},
  year={2026}
}
```