Model: qqggez/deepseek-parlay-6.7b
Source: Original Platform
2026-06-20 00:39:00 +08:00


---
base_model: deepseek-ai/deepseek-coder-6.7b-base
language:
- en
library_name: transformers
tags:
- deepseek
- code
- finetuned
- cpp
- parallel-computing
dtype: float16
pipeline_tag: text-generation
license: other
---
# Model Card for deepseek-parlay-6.7b
This model is part of the **ParEVO** framework, introduced in the paper [ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution](https://huggingface.co/papers/2603.02510).
- **Project Website:** [https://quanquancliu.com/ParEVO/index.html](https://quanquancliu.com/ParEVO/index.html)
- **GitHub Repository:** [https://github.com/WildAlg/ParEVO](https://github.com/WildAlg/ParEVO)
## Model Details
- **Base Model:** `deepseek-ai/deepseek-coder-6.7b-base`
- **Model Type:** C++ Parallel Code Generation Model
- **Language:** C++
- **Parameters:** 6.7B
## Intended Use
The model is fine-tuned to generate high-performance parallel algorithms for irregular data structures in C++. It is trained to use the composable primitives of the **ParlayLib** parallel programming library (e.g., `filter`, `pack`, `scan`, `sort`, `reduce`), with the aim of producing scalable and safe parallel code.
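A minimal sketch of how the model might be queried with `transformers` follows. It rests on several assumptions that the card does not confirm: that the repository id `qqggez/deepseek-parlay-6.7b` hosts merged weights loadable with `AutoModelForCausalLM`, that a plain completion-style prompt is appropriate (the base model is not chat-tuned), and that `accelerate` is installed for `device_map="auto"`. The helper names `build_prompt` and `generate` are hypothetical.

```python
MODEL_ID = "qqggez/deepseek-parlay-6.7b"  # assumed to host merged (non-adapter) weights


def build_prompt(task: str) -> str:
    """Render a task as C++ comments followed by ParlayLib includes.

    This prompt layout is an assumption, not a template documented by the authors.
    """
    comment = "\n".join(f"// {line}" for line in task.strip().splitlines())
    return (
        f"{comment}\n"
        "#include <parlay/primitives.h>\n"
        "#include <parlay/sequence.h>\n\n"
    )


def generate(task: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode a completion for the given task description."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(task), return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Keep only the newly generated tokens, dropping the prompt.
    completion = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Return the sum of all even numbers in a parlay::sequence<long>."))
```

Generated code should still be compiled and tested against ParlayLib before use; the sketch performs no verification.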
## Training Data
The model was trained on the **Parlay-Instruct Corpus**, a dataset of 13,820 verified tasks synthesized via an evolutionary "Teacher-Student-Critic" pipeline. The training dataset includes:
- Ground-truth samples covering ParlayLib's core primitives.
- DMOJ "slow-fast" code comparison pairs, constructed to identify optimal performance transformations rather than just functional correctness.
- Code validated with execution-based verification against a ground-truth C++ compiler oracle.
The training data is available in the GitHub repository: https://github.com/WildAlg/ParEVO
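To make the idea of execution-based verification and "slow-fast" pairs concrete, here is a rough sketch of an acceptance check. It is not the authors' pipeline: the `RunResult` record, the whitespace-insensitive output comparison, and the `min_speedup` threshold are all hypothetical choices made for illustration, and the compile-and-run step that would produce a `RunResult` is left out.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of compiling and running one C++ program (hypothetical record)."""
    compiled: bool
    stdout: str
    seconds: float


def outputs_match(candidate: str, reference: str) -> bool:
    # Compare token by token so trailing newlines and spacing do not matter.
    return candidate.split() == reference.split()


def accept_candidate(result: RunResult, expected_stdout: str) -> bool:
    """A candidate passes only if it compiled and reproduced the expected output."""
    return result.compiled and outputs_match(result.stdout, expected_stdout)


def is_slow_fast_pair(
    slow: RunResult,
    fast: RunResult,
    expected_stdout: str,
    min_speedup: float = 1.5,  # assumed threshold, not taken from the paper
) -> bool:
    """Keep a pair only if both programs are correct and `fast` is measurably faster."""
    if not (accept_candidate(slow, expected_stdout)
            and accept_candidate(fast, expected_stdout)):
        return False
    if fast.seconds <= 0:
        return False
    return slow.seconds / fast.seconds >= min_speedup
```

Requiring correctness of both programs before comparing runtimes keeps a pair from rewarding a transformation that is fast only because it is wrong.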
## Training Procedure
- **Algorithm:** Single-stage Supervised Fine-Tuning (SFT)
- **Method:** LoRA ($r=8$, $\alpha=16$) targeting the query and value projections
- **Learning Rate:** $2 \times 10^{-4}$
- **Precision:** FP16
- **Hardware:** NVIDIA RTX 5000 Ada
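
As a rough illustration of what these LoRA settings imply, the sketch below counts the trainable adapter parameters and shows an equivalent `peft` configuration. The architecture figures (hidden size 4096, 32 decoder layers, square query and value projections) and the module names `q_proj` and `v_proj` are assumptions based on the LLaMA-style layout of the base model, not values stated in this card; dropout and other hyperparameters are left at library defaults because the card does not give them.

```python
HIDDEN_SIZE = 4096   # assumed hidden size of deepseek-coder-6.7b-base
NUM_LAYERS = 32      # assumed number of decoder layers
RANK = 8             # r, from the training procedure above
ALPHA = 16           # alpha, from the training procedure above


def lora_param_count(rank: int, d_in: int, d_out: int,
                     modules_per_layer: int, num_layers: int) -> int:
    """Each adapted matrix adds A (rank x d_in) and B (d_out x rank)."""
    per_module = rank * (d_in + d_out)
    return per_module * modules_per_layer * num_layers


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scales the low-rank update by alpha / rank."""
    return alpha / rank


def make_lora_config():
    """Build a peft config matching the listed settings (module names assumed)."""
    from peft import LoraConfig  # imported lazily; requires the peft package

    return LoraConfig(
        r=RANK,
        lora_alpha=ALPHA,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )


if __name__ == "__main__":
    # Two adapted projections (query and value) in every layer.
    trainable = lora_param_count(RANK, HIDDEN_SIZE, HIDDEN_SIZE, 2, NUM_LAYERS)
    print(f"trainable LoRA parameters: {trainable:,}")
    print(f"update scaling factor: {lora_scaling(ALPHA, RANK)}")
```

Under these assumptions the adapters add roughly 4.2 million trainable parameters, a small fraction of the 6.7B base model, which is consistent with training on a single workstation GPU.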
## License
The ParEVO framework and datasets use a modular licensing structure to maximize open-source adoption, while the fine-tuned model weights inherit the license of their base model.
### 1. Model Weights License
The fine-tuned **`deepseek-parlay-6.7b`** model weights are a derivative work of `deepseek-ai/deepseek-coder-6.7b-base`. As such, the model weights and inference outputs are governed by the [DeepSeek License](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL). Users must comply with the original use-case restrictions and terms set by DeepSeek when using this model.
### 2. Software License (MIT License)
All software, scripts, the Evolutionary Coding Agent (ECA), and analysis code located in the [ParEVO repository](https://github.com/WildAlg/ParEVO) are licensed under the MIT License. Copyright (c) 2026 ParEVO Authors.
### 3. Dataset License (CC BY 4.0)
The Parlay-Instruct Corpus, ParEval evaluation trajectories, and DMOJ problem-solution datasets are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
## Citation
If you use this model or the ParEVO framework in your research, please cite:
```bibtex
@article{yang2026parevo,
title={ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution},
author={Yang, Liu and Nie, Zeyu and Liu, Andrew and Zou, Felix and Alt{\i}nb{\"u}ken, Deniz and Yazdanbakhsh, Amir and Liu, Quanquan C.},
journal={arXiv preprint arXiv:2603.02510},
year={2026}
}
```