Initialize project; model provided by the ModelHub XC community
Model: qqggez/deepseek-parlay-6.7b Source: Original Platform
README.md (normal file, 72 lines)
@@ -0,0 +1,72 @@
---
base_model: deepseek-ai/deepseek-coder-6.7b-base
language:
- en
library_name: transformers
tags:
- deepseek
- code
- finetuned
- cpp
- parallel-computing
dtype: float16
pipeline_tag: text-generation
license: other
---

# Model Card for deepseek-parlay-6.7b

This model is part of the **ParEVO** framework, introduced in the paper [ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution](https://huggingface.co/papers/2603.02510).

- **Project Website:** [https://quanquancliu.com/ParEVO/index.html](https://quanquancliu.com/ParEVO/index.html)
- **GitHub Repository:** [https://github.com/WildAlg/ParEVO](https://github.com/WildAlg/ParEVO)

## Model Details
- **Base Model:** `deepseek-ai/deepseek-coder-6.7b-base`
- **Model Type:** C++ Parallel Code Generation Model
- **Language:** C++
- **Parameters:** 6.7B

## Intended Use
The model is fine-tuned to generate high-performance parallel algorithms for irregular data structures in C++. It uses the composable primitives of the **ParlayLib** parallel data structures library (e.g., `filter`, `pack`, `scan`, `sort`, `reduce`) to produce scalable and safe parallel code.
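
For readers unfamiliar with the primitives named above, the following is a minimal sequential sketch of what `reduce`, `scan`, `pack`, and `filter` compute. It is written in Python purely as reference semantics and is not ParlayLib's API: the function names `reduce_`, `scan`, `pack`, and `filter_` are illustrative, and the exclusive-scan convention (prefix results plus a total) is an assumption that should be checked against the ParlayLib documentation.

```python
from operator import add


def reduce_(xs, op=add, identity=0):
    """Combine all elements with an associative operator."""
    total = identity
    for x in xs:
        total = op(total, x)
    return total


def scan(xs, op=add, identity=0):
    """Exclusive prefix combine: out[i] covers xs[0..i-1]. Also returns the total."""
    out = []
    running = identity
    for x in xs:
        out.append(running)
        running = op(running, x)
    return out, running


def pack(xs, flags):
    """Keep xs[i] where flags[i] is true; scan supplies each output position."""
    offsets, count = scan([1 if f else 0 for f in flags])
    out = [None] * count
    for i, (x, f) in enumerate(zip(xs, flags)):
        if f:
            out[offsets[i]] = x  # every kept element writes to a distinct slot
    return out


def filter_(xs, pred):
    """Filter expressed as: evaluate the predicate, then pack."""
    return pack(xs, [pred(x) for x in xs])


data = [5, 2, 8, 1, 9, 3]
print(reduce_(data))                          # 28
print(scan(data))                             # ([0, 5, 7, 15, 16, 25], 28)
print(filter_(data, lambda x: x % 2 == 0))    # [2, 8]
```

The point of the decomposition is that each kept element's destination is known from the scan before any write happens, so the writes in `pack` are independent of one another. That independence is what makes such primitives composable building blocks for parallel code.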

## Training Data
The model was trained on the **Parlay-Instruct Corpus**, a dataset containing 13,820 verified tasks synthesized via an evolutionary "Teacher-Student-Critic" pipeline. The training dataset includes:
- Ground-truth samples covering ParlayLib's core primitives.
- DMOJ "slow-fast" code comparison pairs, constructed to identify optimal performance transformations rather than just functional correctness.
- Code validated with execution-based verification against a ground-truth C++ compiler oracle.

The training data is available in the GitHub repository: https://github.com/WildAlg/ParEVO
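
To make the idea of a "slow-fast" pair checked by execution concrete, here is a small hypothetical sketch in Python. It is not the ParEVO pipeline, which works on C++ code: the two prefix-sum functions, the `check_pair` helper, and the report fields are all invented for illustration. It only shows the general pattern of running both versions on the same inputs, requiring identical outputs, and recording how long each takes.

```python
import time


def prefix_sums_slow(xs):
    # Quadratic: re-sums the whole prefix for every position.
    return [sum(xs[: i + 1]) for i in range(len(xs))]


def prefix_sums_fast(xs):
    # Linear: carries a running total forward.
    out, running = [], 0
    for x in xs:
        running += x
        out.append(running)
    return out


def timed(fn, arg):
    """Run fn(arg) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


def check_pair(slow, fast, inputs):
    """Execute both versions on each input; the slow one acts as the oracle."""
    report = {"equivalent": True, "slow_seconds": 0.0, "fast_seconds": 0.0}
    for xs in inputs:
        expected, t_slow = timed(slow, list(xs))
        actual, t_fast = timed(fast, list(xs))
        report["slow_seconds"] += t_slow
        report["fast_seconds"] += t_fast
        if actual != expected:
            report["equivalent"] = False
    return report


inputs = [[], [7], list(range(200)), [3, -1, 4, -1, 5]]
report = check_pair(prefix_sums_slow, prefix_sums_fast, inputs)
print(report["equivalent"])  # True
```

A pair is only useful as a performance example if the fast version is first shown to be functionally equivalent, which is why the equivalence flag is computed before any timing comparison is trusted.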

## Training Procedure
- **Algorithm:** Single-stage Supervised Fine-Tuning (SFT)
- **Method:** LoRA ($r=8$, $\alpha=16$) targeting the query and value projections
- **Learning Rate:** $2 \times 10^{-4}$
- **Precision:** FP16
- **Hardware:** NVIDIA RTX 5000 Ada
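
As a rough worked example of what the LoRA settings above imply, the sketch below estimates the number of trainable adapter parameters and the update scaling factor $\alpha / r$. The hidden size (4096), the layer count (32), and the assumption that the query and value projections are square are not stated in this card; they are assumptions about the base model's architecture and should be verified against its configuration.

```python
def lora_params_per_projection(d_in, d_out, r):
    """LoRA adds A (r x d_in) and B (d_out x r) next to a frozen d_out x d_in weight."""
    return r * (d_in + d_out)


HIDDEN_SIZE = 4096  # assumption: hidden size of the base model
NUM_LAYERS = 32     # assumption: number of transformer layers
TARGETS_PER_LAYER = 2  # query and value projections, as stated in the card
R, ALPHA = 8, 16       # LoRA rank and alpha, as stated in the card

per_projection = lora_params_per_projection(HIDDEN_SIZE, HIDDEN_SIZE, R)
trainable = per_projection * TARGETS_PER_LAYER * NUM_LAYERS
scaling = ALPHA / R  # the low-rank update B @ A is multiplied by alpha / r

print(per_projection)  # 65536
print(trainable)       # 4194304
print(scaling)         # 2.0
```

Under these assumptions the adapter holds about 4.2 million trainable parameters, well under 0.1% of the 6.7B parameters in the base model, which is consistent with fine-tuning on a single workstation GPU.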

## License

The ParEVO framework and datasets use a modular licensing structure to maximize open-source adoption, while the fine-tuned model weights inherit the license of their base model.

### 1. Model Weights License
The fine-tuned **`deepseek-parlay-6.7b`** model weights are a derivative work of `deepseek-ai/deepseek-coder-6.7b-base`. As such, the model weights and inference outputs are governed by the [DeepSeek License](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL). Users must comply with the original use-case restrictions and terms set by DeepSeek when using this model.

### 2. Software License (MIT License)
All software, scripts, the Evolutionary Coding Agent (ECA), and analysis code located in the [ParEVO repository](https://github.com/WildAlg/ParEVO) are licensed under the MIT License. Copyright (c) 2026 ParEVO Authors.

### 3. Dataset License (CC BY 4.0)
The Parlay-Instruct Corpus, ParEval evaluation trajectories, and DMOJ problem-solution datasets are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

## Citation

If you use this model or the ParEVO framework in your research, please cite:

```bibtex
@inproceedings{yang2026parevo,
  title={ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution},
  author={Yang, Liu and Nie, Zeyu and Liu, Andrew and Zou, Felix and Altinb{\"u}ken, Deniz and Yazdanbakhsh, Amir and Liu, Quanquan C.},
  booktitle={arXiv Preprint},
  year={2026}
}
```