[v0.11.0-dev][Bugfix][cherry-pick]bugfix for weight load of kimi-k2 (#4190)
### What this PR does / why we need it?
This is a cherry-pick of #3798.
It fixes a kimi-k2 startup failure that occurs during weight loading.
Error report: https://github.com/vllm-project/vllm-ascend/issues/3785
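
For context, here is a minimal sketch of why a missing `kimi_k2` entry in `packed_modules_model_mapping` could break weight loading. It assumes the mapping is looked up by model type to expand fused module names into per-checkpoint names; the helpers `get_packed_modules` and `expand_packed_name` are hypothetical and are not part of vllm-ascend. Only the `kimi_k2` entry mirrors the diff in this commit.

```python
# Trimmed-down, illustrative copy of the mapping. Only the "kimi_k2" entry
# mirrors the diff in this commit; the real table holds more model types.
packed_modules_model_mapping = {
    "kimi_k2": {
        "gate_up_proj": ["gate_proj", "up_proj"],
        "experts": [
            "experts.0.gate_proj",
            "experts.0.up_proj",
            "experts.0.down_proj",
        ],
    },
}


def get_packed_modules(model_type: str) -> dict[str, list[str]]:
    """Hypothetical helper: return the packed-module table for a model type."""
    try:
        return packed_modules_model_mapping[model_type]
    except KeyError:
        # Assumption: without an entry, the loader cannot map fused modules
        # back to checkpoint names, so the model fails to start.
        raise ValueError(
            f"no packed-module mapping for model type {model_type!r}"
        ) from None


def expand_packed_name(model_type: str, fused_name: str) -> list[str]:
    """Hypothetical helper: expand a fused module name into its sub-modules.

    Names that are not fused are returned unchanged, as a one-element list.
    """
    return get_packed_modules(model_type).get(fused_name, [fused_name])
```

With the entry present, `expand_packed_name("kimi_k2", "gate_up_proj")` returns `["gate_proj", "up_proj"]`. In this sketch, a model type with no entry raises `ValueError` instead of loading.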
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.11.0rc3
- vLLM main: c9461e05a4
---------
Signed-off-by: Levi-JQ <yujinqi2@huawei.com>
Signed-off-by: menogrey <1299267905@qq.com>
Co-authored-by: Levi <54832289+Levi-JQ@users.noreply.github.com>
Co-authored-by: Levi-JQ <yujinqi2@huawei.com>
Co-authored-by: zhaozx-cn <zhaozx2116@163.com>
8
.github/workflows/release_whl.yml
vendored
@@ -57,7 +57,13 @@ jobs:
       - name: Print
         run: |
           lscpu
-
+
+      - name: Free up disk space
+        uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1
+        with:
+          tool-cache: true
+          docker-images: false
+
       - name: Build wheel
         run: |
           ls
@@ -193,6 +193,11 @@ packed_modules_model_mapping = {
         ["experts.0.gate_proj", "experts.0.up_proj", "experts.0.down_proj"],
         "fused_qkv_a_proj": ["q_a_proj", "kv_a_proj_with_mqa"]
     },
+    "kimi_k2": {
+        "gate_up_proj": ["gate_proj", "up_proj"],
+        "experts":
+        ["experts.0.gate_proj", "experts.0.up_proj", "experts.0.down_proj"]
+    },
     "deepseek_v32": {
         "gate_up_proj": ["gate_proj", "up_proj"],
         "experts":