Fix profile run for VL model (#5136)

### What this PR does / why we need it?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
shaopeng-666
2025-12-17 23:51:31 +08:00
committed by GitHub
parent 43d974c6f7
commit 39bdd4cfaa
4 changed files with 1 additions and 3 deletions

@@ -6,7 +6,6 @@ tasks:
metrics:
- name: "acc,none"
value: 0.58
max_model_len: 8192
tensor_parallel_size: 2
gpu_memory_utilization: 0.7
enable_expert_parallel: True

@@ -6,6 +6,5 @@ tasks:
metrics:
- name: "acc,none"
value: 0.55
max_model_len: 8192
batch_size: 32
gpu_memory_utilization: 0.7