[BugFix] Fix some issues caused by the ascending order of cudagraph_capture_sizes (#4338)
### What this PR does / why we need it?
In [#26016](https://github.com/vllm-project/vllm/pull/26016), vLLM
changed `cudagraph_capture_sizes` to be sorted in ascending order. This PR
fixes the issues caused by that change.
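
As an illustration of the kind of issue an ordering change can cause (a sketch only, not code from this PR): code that reads the first element of the list as the largest capture size, or scans the list assuming descending order, gives wrong results once the list is ascending. The helper names `max_capture_size` and `padded_size` below are hypothetical and are not part of vLLM or vllm-ascend.

```python
from bisect import bisect_left


def max_capture_size(capture_sizes):
    """Return the largest capture size without relying on list order."""
    # Hypothetical helper: indexing capture_sizes[0] would only be correct
    # for a descending list, so take the maximum explicitly.
    return max(capture_sizes)


def padded_size(capture_sizes, num_tokens):
    """Return the smallest capture size >= num_tokens, or None if none fits."""
    # Sort a copy so the lookup works for ascending or descending input.
    ordered = sorted(capture_sizes)
    idx = bisect_left(ordered, num_tokens)
    return ordered[idx] if idx < len(ordered) else None


ascending = [1, 2, 4, 8, 16]
descending = list(reversed(ascending))

# Both orderings give the same answers.
print(max_capture_size(ascending), max_capture_size(descending))
print(padded_size(ascending, 5), padded_size(descending, 5))
```

Writing such lookups so that they do not depend on list order keeps them valid whichever ordering upstream vLLM uses.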
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
- vLLM version: v0.11.0
- vLLM main: 2918c1b49c
---------
Signed-off-by: Angazenn <supperccell@163.com>
@@ -109,7 +109,7 @@ def _run_worker_process(
     llm = LLM(
         model=model_path,
         quantization="ascend" if "W8A8" in model_path else None,
-        # enable_expert_parallel=True if "DeepSeek" in model_path else False,
+        enable_expert_parallel=True if "DeepSeek" in model_path else False,
         trust_remote_code=True,
     )