### What this PR does / why we need it?
Temporary fix for the OOM issue; we will switch to vLLM's plan later.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
E2E tests and unit tests.
- vLLM version: v0.11.0
- vLLM main: 2918c1b49c
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>