xc-llm-ascend/vllm_ascend/sample
FuNanyang 1b5513aa91 [performance] Enhance performance after enabling min_p (#4529)
### What this PR does / why we need it?
When the min_p post-processing parameter is enabled, the original vLLM
implementation introduces the aclnnIndexPutImpl operator, which
performs poorly on NPU.
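
As an illustration of the problem described above, here is a minimal sketch of min_p filtering in PyTorch, written two ways. The first variant uses boolean-mask assignment (`logits[mask] = ...`), which PyTorch routes through its index-put path; this is assumed to be the pattern that surfaces as the index-put operator on NPU. The second variant uses `masked_fill`, an elementwise alternative that produces the same result. The function names are hypothetical, and the `masked_fill` variant is an assumption about one way to avoid the operator, not necessarily the exact change made in this PR.

```python
import torch


def apply_min_p_index_put(logits: torch.Tensor, min_p: torch.Tensor) -> torch.Tensor:
    """Baseline sketch: boolean-mask assignment (goes through index-put)."""
    probs = torch.softmax(logits, dim=-1)
    # Highest probability in each row, kept as a column for broadcasting.
    max_probs = probs.amax(dim=-1, keepdim=True)
    # Per-request threshold: min_p scaled by the row's top probability.
    threshold = min_p.unsqueeze(1) * max_probs
    mask = probs < threshold
    out = logits.clone()  # avoid mutating the caller's tensor in this sketch
    out[mask] = float("-inf")
    return out


def apply_min_p_masked_fill(logits: torch.Tensor, min_p: torch.Tensor) -> torch.Tensor:
    """Alternative sketch: same filtering rule, expressed with masked_fill."""
    probs = torch.softmax(logits, dim=-1)
    max_probs = probs.amax(dim=-1, keepdim=True)
    threshold = min_p.unsqueeze(1) * max_probs
    # masked_fill writes -inf wherever the mask is True, with no index gather.
    return logits.masked_fill(probs < threshold, float("-inf"))


if __name__ == "__main__":
    logits = torch.tensor([[2.0, 1.0, 0.0, -3.0], [0.0, 0.0, 0.0, 0.0]])
    min_p = torch.tensor([0.2, 0.0])  # second request has min_p disabled
    print(apply_min_p_index_put(logits, min_p))
    print(apply_min_p_masked_fill(logits, min_p))
```

Both functions return identical tensors, so the choice between them is purely about which operator the backend ends up executing.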


### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Profiling data was collected with min_p enabled.

The profiling shows that performance is greatly improved.


- vLLM version: v0.11.2

---------

Signed-off-by: funanyang <985619145@qq.com>
2025-12-02 20:35:51 +08:00