### What this PR does / why we need it?
When the min_p post-processing parameter is enabled, the original vLLM
implementation invokes the aclnnIndexPutImpl operator, which performs
poorly on NPU. This PR changes the min_p handling to avoid that operator.
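
For reviewers unfamiliar with where the operator comes from, here is a minimal sketch, not the actual diff. It assumes the min_p step masks logits with a boolean-mask assignment, which PyTorch generally dispatches as an `index_put_` operation, and it shows `masked_fill_` as one possible way to get the same result without that dispatch. The function names `apply_min_p_index_put` and `apply_min_p_masked_fill` are hypothetical and do not come from vLLM.

```python
import torch


def _min_p_mask(logits: torch.Tensor, min_p: torch.Tensor) -> torch.Tensor:
    """Return True where a token's probability is below min_p * max probability."""
    probs = torch.softmax(logits, dim=-1)
    max_probs = probs.amax(dim=-1, keepdim=True)
    # min_p has shape [batch]; broadcast it against [batch, 1].
    threshold = max_probs * min_p.unsqueeze(1)
    return probs < threshold


def apply_min_p_index_put(logits: torch.Tensor, min_p: torch.Tensor) -> torch.Tensor:
    """Assumed original pattern: boolean-mask assignment (an index_put_ at the op level)."""
    mask = _min_p_mask(logits, min_p)
    logits[mask] = float("-inf")
    return logits


def apply_min_p_masked_fill(logits: torch.Tensor, min_p: torch.Tensor) -> torch.Tensor:
    """Hypothetical alternative: the same masking expressed with masked_fill_."""
    mask = _min_p_mask(logits, min_p)
    return logits.masked_fill_(mask, float("-inf"))


if __name__ == "__main__":
    logits = torch.tensor([[2.0, 1.0, 0.0, -3.0]])
    min_p = torch.tensor([0.2])
    print(apply_min_p_index_put(logits.clone(), min_p))
    print(apply_min_p_masked_fill(logits.clone(), min_p))
```

Both functions modify `logits` in place and return the same values, so the change in this sketch is limited to which operator performs the masking.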
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Profiling data was collected with min_p enabled. The profile shows that
performance is greatly improved.
- vLLM version: v0.11.2
---------
Signed-off-by: funanyang <985619145@qq.com>