Cherry-picked from https://github.com/vllm-project/vllm-ascend/pull/7740
Fixes padding problems in the FIA op under maximum concurrency.
- vLLM version: v0.18.0
- vLLM main: 35141a7eed
Signed-off-by: Wangbingjie <wangbj1207@126.com>