xc-llm-ascend/vllm_ascend
Zhu Yi Lin 4a849df6fa [main] support cpu binding (#3546)
### What this PR does / why we need it?

Currently, in aclgraph piecewise mode, attention runs in eager mode,
which causes abnormal allreduce latency for the O matrix. The cause is
that CPU resources are preempted in eager mode. This PR therefore
temporarily adds CPU binding to vllm-ascend.
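
For illustration only, the general idea of CPU binding can be sketched as below. This is not the code added by this PR: the helper names `split_cpus` and `bind_current_process` are hypothetical, and the even, contiguous partitioning is an assumption. The sketch relies on Python's standard `os.sched_setaffinity`, which is available on Linux only.

```python
import os


def split_cpus(cpus, num_workers):
    """Partition CPU ids into contiguous, non-overlapping slices, one per worker.

    Hypothetical helper: CPUs left over after an even split stay unassigned.
    """
    cpus = sorted(cpus)
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    per_worker = len(cpus) // num_workers
    if per_worker == 0:
        raise ValueError("fewer CPUs than workers")
    return [cpus[i * per_worker:(i + 1) * per_worker] for i in range(num_workers)]


def bind_current_process(cpus):
    """Pin the calling process to the given CPU ids.

    Returns False on platforms without os.sched_setaffinity (non-Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cpus))  # pid 0 means the calling process
    return True
```

Under these assumptions, each worker process would call `bind_current_process(split_cpus(allowed, world_size)[rank])` at startup, where `allowed`, `world_size`, and `rank` are placeholders for the permitted CPU set, the number of workers, and the worker index. Keeping the slices disjoint is what stops one worker's host threads from preempting another's.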

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

CI passed with new and existing tests.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: GDzhu1 <809721801@qq.com>
2025-10-21 09:17:03 +08:00