xc-llm-ascend

Files

liziyu f3ea657e93 [0.11.0][Bugfix] fix delay free prefill req & D node support prefix cache (#3609)

### What this PR does / why we need it?
Fix the mooncake connector. When the prefill and decode TP sizes differ
and the prefill TP size is less than the number of key-value heads,
`_get_remote_tp_ranks_for_req` returns a list of np.arrays. A membership
test such as `int in list` on a list of np.arrays raises an error.
Converting the list of np.arrays into a single np.array resolves this
issue.
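
The failure mode described above can be reproduced in isolation. The sketch below is illustrative only: the rank values, the variable names, and the use of `np.concatenate` to build the single array are assumptions, not the connector's actual code.

```python
import numpy as np

# Hypothetical stand-in for the value returned by _get_remote_tp_ranks_for_req
# when the prefill TP size is smaller than the number of KV heads.
remote_tp_ranks = [np.array([0, 1]), np.array([2, 3])]
local_rank = 2  # hypothetical rank to look up


def membership_raises(rank, ranks):
    """Return True if `rank in ranks` raises ValueError."""
    try:
        rank in ranks
    except ValueError:
        # `rank == array` yields a boolean array, whose truth value is
        # ambiguous when it has more than one element.
        return True
    return False


# Membership on the list of arrays fails.
list_check_raises = membership_raises(local_rank, remote_tp_ranks)

# Assumption: merge the per-head arrays into one flat array first.
flat_ranks = np.concatenate(remote_tp_ranks)
rank_found = bool(local_rank in flat_ranks)
```

With a single array, `in` reduces to an element-wise comparison followed by `any()`, so the lookup is well defined.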

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
Model: qwen235B, with the following TP configurations:
- P tp16, D tp1
- P tp8, D tp1
- P tp4, D tp1
- P tp8, D tp2


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: liziyu <liziyu16@huawei.com>

2025-10-23 20:39:35 +08:00

cpu_offload_manager

[Feature] cpu offload connector (#1659)

2025-09-23 14:25:05 +08:00

device_communicators

[MISC] Clean up torch_npu (#688)

2025-04-29 18:03:38 +08:00

mooncake

Mooncake store use adxl interface (#3350)

2025-10-21 20:18:17 +08:00

__init__.py

[bugfix] fix connector register failed (#3335)