xc-llm-ascend/vllm_ascend/distributed/kv_transfer
zxr2333 ab928ed586 [v0.18.0][P/D][Feature]Layerwise connector supports Mamba prefill prefix caching (#7796)
### What this PR does / why we need it?
The Mooncake layerwise connector now supports Mamba prefix caching on
prefiller nodes.

### Does this PR introduce _any_ user-facing change?
Yes. Pass `--enable-prefix-caching` and `--mamba-cache-mode align` to
enable Mamba align-mode prefix caching on P/D prefill nodes. This
feature is not yet supported on decode nodes.
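
As a rough illustration of the two flags, the sketch below assembles a prefill-node launch command and prints it rather than running it. It assumes the node is started with `vllm serve`; the `MODEL` value is a hypothetical placeholder, and the KV-transfer/connector settings that a real P/D deployment also needs are deliberately left out because they are not described here.

```shell
# Hypothetical placeholder: substitute the model served on the prefill node.
MODEL="<model-name-or-path>"

# The two flags named in this PR description. Connector-specific
# P/D settings are omitted (assumption: they are configured separately).
PREFILL_FLAGS="--enable-prefix-caching --mamba-cache-mode align"

# Build and print the command instead of executing it.
CMD="vllm serve $MODEL $PREFILL_FLAGS"
echo "$CMD"
```

Per the note above, these flags would be applied only to prefill nodes; decode nodes would be launched without them.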

### How was this patch tested?
Verified with the P/D E2E test.

---------

Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
2026-03-31 09:25:22 +08:00