xc-llm-ascend

lidenghui1110 332b547728 [Bugfix] support mtp kv transfer and pp partition by hand in kv transfer (#4892)

### What this PR does / why we need it?
The current Mooncake connector has the following problems when PP and
MTP are enabled:
1. The KV caches of the MTP layers are not transferred, which may lower
the acceptance ratio. This PR adds the MTP layer indices for the last PP
stage after calculating `end_layer` in `transfer_kv_cache`.
2. When MTP is enabled, the default division of layers across PP stages
may leave the stages imbalanced, so the `VLLM_PP_LAYER_PARTITION`
environment variable is needed to balance them by hand. However, during
Mooncake connector KV transfer the decode node does not know the
partition used by the prefill node. This PR adds a `pp_layer_partition`
option to `kv_connector_extra_config` so that the decode node can obtain
the prefill node's partition information.
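
The two fixes above can be illustrated with a minimal sketch. This is not the connector's actual code: the helper names `parse_pp_layer_partition` and `stage_layer_indices` are hypothetical, the even-split fallback is a simplification, and the sketch assumes that MTP layers are indexed directly after the main decoder layers.

```python
from typing import List, Optional


def parse_pp_layer_partition(partition: Optional[str],
                             num_layers: int,
                             pp_size: int) -> List[int]:
    """Return the number of layers held by each PP stage.

    `partition` is a comma-separated string such as "33,28"
    (the same format as VLLM_PP_LAYER_PARTITION).
    """
    if partition is None:
        # Assumed fallback for illustration only: an even split with the
        # remainder given to the earliest stages.
        base, rem = divmod(num_layers, pp_size)
        return [base + (1 if i < rem else 0) for i in range(pp_size)]

    counts = [int(part) for part in partition.split(",")]
    if len(counts) != pp_size:
        raise ValueError(
            f"expected {pp_size} entries in partition, got {len(counts)}")
    if sum(counts) != num_layers:
        raise ValueError(
            f"partition sums to {sum(counts)}, expected {num_layers}")
    return counts


def stage_layer_indices(counts: List[int],
                        pp_rank: int,
                        num_mtp_layers: int = 0) -> List[int]:
    """List the layer indices whose KV cache a given PP stage owns."""
    start_layer = sum(counts[:pp_rank])
    end_layer = start_layer + counts[pp_rank]
    layers = list(range(start_layer, end_layer))
    if pp_rank == len(counts) - 1:
        # Assumption: MTP layers follow the main layers and live on the
        # last PP stage, so their indices are appended after end_layer.
        total = sum(counts)
        layers.extend(range(total, total + num_mtp_layers))
    return layers
```

With a 61-layer model split as `33,28` and one MTP layer, the last stage in this sketch would own layers 33 through 61, where index 61 stands for the MTP layer.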

### Does this PR introduce _any_ user-facing change?
When the prefill node uses the `VLLM_PP_LAYER_PARTITION` environment
variable, add `pp_layer_partition` to `kv_connector_extra_config` as
shown below:
```
export VLLM_PP_LAYER_PARTITION=33,28
"kv_connector_extra_config": {
    "use_ascend_direct": true,
    "prefill": {
        "dp_size": 1,
        "tp_size": 8,
        "pp_size": 2,
        "pp_layer_partition": "33,28"
    },
    "decode": {
        "dp_size": 16,
        "tp_size": 1,
        "pp_size": 1
    }
}
```
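
Because the same partition string appears both in the environment variable and in the connector config, one way to keep them in sync is to derive the config value from the environment. The sketch below is only an illustration: `build_kv_connector_extra_config` is a hypothetical helper, and the parallel sizes are copied from the example above rather than read from any real deployment.

```python
import json
import os
from typing import Mapping, Optional


def build_kv_connector_extra_config(
        env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the extra config, copying the partition from the environment."""
    env = os.environ if env is None else env
    # Sizes taken from the example in this PR description.
    prefill = {"dp_size": 1, "tp_size": 8, "pp_size": 2}

    partition = env.get("VLLM_PP_LAYER_PARTITION")
    if partition:
        # The partition must list one layer count per prefill PP stage.
        if len(partition.split(",")) != prefill["pp_size"]:
            raise ValueError(
                "VLLM_PP_LAYER_PARTITION does not match prefill pp_size")
        prefill["pp_layer_partition"] = partition

    return {
        "use_ascend_direct": True,
        "prefill": prefill,
        "decode": {"dp_size": 16, "tp_size": 1, "pp_size": 1},
    }


if __name__ == "__main__":
    config = build_kv_connector_extra_config(
        {"VLLM_PP_LAYER_PARTITION": "33,28"})
    print(json.dumps({"kv_connector_extra_config": config}, indent=4))
```

When the environment variable is unset, the sketch omits `pp_layer_partition` entirely.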

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: lidenghui <lidenghui1110@gmail.com>

2025-12-11 17:23:21 +08:00

| File | Last commit | Date |
| --- | --- | --- |
| test_mooncake_connector.py | [Bugfix] support mtp kv transfer and pp partition by hand in kv transfer (#4892) | 2025-12-11 17:23:21 +08:00 |
| test_mooncake_layerwise_connector.py | Remove useless env (#4858) | 2025-12-11 06:51:07 +08:00 |
| test_remote_decode_lifecycle.py | [Quickfix] update CachedRequestState as NewRequestData changed (#2367) | 2025-08-15 07:35:27 +08:00 |
| test_remote_prefill_lifecycle.py | Fix some ci issue and refactor modelrunner (#2445) | 2025-08-20 09:01:04 +08:00 |
| utils.py | [P/D][main]Offline the llmdatadist connector related parts of the code and files. (#4780) | 2025-12-09 22:36:43 +08:00 |