[feature]dcp&pcp support mlapo (#5672)
### What this PR does / why we need it?
MLAPO gives a large decode performance improvement for DeepSeek models. This PR
adds MLAPO support for PCP and DCP.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main: 2f4e6548ef
---------
Signed-off-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
@@ -16,8 +16,8 @@ To learn more about the theory and implementation details of context parallel, p
Currently context parallel can be used together with most other features, supported features are as follows:
| | Eager | Graph | Prefix <br> Cache | Chunked <br> Prefill | SpecDecode <br> (MTP) | PD <br> disaggregation | MLAPO |
| ------- | ----- | ----- | ------ | ------ | ----- | ----- | ----- |
-| **PCP** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
-| **DCP** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| **PCP** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅|
+| **DCP** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

## How to use Context Parallel
You can enable `PCP` and `DCP` by `prefill_context_parallel_size` and `decode_context_parallel_size`, refer to the following example:
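
A minimal sketch of how the two sizes might be passed on the command line. It assumes the `vllm serve` flags are the dashed spellings of the parameter names above (`--prefill-context-parallel-size` and `--decode-context-parallel-size`). The helper `build_serve_command` and the model path are hypothetical and exist only for illustration; this is not the example from the documentation itself.

```python
import shlex


def build_serve_command(model, pcp_size=1, dcp_size=1):
    """Build an argv list for `vllm serve` with PCP and DCP sizes.

    Hypothetical helper: the flag names are assumed to be the dashed
    CLI forms of `prefill_context_parallel_size` and
    `decode_context_parallel_size`.
    """
    for name, value in (("pcp_size", pcp_size), ("dcp_size", dcp_size)):
        # Reject non-integers and sizes below 1 before building the command.
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return [
        "vllm", "serve", model,
        "--prefill-context-parallel-size", str(pcp_size),
        "--decode-context-parallel-size", str(dcp_size),
    ]


# "path/to/model" is a placeholder, not a real model identifier.
cmd = build_serve_command("path/to/model", pcp_size=2, dcp_size=2)
print(shlex.join(cmd))
```

Building the command as a list keeps each flag next to its value and avoids shell quoting issues when the list is passed to `subprocess.run`.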