[feature] support pcp + mtp in full graph (#4572)
1. support pcp + mtp in full graph
2. pcp/dcp related mtp bugfix
3. support pcp + mtpx
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: zhangsicheng5 <zhangsicheng5@huawei.com>
@@ -75,7 +75,7 @@ class BlockTable:
         logical_table_size = max_num_blocks_per_req

         duplicate_size = 1
-        if self.pcp_world_size > 1:
+        if self.pcp_world_size * self.dcp_world_size > 1:
             duplicate_size += num_speculative_tokens
         self.block_table = self._make_buffer(max_num_reqs * duplicate_size,
                                              logical_table_size,
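The effect of the changed condition on the block table's row count can be sketched in isolation. This is a minimal illustration, not the code from this commit: `block_table_rows` and its `use_dcp_fix` flag are hypothetical names introduced only for the example, and the real `BlockTable` allocates its buffer through `self._make_buffer`.

```python
def block_table_rows(max_num_reqs: int,
                     num_speculative_tokens: int,
                     pcp_world_size: int = 1,
                     dcp_world_size: int = 1,
                     use_dcp_fix: bool = True) -> int:
    """Return the number of block-table rows to allocate.

    Hypothetical helper mirroring the sizing logic in the hunk above.
    With use_dcp_fix=False it reproduces the old condition, which only
    looked at pcp_world_size.
    """
    duplicate_size = 1
    if use_dcp_fix:
        # New condition: duplicate rows when either pcp or dcp is enabled.
        needs_duplicates = pcp_world_size * dcp_world_size > 1
    else:
        # Old condition: duplicate rows only when pcp is enabled.
        needs_duplicates = pcp_world_size > 1
    if needs_duplicates:
        # One extra row per speculative token for each request.
        duplicate_size += num_speculative_tokens
    return max_num_reqs * duplicate_size


# dcp-only setup: the old check allocates no extra rows, the new one does.
old_rows = block_table_rows(4, 2, pcp_world_size=1, dcp_world_size=2,
                            use_dcp_fix=False)
new_rows = block_table_rows(4, 2, pcp_world_size=1, dcp_world_size=2)
print(old_rows, new_rows)
```

In this sketch the two conditions differ only when `pcp_world_size` is 1 and `dcp_world_size` is greater than 1, which appears to be the case the "pcp/dcp related mtp bugfix" item refers to.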