[Fix] Resolve data-parallel (DP) assertion errors in TorchAir (#2626)
### What this PR does / why we need it?
It is confirmed that `num_input_tokens` must be assigned the value of
`maybe_padded_num_tokens` under all circumstances, not only when ACL graph
mode is enabled; otherwise DP ranks can disagree on the padded token count
and trip the DP assertion.
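A minimal sketch of why the unconditional assignment matters. The helper name `pad_num_tokens_across_dp` is hypothetical and this is a pure-Python simplification of `_sync_metadata_across_dp`: collective ops require every DP rank to launch with the same token count, so each rank must adopt the DP-wide padded maximum.

```python
def pad_num_tokens_across_dp(local_num_tokens: list[int]) -> list[int]:
    """Hypothetical sketch: every rank adopts the DP-wide maximum token count."""
    # In the real runner this max is synced across ranks; here we simulate
    # all ranks' local counts in one list.
    maybe_padded_num_tokens = max(local_num_tokens)
    # The fix in this PR: num_input_tokens takes maybe_padded_num_tokens
    # unconditionally, not only when ACL graph mode is enabled.
    return [maybe_padded_num_tokens for _ in local_num_tokens]

padded = pad_num_tokens_across_dp([7, 3, 5, 1])
assert len(set(padded)) == 1  # all ranks agree, so no DP assertion error
```

If one rank skipped the padding (as could happen before this fix when `use_aclgraph` was false), the counts would diverge and the collective would assert.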
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Waiting for the daily TorchAir test run.
- vLLM version: v0.10.1.1
- vLLM main: 006477e60b
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
```diff
@@ -100,7 +100,7 @@ class NPUTorchairModelRunner(NPUModelRunner):
             num_tokens_across_dp = torch.full((self.dp_size, ),
                                               maybe_padded_num_tokens,
                                               dtype=torch.int32,
-                                              device="cpu")
+                                              device="npu")
         else:
             maybe_padded_num_tokens = num_tokens

```
```diff
@@ -1095,9 +1095,9 @@ class NPUModelRunner(LoRAModelRunnerMixin):
         enable_dbo) = self._sync_metadata_across_dp(num_input_tokens,
                                                     with_prefill, enable_dbo)

-        if self.use_aclgraph:
-            # When using TorchAir with DP, we have other plans for padding
-            num_input_tokens = maybe_padded_num_tokens
+        # TODO: Now that num_input_tokens is basically identical with maybe_padded_num_tokens
+        # We should consider removing maybe_padded_num_tokens later
+        num_input_tokens = maybe_padded_num_tokens

         # Hot-Swap lora model
         if self.lora_config:
```