[Fix] Resolve data-parallel (DP) assertion errors in TorchAir (#2626)

### What this PR does / why we need it?
It is confirmed that `num_input_tokens` must be assigned the value of
`maybe_padded_num_tokens` under all circumstances, not only when ACL graph
mode is enabled; otherwise DP ranks can disagree on the padded token count
and trip the assertion.

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
Waiting for the daily TorchAir test run.
- vLLM version: v0.10.1.1
- vLLM main: 006477e60b

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Author: yiz-liu
Date: 2025-08-29 16:06:49 +08:00 (committed by GitHub)
Commit: aadc75c247 (parent 600b08f754)
2 changed files with 4 additions and 4 deletions


```diff
@@ -100,7 +100,7 @@ class NPUTorchairModelRunner(NPUModelRunner):
             num_tokens_across_dp = torch.full((self.dp_size, ),
                                               maybe_padded_num_tokens,
                                               dtype=torch.int32,
-                                              device="cpu")
+                                              device="npu")
         else:
             maybe_padded_num_tokens = num_tokens
```
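In this branch every rank fills `num_tokens_across_dp` with the same `maybe_padded_num_tokens` value, so all ranks agree on the shapes seen by collective ops. A minimal plain-Python sketch of that padding scheme (the function and variable names here are illustrative, not the actual vllm-ascend helpers):

```python
def sync_tokens_across_dp(per_rank_num_tokens):
    """All DP ranks pad to the maximum token count in the group."""
    maybe_padded_num_tokens = max(per_rank_num_tokens)
    # Every entry is the same padded value, mirroring
    # torch.full((dp_size,), maybe_padded_num_tokens, ...)
    num_tokens_across_dp = [maybe_padded_num_tokens] * len(per_rank_num_tokens)
    return maybe_padded_num_tokens, num_tokens_across_dp

padded, across_dp = sync_tokens_across_dp([7, 12, 9, 12])
print(padded)     # 12
print(across_dp)  # [12, 12, 12, 12]
```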


```diff
@@ -1095,9 +1095,9 @@ class NPUModelRunner(LoRAModelRunnerMixin):
          enable_dbo) = self._sync_metadata_across_dp(num_input_tokens,
                                                      with_prefill, enable_dbo)
-        if self.use_aclgraph:
-            # When using TorchAir with DP, we have other plans for padding
-            num_input_tokens = maybe_padded_num_tokens
+        # TODO: now that num_input_tokens is essentially identical to
+        # maybe_padded_num_tokens, consider removing maybe_padded_num_tokens later
+        num_input_tokens = maybe_padded_num_tokens
         # Hot-Swap lora model
         if self.lora_config:
```
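The key change above is dropping the `if self.use_aclgraph:` guard so the padded value is adopted on every code path. A toy sketch (not the real `NPUModelRunner` code) of why a conditional assignment can trigger the cross-DP assertion:

```python
def run_dp_step(per_rank_tokens, pad_all_ranks):
    """Toy model of the DP sync: every rank must use the same padded count."""
    padded = max(per_rank_tokens)
    # Before the fix, only some paths adopted the padded value;
    # pad_all_ranks=False mimics a rank keeping its raw token count.
    chosen = [padded if pad_all_ranks else t for t in per_rank_tokens]
    # The runner effectively asserts that each rank's count matches
    # the value broadcast in num_tokens_across_dp.
    return all(c == padded for c in chosen)

print(run_dp_step([7, 12, 9], pad_all_ranks=False))  # False -> assertion fires
print(run_dp_step([7, 12, 9], pad_all_ranks=True))   # True  -> ranks consistent
```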