[ModelRunner] remove unused args (follow vllm changes) (#159)

### What this PR does / why we need it?
The arg list of `Attention.forward()` is changed by
https://github.com/vllm-project/vllm/pull/13555.
The unused args `kv_caches` and `attn_metadata` are removed.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing test.

Signed-off-by: MengqingCao <cmq0113@163.com>
This commit is contained in:
Mengqing Cao
2025-02-25 17:51:09 +08:00
committed by GitHub
parent 51ae37b22a
commit 79fbb20b4d

View File

@@ -1142,8 +1142,6 @@ class NPUModelRunner(NPUModelRunnerBase[ModelInputForNPUWithSamplingMetadata]):
hidden_or_intermediate_states = model_executable(
input_ids=model_input.input_tokens,
positions=model_input.input_positions,
kv_caches=kv_caches,
attn_metadata=model_input.attn_metadata,
intermediate_tensors=intermediate_tensors,
**MultiModalKwargs.as_kwargs(multi_modal_kwargs,
device=self.device),