[ModelRunner] remove unused args (follow vllm changes) (#159)
### What this PR does / why we need it?
The argument list of `Attention.forward()` was changed by https://github.com/vllm-project/vllm/pull/13555. This PR removes the now-unused `kv_caches` and `attn_metadata` arguments from the `model_executable(...)` call in `NPUModelRunner`.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with the existing tests.

Signed-off-by: MengqingCao <cmq0113@163.com>
@@ -1142,8 +1142,6 @@ class NPUModelRunner(NPUModelRunnerBase[ModelInputForNPUWithSamplingMetadata]):
 hidden_or_intermediate_states = model_executable(
     input_ids=model_input.input_tokens,
     positions=model_input.input_positions,
-    kv_caches=kv_caches,
-    attn_metadata=model_input.attn_metadata,
     intermediate_tensors=intermediate_tensors,
     **MultiModalKwargs.as_kwargs(multi_modal_kwargs,
                                  device=self.device),
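
For reviewers who want to see why the call site has to follow the upstream signature change, here is a minimal, self-contained sketch. It does not use vLLM: `forward_after_change` is a hypothetical stand-in for a model `forward()` that no longer declares `kv_caches` or `attn_metadata`, and it assumes the callee does not swallow extra keywords via `**kwargs`. Under that assumption, a caller that still passes the removed keywords fails with a `TypeError`, while the trimmed call succeeds.

```python
def forward_after_change(input_ids, positions, intermediate_tensors=None):
    """Hypothetical stand-in for a forward() that has dropped
    the `kv_caches` and `attn_metadata` parameters."""
    return {"input_ids": input_ids, "positions": positions}


def call_with_stale_args():
    """Call the stand-in the old way; return the error message, if any."""
    try:
        forward_after_change(
            input_ids=[1, 2],
            positions=[0, 1],
            kv_caches=[],          # no longer declared by the callee
            attn_metadata=None,    # no longer declared by the callee
            intermediate_tensors=None,
        )
    except TypeError as exc:
        return str(exc)
    return None


def call_with_updated_args():
    """Call the stand-in with the trimmed argument list."""
    return forward_after_change(
        input_ids=[1, 2],
        positions=[0, 1],
        intermediate_tensors=None,
    )


if __name__ == "__main__":
    print("stale call error:", call_with_stale_args())
    print("updated call result:", call_with_updated_args())
```

Whether the real model classes accept or reject extra keywords depends on their own `forward()` signatures, which this sketch does not model.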