[Worker] Implement update max_model_len interface for NPUWorker (#6193)

### What this PR does / why we need it?
This patch adds the `update_max_model_len` interface to `NPUWorker`.
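
As a rough illustration of what such an interface could look like (a sketch only, not the code in this PR): the worker accepts a new maximum length and propagates it to every component that caches the value. `ModelConfig`, `ModelRunner`, and `SketchWorker` below are hypothetical stand-ins, and the validation step is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    # Hypothetical stand-in for the engine's model configuration.
    max_model_len: int


@dataclass
class ModelRunner:
    # Hypothetical stand-in for a runner that caches its own copy of the limit.
    max_model_len: int


class SketchWorker:
    """Illustrative worker; not the real NPUWorker."""

    def __init__(self, model_config: ModelConfig,
                 model_runner: Optional[ModelRunner] = None) -> None:
        self.model_config = model_config
        self.model_runner = model_runner

    def update_max_model_len(self, max_model_len: int) -> None:
        # Assumed validation: reject non-positive lengths.
        if max_model_len <= 0:
            raise ValueError("max_model_len must be positive")
        # Update the shared config first, then any cached copy in the runner.
        self.model_config.max_model_len = max_model_len
        if self.model_runner is not None:
            self.model_runner.max_model_len = max_model_len


config = ModelConfig(max_model_len=32768)
runner = ModelRunner(max_model_len=32768)
worker = SketchWorker(config, runner)
worker.update_max_model_len(8192)
```

Keeping the config and the runner in sync in one call avoids a state where the two disagree about the limit.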

- vLLM version: v0.14.0
- vLLM main: d68209402d

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Li Wang
2026-01-26 09:03:33 +08:00
committed by GitHub
parent ca297eb57f
commit 63adbedb7a
3 changed files with 114 additions and 0 deletions

@@ -92,6 +92,7 @@ jobs:
# We found that if running aclgraph tests in batch, it will cause AclmdlRICaptureBegin error. So we run
# the test separately.
# basic
pytest -sv --durations=0 tests/e2e/singlecard/test_auto_fit_max_mode_len.py
pytest -sv --durations=0 tests/e2e/singlecard/test_aclgraph_accuracy.py
pytest -sv --durations=0 tests/e2e/singlecard/test_aclgraph_mem.py
pytest -sv --durations=0 tests/e2e/singlecard/test_async_scheduling.py