xc-llm-ascend

linfeng-yuan 4af5b80606 [Scheduler] validate max_num_batched_tokens and max_model_len in AscendSchedulerConfig (#2434)

### What this PR does / why we need it?
Add a configuration check to the Ascend scheduler: if chunked_prefill
is disabled, max_num_batched_tokens must not be less than max_model_len,
following vLLM.
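
The rule above can be illustrated with a minimal sketch. This is not the actual `AscendSchedulerConfig` code: the class name `SketchSchedulerConfig`, the field `chunked_prefill_enabled`, and the error message are hypothetical and only show the shape of the validation, assuming the check runs when the config is constructed.

```python
from dataclasses import dataclass


@dataclass
class SketchSchedulerConfig:
    """Hypothetical stand-in for a scheduler config (not the real class)."""

    max_num_batched_tokens: int
    max_model_len: int
    # Hypothetical flag name; the real option may be spelled differently.
    chunked_prefill_enabled: bool = False

    def __post_init__(self) -> None:
        # Assumption: without chunked prefill a prompt cannot be split across
        # scheduling steps, so the per-step token budget has to cover the
        # longest sequence the model is configured to accept.
        if (not self.chunked_prefill_enabled
                and self.max_num_batched_tokens < self.max_model_len):
            raise ValueError(
                f"max_num_batched_tokens ({self.max_num_batched_tokens}) "
                f"must not be less than max_model_len ({self.max_model_len}) "
                "when chunked prefill is disabled."
            )


def is_accepted(max_num_batched_tokens: int, max_model_len: int,
                chunked_prefill_enabled: bool = False) -> bool:
    """Return True if the sketch config accepts the given combination."""
    try:
        SketchSchedulerConfig(max_num_batched_tokens, max_model_len,
                              chunked_prefill_enabled)
    except ValueError:
        return False
    return True
```

In this sketch, `is_accepted(2048, 4096)` is rejected, while the same values pass once `chunked_prefill_enabled=True`, because the check only applies when chunked prefill is off.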

### Does this PR introduce _any_ user-facing change?
Users cannot set max_num_batched_tokens smaller than max_model_len with
the Ascend scheduler when chunked prefill is disabled.

### How was this patch tested?
CI and vLLM serving passed.

- vLLM version: v0.10.0
- vLLM main: f77a0802b7

Signed-off-by: linfeng-yuan <1102311262@qq.com>

2025-08-23 19:39:44 +08:00

test_schedule_config.py

[Scheduler] validate max_num_batched_tokens and max_model_len in AscendSchedulerConfig (#2434)

2025-08-23 19:39:44 +08:00

test_scheduler.py

[CI] fix ci (#2464)

2025-08-22 07:30:48 +08:00