xc-llm-ascend

Files

Li Wang b5f7a83927 [Doc] Upgrade multi-node doc (#4365)

### What this PR does / why we need it?
When the Ascend scheduler is enabled without chunked prefill,
`max_num_batched_tokens` must not be smaller than `max_model_len`;
otherwise, vLLM fails with the following error:
```shell
Value error, Ascend scheduler is enabled without chunked prefill feature. Argument max_num_batched_tokens (4096) is smaller than max_model_len (32768). This effectively limits the maximum sequence length to max_num_batched_tokens and makes vLLM reject longer sequences. Please increase max_num_batched_tokens or decrease max_model_len. [type=value_error, input_value=ArgsKwargs((), {'model_co...g': {'enabled': True}}}), input_type=ArgsKwargs]
```
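
The constraint behind this error can be illustrated with a minimal sketch. The helper name `check_batched_token_budget` and its `chunked_prefill` argument are hypothetical and are not part of vLLM or vllm-ascend; the function only mirrors the comparison described in the error message above.

```python
def check_batched_token_budget(
    max_num_batched_tokens: int,
    max_model_len: int,
    chunked_prefill: bool = False,
) -> None:
    """Hypothetical pre-flight check, not vLLM's actual validation code.

    Without chunked prefill, a whole prompt must fit into one batch,
    so the batch token budget has to cover the maximum sequence length.
    """
    if not chunked_prefill and max_num_batched_tokens < max_model_len:
        raise ValueError(
            f"max_num_batched_tokens ({max_num_batched_tokens}) is smaller "
            f"than max_model_len ({max_model_len}); increase the former or "
            f"decrease the latter."
        )


# Values taken from the error message above: this combination is rejected.
try:
    check_batched_token_budget(4096, 32768)
except ValueError as err:
    print(f"rejected: {err}")

# Raising the batch budget to the model length satisfies the constraint.
check_batched_token_budget(32768, 32768)
```

When launching vLLM from the command line, these two values are set with `--max-num-batched-tokens` and `--max-model-len`.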

### Does this PR introduce _any_ user-facing change?
Users and developers who run the model by following the
[tutorial](https://docs.vllm.ai/projects/ascend/en/latest/tutorials/multi_node.html)
can now specify the parameters correctly.

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main: 2918c1b49c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2025-11-24 10:57:50 +08:00

ISSUE_TEMPLATE

[Doc] Release note for v0.11.0rc0 (#3224)

2025-09-30 03:26:18 +08:00

workflows

[Doc] Upgrade multi-node doc (#4365)

2025-11-24 10:57:50 +08:00

actionlint.yaml

[1/N][CI] Add multi node test (#3359)

2025-10-11 14:50:46 +08:00

dependabot.yml

[CI] Add dependabot support and labeler workflow (#162)