xc-llm-ascend

LookAround0301 d25a2c20c5 [Bugfix] Fix chunk prefill bug for long_sequence feature (#5444)

### What this PR does / why we need it?
Fix a chunked prefill bug in the long_sequence feature.

When two requests run with chunked prefill enabled in the long-sequence
scenario and one of them has only 1 token in a scheduling step, that
request is misidentified as a decode request, which triggers an error.
This PR fixes the issue.
Closes: https://github.com/vllm-project/vllm-ascend/issues/5445
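
The misclassification described above can be illustrated with a minimal sketch. This is not the code from this PR: the `ScheduledRequest` class, its fields, and both helper functions are hypothetical names chosen for illustration, and the progress-based check is only one plausible way to tell the two cases apart.

```python
from dataclasses import dataclass


@dataclass
class ScheduledRequest:
    # Hypothetical fields, not the actual vllm-ascend data structure.
    prompt_len: int            # total number of prompt tokens
    num_computed_tokens: int   # tokens already processed in earlier steps
    num_scheduled_tokens: int  # tokens scheduled in the current step


def is_decode_by_token_count(req: ScheduledRequest) -> bool:
    # Fragile heuristic: treats any request with a single scheduled token
    # as decode, so a 1-token prefill chunk is misclassified.
    return req.num_scheduled_tokens == 1


def is_decode_by_progress(req: ScheduledRequest) -> bool:
    # Assumed alternative: a request is decoding only once the whole
    # prompt has been computed.
    return req.num_computed_tokens >= req.prompt_len


# A prefill request whose final chunk contains a single prompt token.
tail_chunk = ScheduledRequest(
    prompt_len=1025, num_computed_tokens=1024, num_scheduled_tokens=1
)
# A genuine decode request: the prompt is fully computed.
decode_step = ScheduledRequest(
    prompt_len=8, num_computed_tokens=8, num_scheduled_tokens=1
)
```

Under these assumptions, `tail_chunk` is flagged as decode by the token-count heuristic but not by the progress-based check, while `decode_step` is flagged as decode by both.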

- vLLM version: release/v0.13.0
- vLLM main: 81786c8774
---------
Signed-off-by: LookAround <lixushi@huawei.com>

2026-01-05 09:16:36 +08:00

test_accuracy.py

[CI] refect e2e ci test (#5246)

2025-12-23 18:42:35 +08:00

test_basic.py

[E2E] Optimize the E2E test time. (#5294)

2025-12-26 14:17:50 +08:00

test_chunked_prefill.py

[Bugfix] Fix chunk prefill bug for long_sequence feature (#5444)

2026-01-05 09:16:36 +08:00

test_mtp.py

[feature] support pcp + mtp in full graph (#4572)

2025-12-22 16:13:39 +08:00