Files
xc-llm-ascend/tests/e2e/nightly
starmountain1997 bc1622338c [CI] Add long and short prompt tests for DeepSeek-V3.2 (#6536)
### What this PR does / why we need it?

This version has no divisibility constraint between tp and mtp+1.
However, cudagraph_capture_sizes must be a common multiple of tp and
mtp+1, with a maximum of tp * (mtp+1). Therefore, we fixed
cudagraph_capture_sizes.

We added a long-sequence test (64k input, 3k output) for the two-node
mixed deployment scenario. Due to the excessive time required for
performance benchmarking, we are only verifying functionality. The
single-node scenario is skipped because VRAM limitations prevent
launching the model with a max-model-len of 68,000.

and we also add aime2025 test for dual-node deepseek 3.2 nightly test.

### How was this patch tested?

test at nightly environment.

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

Signed-off-by: guozr <guozr1997@hotmail.com>
Co-authored-by: guozr <guozr1997@hotmail.com>
2026-02-26 10:58:50 +08:00
..