xc-llm-ascend/tests/e2e/nightly/single_node/models/configs/DeepSeek-R1-W8A8-HBM.yaml
zhangxinyuehfad 808d00406f [v0.18.0][CI]Add rank0 process count check for DeepSeek-R1-W8A8-HBM test (#8072)
### What this PR does / why we need it?
Adds a `check_rank0_process_count` validation step to the
DeepSeek-R1-W8A8-HBM nightly single-node test.

The check verifies that after the server starts, there is **exactly 1**
`vllm serve` process running on rank0. This guards against the
regression fixed in #8041 (extra NPU context leaking on device 0),
ensuring it does not silently reappear in future releases.

#### Changes

- **`tests/e2e/nightly/single_node/models/scripts/test_single_node.py`**:
  Add `run_check_rank0_process_count` async handler. It calls
  `npu-smi info` for diagnostics, then uses `psutil` to assert exactly one
  `vllm serve` process exists on rank0.
- **`tests/e2e/nightly/single_node/models/configs/DeepSeek-R1-W8A8-HBM.yaml`**:
  Register `check_rank0_process_count` in the `test_content` list for the
  HBM test case.
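
As a rough illustration of the counting step described above, the following is a minimal sketch and not the actual `run_check_rank0_process_count` handler. The helper names (`is_vllm_serve`, `count_vllm_serve`, `running_cmdlines`, `check_single_vllm_serve`) are hypothetical, the `npu-smi info` diagnostics call is omitted, and the sketch assumes that matching on the process command line is enough. How the real handler narrows the check to rank0 is not shown here.

```python
import os
from typing import Iterable, List, Sequence


def is_vllm_serve(cmdline: Sequence[str]) -> bool:
    """Return True if argv contains an executable named 'vllm' directly followed by 'serve'."""
    for prev, cur in zip(cmdline, cmdline[1:]):
        if os.path.basename(prev) == "vllm" and cur == "serve":
            return True
    return False


def count_vllm_serve(cmdlines: Iterable[Sequence[str]]) -> int:
    """Count how many of the given command lines look like `vllm serve`."""
    return sum(1 for cmdline in cmdlines if is_vllm_serve(cmdline))


def running_cmdlines() -> List[List[str]]:
    """Collect the command lines of all visible processes via psutil."""
    import psutil  # third-party dependency, imported lazily (assumed installed)

    result: List[List[str]] = []
    for proc in psutil.process_iter(["cmdline"]):
        # cmdline is None when access is denied, so fall back to an empty list
        result.append(proc.info["cmdline"] or [])
    return result


def check_single_vllm_serve(cmdlines: Iterable[Sequence[str]]) -> None:
    """Fail unless exactly one `vllm serve` process is present."""
    count = count_vllm_serve(cmdlines)
    assert count == 1, f"expected exactly 1 'vllm serve' process, found {count}"
```

On a live host the check would be called as `check_single_vllm_serve(running_cmdlines())`. Keeping the matching logic separate from process discovery lets it be exercised with plain lists, without a running server.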

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-04-15 17:16:27 +08:00

46 lines
1.2 KiB
YAML

# ==========================================
# ACTUAL TEST CASES
# ==========================================
test_cases:
  - name: "DeepSeek-R1-W8A8-HBM-single"
    model: "vllm-ascend/DeepSeek-R1-W8A8"
    envs:
      HCCL_BUFFSIZE: "1024"
      SERVER_PORT: "DEFAULT_PORT"
    server_cmd:
      - "--quantization"
      - "ascend"
      - "--port"
      - "$SERVER_PORT"
      - "--data-parallel-size"
      - "8"
      - "--data-parallel-size-local"
      - "8"
      - "--data-parallel-rpc-port"
      - "13389"
      - "--tensor-parallel-size"
      - "2"
      - "--enable-expert-parallel"
      - "--seed"
      - "1024"
      - "--max-num-seqs"
      - "32"
      - "--max-model-len"
      - "6000"
      - "--max-num-batched-tokens"
      - "6000"
      - "--trust-remote-code"
      - "--gpu-memory-utilization"
      - "0.92"
      - "--no-enable-prefix-caching"
      - "--reasoning-parser"
      - "deepseek_r1"
      - "--enforce-eager"
      - "--additional-config"
      - '{"ascend_scheduler_config": {"enabled": false}, "torchair_graph_config": {"enabled": false, "enable_multistream_shared_expert": false}}'
    test_content:
      - completion
      - check_rank0_process_count
    benchmarks: