xc-llm-ascend

Files

Li Wang 75fae619d5 [Misc] Refactor aclgraph accuracy test to use logprob-based comparison (#7455 )

### What this PR does / why we need it?

Replace text-match assertions with a two-tier logprob accuracy check:

- Prefill (token 0): assert token ID is identical between eager baseline
and compiled mode, then verify logprob matches within `atol`.
- Decode (tokens 1-2): if chosen tokens match, compare logprobs
directly; if they differ, cross-lookup the baseline token in the
compiled model's top-20 distribution and assert the assigned logprob is
within `decode_atol` (defaults to 2x atol). This tolerates minor argmax
drift caused by floating-point differences while still catching
distribution divergence.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.17.0
- vLLM main:
8a680463fa

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2026-03-23 09:08:21 +08:00

e2e

[Misc] Refactor aclgraph accuracy test to use logprob-based comparison (#7455 )

2026-03-23 09:08:21 +08:00

[Feature] Optimize Qwen3.5/Qwen3Next GDN prefill by prebuilding chunk metadata (#7487 )

2026-03-22 23:09:23 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500 )

2025-04-17 20:16:32 +08:00