Files
xc-llm-ascend/vllm_ascend/spec_decode
Chen Chen 848419d1ba [Bugfix] Disable the dispatch_ffn_combine kernel in MTP path (#4751)
### What this PR does / why we need it?

This PR fixes a smoke test failure. It adjusts `mtp_proposer` and
`model_runner_v1` to route MTP decoding through the non-fused MoE
implementation while keeping the overall inference flow unchanged.
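
The routing idea can be illustrated with a minimal sketch. This is not the code from the PR: `MoEImpl`, `select_moe_impl`, and both parameters are hypothetical names chosen for illustration, and the sketch only assumes that the fused `dispatch_ffn_combine` kernel must never be chosen when running in the MTP path.

```python
from enum import Enum


class MoEImpl(Enum):
    # Hypothetical labels; the real project may name these differently.
    FUSED_DISPATCH_FFN_COMBINE = "fused_dispatch_ffn_combine"
    NON_FUSED = "non_fused"


def select_moe_impl(fused_kernel_enabled: bool, is_mtp_path: bool) -> MoEImpl:
    """Pick the MoE implementation for one forward pass (illustrative only)."""
    # Assumption: the MTP (draft) path always falls back to the non-fused
    # implementation, regardless of how the main model is configured.
    if is_mtp_path:
        return MoEImpl.NON_FUSED
    # Outside the MTP path the existing choice is left unchanged.
    if fused_kernel_enabled:
        return MoEImpl.FUSED_DISPATCH_FFN_COMBINE
    return MoEImpl.NON_FUSED
```

Under these assumptions, only the MTP path changes behavior; the main model keeps whichever implementation it was already configured to use.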

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: mojave2 <chenchen145@huawei.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-09 22:14:05 +08:00