xc-llm-ascend


Angazenn 019a2fe6e6 [Eagle3]enhance skipping dp allreduce and add it into eagle proposer (#6192 )

### What this PR does / why we need it?
This PR:
1. Enhances the logic of `_skip_all_reduce_across_dp_group` to skip every
CPU DP allreduce for dense models. This change also supports item 2.
2. Adds `_skip_all_reduce_across_dp_group` to eagle_proposer. Models
like Qwen3-235b now support eagle3 spec decode. A typical setting
for these MoE models under PD disaggregation often introduces `dp_size > 1`,
which requires `set_forward_context` to call a CPU DP allreduce to
retrieve `num_tokens_across_dp` in all cases. Skipping this allreduce
greatly improves performance.
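
The skip decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ParallelSetup`, `get_num_tokens_across_dp`, and the `cpu_all_gather` callback are hypothetical names, and the rule "skip when `dp_size <= 1` or the model is dense" is an assumption drawn from the description rather than from the actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ParallelSetup:
    """Hypothetical stand-in for the runtime's parallel configuration."""
    dp_size: int
    is_moe_model: bool


def skip_all_reduce_across_dp_group(setup: ParallelSetup) -> bool:
    # Assumed rule: a single DP rank has nothing to synchronize, and a
    # dense model does not need the per-rank token counts.
    if setup.dp_size <= 1:
        return True
    return not setup.is_moe_model


def get_num_tokens_across_dp(
    num_tokens: int,
    setup: ParallelSetup,
    cpu_all_gather: Callable[[int], List[int]],
) -> Optional[List[int]]:
    # Return None when the CPU collective is skipped, so the caller
    # knows no cross-rank token counts were collected.
    if skip_all_reduce_across_dp_group(setup):
        return None
    # Otherwise pay for the CPU-side collective (injected for the sketch).
    return cpu_all_gather(num_tokens)
```

In this sketch a dense draft model with `dp_size=2` never invokes `cpu_all_gather`, while a MoE model with the same `dp_size` still does.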

- vLLM version: v0.14.0
- vLLM main: d68209402d

---------

Signed-off-by: Angazenn <supperccell@163.com>

2026-01-24 11:29:42 +08:00

e2e

[feature] add_rms_norm support bias (#5790 )

2026-01-23 21:09:54 +08:00

[Eagle3]enhance skipping dp allreduce and add it into eagle proposer (#6192 )

2026-01-24 11:29:42 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500 )

2025-04-17 20:16:32 +08:00