xc-llm-ascend

lbk-sys c611291661 【main】SP For Qwen3 MoE (#2209)

### What this PR does / why we need it?
This PR adds sequence parallelism (SP) support for Qwen3 MoE. In the
AlltoAll, AlltoAllv, and MC2 scenarios, AllReduce is replaced with
Reduce-Scatter followed by AllGather. This reduces the compute spent in
norm operations, since each rank can normalize only its own shard of the
sequence, and saves one AllGather communication. The feature is enabled
during the prefill phase (P-phase) and delivers notable gains in
long-sequence scenarios (e.g., 16k–25k), with performance improvements
of 5%–10%.
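
The substitution described above can be illustrated with a minimal single-process sketch. It is not the vllm-ascend implementation: ranks are simulated with plain Python lists, the helper names (`all_reduce`, `reduce_scatter`, `all_gather`, `rms_norm`) are hypothetical, the norm is a weightless RMSNorm, and the token count is assumed to divide evenly across ranks. It only shows that normalizing per-rank token shards after a Reduce-Scatter, then gathering, matches normalizing the full sequence after an AllReduce.

```python
import math

def all_reduce(partials):
    """Sum per-rank partial activations; every rank receives the full result."""
    n_tokens, hidden = len(partials[0]), len(partials[0][0])
    total = [[sum(p[t][h] for p in partials) for h in range(hidden)]
             for t in range(n_tokens)]
    return [total for _ in partials]

def reduce_scatter(partials):
    """Sum partials, but rank r keeps only its contiguous slice of tokens."""
    world = len(partials)
    total = all_reduce(partials)[0]
    chunk = len(total) // world  # assumption: tokens divide evenly across ranks
    return [total[r * chunk:(r + 1) * chunk] for r in range(world)]

def all_gather(shards):
    """Concatenate token shards so every rank holds the full sequence."""
    full = [row for shard in shards for row in shard]
    return [full for _ in shards]

def rms_norm(rows, eps=1e-6):
    """Per-token RMSNorm without a learned weight (kept minimal on purpose)."""
    out = []
    for row in rows:
        scale = 1.0 / math.sqrt(sum(v * v for v in row) / len(row) + eps)
        out.append([v * scale for v in row])
    return out

# Two simulated ranks, four tokens, hidden size 2: each rank holds a partial sum.
partials = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
    [[0.5, 0.5], [1.0, 1.0], [1.5, 1.5], [2.0, 2.0]],
]

# Baseline: AllReduce, then every rank normalizes all four tokens.
baseline = [rms_norm(x) for x in all_reduce(partials)]

# SP: Reduce-Scatter, each rank normalizes only its two tokens, then AllGather.
sp = all_gather([rms_norm(shard) for shard in reduce_scatter(partials)])

# Compare the two paths element by element on every rank.
same = all(
    math.isclose(a, b, rel_tol=1e-12)
    for rank_b, rank_s in zip(baseline, sp)
    for row_b, row_s in zip(rank_b, rank_s)
    for a, b in zip(row_b, row_s)
)
print(same)
```

In the SP path each rank runs the norm over a 1/world-size share of the tokens, which is where the per-rank compute saving in the norm comes from.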
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
``` 
compilation_config={
    "pass_config":{
        "enable_sequence_parallelism": True
    }
},
enable_expert_parallel=True,
```
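
The fragment above lists only the options themselves. A minimal sketch of how they might be assembled follows, assuming they are passed as keyword arguments to `vllm.LLM` (the PR text does not say so explicitly). The helper `build_sp_engine_kwargs` and the model placeholder are hypothetical, and other settings such as parallel sizes are not given in the PR and are left out.

```python
# Hypothetical helper: collects the options shown in this PR into keyword
# arguments. The model name is a placeholder, not a value taken from the PR.
def build_sp_engine_kwargs(model: str) -> dict:
    return {
        "model": model,
        "compilation_config": {
            "pass_config": {
                "enable_sequence_parallelism": True,
            }
        },
        "enable_expert_parallel": True,
    }

kwargs = build_sp_engine_kwargs("<your-qwen3-moe-checkpoint>")

# With vLLM and vllm-ascend installed, the arguments would typically be used as
# follows (assumption, not shown in the PR):
#   from vllm import LLM
#   llm = LLM(**kwargs)
```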

- vLLM version: v0.10.0
- vLLM main: 9edd1db02b

---------

Signed-off-by: libaokui <libaokui@huawei.com>
Co-authored-by: libaokui <libaokui@huawei.com>

2025-08-07 09:15:49 +08:00

ISSUE_TEMPLATE

[1/N][CI] Move linting system to pre-commits hooks (#1256)

2025-07-10 14:17:15 +08:00

workflows

【main】SP For Qwen3 MoE (#2209)

2025-08-07 09:15:49 +08:00

actionlint.yaml

[CI]Add e2e test for 310p (#1879)

2025-07-30 14:52:16 +08:00

dependabot.yml

[CI] Add dependabot support and labeler workflow (#162)