xc-llm-ascend

Author SHA1 Message Date


zzhxxx

4ee58e213b

[BugFix] explicitly setting the tensor shape of otp output (#3027)

When MTP and oprojTP are both enabled, the torchair graph is recompiled,
which degrades performance. This PR fixes the issue by explicitly setting
the tensor shape of the otp output.

- vLLM version: v0.10.2
- vLLM main: `486c5599e3`

---------

Signed-off-by: zzhx1 <zzh_201018@outlook.com>

2025-09-24 18:44:15 +08:00

weijinqian0

6aa4253798

[Refactor] [SP] The sequence parallelism characteristics in the MoE and Dense models are integrated into a single solution. (#3085)

What this PR does / why we need it?

There are two sequence parallelism (SP) implementations for MoE and Dense
models: one is called sequence_parallelism and the other is flashcomm_v1.
This PR does the following:

- Merges the two sets of code that share the same implementation into one.
- Removes the sequence_parallelism implementation, because it cannot
  support aclgraph.

Does this PR introduce any user-facing change?

No

How was this patch tested?

End-to-end (e2e) tests and unit tests (ut).

- vLLM version: v0.10.2
- vLLM main: `f225ea7dd9`

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>

2025-09-24 11:29:59 +08:00

realliujiaxu

af2a886814

refactor linear (#2867)

### What this PR does / why we need it?
The current linear.py has the following issues:

- There is redundant conditional logic in the `comm_group` and `forward`
selection for classes such as `AscendMergedColumnParallelLinear`.

- The `comm_group` selection logic is inconsistent among
`AscendMergedColumnParallelLinear`, `AscendColumnParallelLinear`, and
`AscendQKVParallelLinear`.

To address these two issues, this PR encapsulates `comm_group` and
`forward` into classes and extracts the class selection logic into
common functions. Adding a custom communication group or forward method
in the future only requires extending `CustomColumnParallelOp` or
`CustomRowParallelOp` and adding the corresponding selection logic.
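
The pattern described above can be illustrated with a minimal sketch. It is not the code from this PR: only the name `CustomColumnParallelOp` comes from the description, while `LayerInfo`, `MLPColumnParallelOp`, `select_column_parallel_op`, the group name `"mlp_tp"`, and the selection rule are hypothetical and exist only to show the shape of the design. The forward logic is a placeholder and does no real tensor work.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayerInfo:
    """Hypothetical stand-in for the layer attributes used during selection."""
    prefix: str
    tp_size: int


class CustomColumnParallelOp(ABC):
    """Bundles a communication group with the forward logic that uses it."""

    def __init__(self, layer: LayerInfo) -> None:
        self.layer = layer

    @property
    @abstractmethod
    def comm_group(self) -> str:
        """Name of the communication group this op runs in."""

    @abstractmethod
    def apply(self, x: list[float]) -> list[float]:
        """Forward logic tied to this op's communication group."""


class MLPColumnParallelOp(CustomColumnParallelOp):
    """Hypothetical op for MLP layers that use a dedicated group."""

    @property
    def comm_group(self) -> str:
        return "mlp_tp"  # assumed group name, for illustration only

    def apply(self, x: list[float]) -> list[float]:
        # Placeholder: a real op would run a sharded matmul and communicate
        # within self.comm_group. Here the input is returned unchanged.
        return list(x)


def select_column_parallel_op(layer: LayerInfo) -> Optional[CustomColumnParallelOp]:
    """Single place that decides which custom op (if any) a layer uses.

    Returning None means the layer keeps its default behavior.
    """
    if "mlp" in layer.prefix and layer.tp_size > 1:
        return MLPColumnParallelOp(layer)
    return None
```

With this structure, each linear class would call the shared selection function once instead of repeating its own conditionals, and a new communication group would be supported by adding one subclass and one branch in the selection function.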

### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?


- vLLM version: v0.10.2
- vLLM main: `dd39baf717`

---------

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Co-authored-by: weijinqian0 <weijinqian@huawei.com>

2025-09-18 14:09:19 +08:00

3 Commits