### What this PR does / why we need it?
Fix the precision issue in the dispatch_ffn_combine_bf16 operator.
Remove redundant synchronization operations in the dispatch_ffn_combine
operator.
- vLLM version: v0.16.0
- vLLM main:
4034c3d32e
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>
### What this PR does / why we need it?
Resolve compilation errors that occur when building against CANN versions
later than b020.

Root cause: during operator compilation, we previously renamed the structs
HcclOpResParam and HcclRankRelationResV2 in the moe_distribute_base.h
file. After version b020, moe_distribute_base.h was updated with
additional code that references these two structs, so renaming the
structs alone broke the newly added references and caused compilation
errors.

Solution: we have added the moe_distribute_base.h file to the operator
implementation, which avoids compilation errors caused by future updates
to this file in the CANN framework.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.16.0
- vLLM main:
4034c3d32e
Signed-off-by: guanguan0308 <1546542263@qq.com>
### What this PR does / why we need it?
1. Fix a vec error caused by unaligned UB access in
DispatchFFNCombine;
2. Fix the expert_token_nums tensor being defined on the host instead of
the NPU in moe_comm_method.py (see the sketch after this list);
3. Fix the multi-core copy issue of expert_token_nums in the
DispatchFFNCombine op (a single AIV copy is sufficient).
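
The device-placement part of item 2 is easiest to see in a minimal
sketch; the tensor name and size below are illustrative, not the actual
code in moe_comm_method.py:

```python
import torch

local_expert_num = 8

# Before: created with the default device, i.e. on the host (CPU).
expert_token_nums = torch.zeros(local_expert_num, dtype=torch.int32)

# After: created directly on the NPU so the operator consumes it without an
# implicit host-to-device copy. Falls back to CPU here so the sketch also
# runs where torch_npu is not installed.
device = "npu" if getattr(torch, "npu", None) and torch.npu.is_available() else "cpu"
expert_token_nums = torch.zeros(local_expert_num, dtype=torch.int32, device=device)
```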
### Does this PR introduce _any_ user-facing change?
No, this PR does not introduce any user-facing changes. The fix only
addresses internal memory access logic and does not modify any public
APIs, interfaces, or user-visible behaviors.
### How was this patch tested?
`export VLLM_ASCEND_ENABLE_FUSED_MC2=1`
- vLLM version: v0.15.0
- vLLM main:
9562912cea
Signed-off-by: xulei_ict <xulei292@huawei.com>
Co-authored-by: xulei_ict <xulei292@huawei.com>
### What this PR does / why we need it?
Add a new output for the expert token count.
An additional output tensor expert_token_nums is added to both operators
to meet the requirement of tracking token distribution among experts:
- Tensor name: expert_token_nums
- Dimension: 1D tensor
- Shape: (local_expert_num,)
- Data type: int32
- Semantics: the number of tokens actually received by each expert on
the current card.
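
As an illustration of the semantics only (not the operator's actual
implementation), the new output is equivalent to a per-local-expert count
over the tokens received on the current card; the routing tensor below is
a made-up example:

```python
import torch

local_expert_num = 4
# Hypothetical routing result: the local expert id assigned to each token
# received on the current card.
token_expert_ids = torch.tensor([0, 2, 2, 1, 0, 2])

# expert_token_nums: shape (local_expert_num,), dtype int32.
expert_token_nums = torch.bincount(
    token_expert_ids, minlength=local_expert_num
).to(torch.int32)

print(expert_token_nums)        # tensor([2, 1, 3, 0], dtype=torch.int32)
print(expert_token_nums.shape)  # torch.Size([4])
```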
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.14.1
- vLLM main:
dc917cceb8
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>
Signed-off-by: guanguan0308 <162653673+guanguan0308@users.noreply.github.com>
### What this PR does / why we need it?
Fix a synchronization deadlock in the DispatchFFNCombine module that
occurs on NPU cards when the number of experts per card exceeds 16 (the
bug manifests prominently with 32 or 128 experts per card).
### Does this PR introduce _any_ user-facing change?
No, this is a bug fix for internal synchronization logic specific to NPU
expert dispatch, with no impact on external APIs, interfaces, or
end-user behaviors.
- vLLM version: v0.14.1
- vLLM main:
dc917cceb8
Signed-off-by: xulei_ict <xulei292@huawei.com>
Co-authored-by: xulei_ict <xulei292@huawei.com>
### What this PR does / why we need it?
After removing codespell a while ago, we discovered that the typos
checker has trouble correctly recognizing certain misspelled words, so I
suggest adding codespell back.
- vLLM version: v0.14.1
- vLLM main:
d68209402d
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
### What this PR does / why we need it?
Add the dispatch_ffn_combine_bf16 operator.
- vLLM version: v0.13.0
- vLLM main:
bde38c11df
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>