xc-llm-ascend/ut at main

EngineX/xc-llm-ascend

Clorist33 4f0dddc9ee [Bugfix] bugfix for moe_mlp in vllm-ascend/v0.11.0-dev (#4885)

### What this PR does / why we need it?
This PR fixes a bug in the moe_mlp module by correcting the arguments
passed to the torch_npu.npu_dequant_swiglu_quant function. It properly
converts group_list from a cumulative sum to counts for the group_index
parameter.
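
To illustrate the conversion described above, here is a minimal sketch in plain Python. It assumes group_list holds the running (cumulative) number of tokens routed to each expert and that the consumer expects per-expert counts instead. The helper name `cumsum_to_counts` is hypothetical and is not taken from the PR; the sketch does not call torch_npu.

```python
from typing import List


def cumsum_to_counts(group_list_cumsum: List[int]) -> List[int]:
    """Convert cumulative per-expert token totals into per-expert counts.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    counts = []
    previous = 0
    for total in group_list_cumsum:
        if total < previous:
            raise ValueError("cumulative sums must be non-decreasing")
        # The count for this expert is the increase over the previous total.
        counts.append(total - previous)
        previous = total
    return counts


# Three experts whose cumulative totals are 3, 3, and 8.
cumulative = [3, 3, 8]
counts = cumsum_to_counts(cumulative)
print(counts)
```

In tensor code the same step is a first difference with the first element kept unchanged; whether the PR implements it that way is not stated here.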

### Does this PR introduce _any_ user-facing change?
No


- vLLM version: v0.12.0
- vLLM main: https://github.com/vllm-project/vllm/main

---------

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Co-authored-by: tanqingshan (A) <50050625@china.huawei.com>
Co-authored-by: Mercykid-bash <ruanche0218@gmail.com>

2025-12-12 14:51:47 +08:00

[0.11.0] [Cherry-pick #4058] Fixes Qwen3-Next enable nz accuracy problem (#4056)

2025-11-10 20:56:39 +08:00

[Test]Add unit test for compilation/acl_graph.py (#3039)

2025-09-19 21:31:17 +08:00

[UT] fix skip ut test and enable ut test run normally (#3410)

2025-10-20 16:30:57 +08:00

device_allocator

add ut for device allocator/camem and mutistream/layers (#2037)

2025-07-31 19:17:27 +08:00

[Bugfix] TP size larger than KV cache head causes accuracy issues (#3366)

2025-10-11 11:22:23 +08:00

[Bugfix] bugfix for moe_mlp in vllm-ascend/v0.11.0-dev (#4885)

2025-12-12 14:51:47 +08:00

[CI] Add unit test framework (#1201)

2025-06-16 18:32:28 +08:00

[P/D][BugFix][v0.11.0-dev]Fix proxy format processing errors & Layerwise connector performance optimization (#4069)

2025-11-09 09:55:10 +08:00

[0.11.0] [Cherry-pick #4058] Fixes Qwen3-Next enable nz accuracy problem (#4056)

2025-11-10 20:56:39 +08:00

add ut for device allocator/camem and mutistream/layers (#2037)

2025-07-31 19:17:27 +08:00

[Bugfix] bugfix for moe_mlp in vllm-ascend/v0.11.0-dev (#4885)

2025-12-12 14:51:47 +08:00

patch/worker/patch_common

[Refactor] refactor patch module (#3555)

2025-10-21 20:19:46 +08:00

Revert "[refactor]support gatingtopk operator generalization (#4356)" (#4873)

2025-12-10 15:45:20 +08:00

[main] add pd transfer for ascend scheduler (#2753)

2025-09-10 08:46:39 +08:00

【0.11.0-dev】optimization of kimi-k2 in cann8.3 (#4555)

2025-12-09 08:49:15 +08:00

[0.11.0] [Cherry-pick #4058] Fixes Qwen3-Next enable nz accuracy problem (#4056)

2025-11-10 20:56:39 +08:00

__init__.py

[2/4][Refactor] Refactor torchair utils (#1892)

2025-07-21 19:43:30 +08:00

base.py

[Feature]: implement the fusion of allreduce and matmul in prefill phase when tp is enabled (#1926)

2025-07-28 15:13:37 +08:00

conftest.py

[1/N][CustomOp] Register activation customop instead of overwrite forward_oot (#1841)

2025-07-18 23:07:14 +08:00

test_ascend_config.py

[Feature] Support moe multi-stream for aclgraph. (#2946)

2025-09-19 11:06:45 +08:00

test_envs.py

[Misc] Remove redundant imported envs, using envs_ascend instead (#2193)

2025-08-14 09:33:39 +08:00

test_platform.py

[V0.11.0][Core] Restore scheduling logic under default configuration (#4094)

2025-11-10 20:02:23 +08:00

test_utils.py

[0.11.0] [Cherry-pick #4058] Fixes Qwen3-Next enable nz accuracy problem (#4056)

2025-11-10 20:56:39 +08:00