xc-llm-ascend/ut at 1d0f13c1a39b4cec26637e48d474e8b3fa0d8d30

EngineX/xc-llm-ascend


LookAround0301 5ec96fd46c [long_seq_Feat] support chunk prefill (#4158)

### What this PR does / why we need it?
1. Optimize attention_v1 for Qwen GQA.
2. Refactor DeepSeek MLA: switch from all-gathering q to all-gathering kv.
3. Refactor the model runner for chunked prefill and remove unused code.

- vLLM version: v0.11.0
- vLLM main: 2918c1b49c

---------

Signed-off-by: LookAround <lixushi@huawei.com>
Signed-off-by: Delphine-Nic <tanwenqin@huawei.com>
Co-authored-by: Delphine-Nic <tanwenqin@huawei.com>

2025-11-14 08:43:37 +08:00

[long_seq_Feat] support chunk prefill (#4158)

2025-11-14 08:43:37 +08:00

[Test]Add unit test for compilation/acl_graph.py (#3039)

2025-09-19 21:31:17 +08:00

Upgrade to new vllm commit (#3719)

2025-10-25 15:36:32 +08:00

device_allocator

add ut for device allocator/camem and mutistream/layers (#2037)

2025-07-31 19:17:27 +08:00

[Feat] flashcomm_v2 optim solution (#3232)

2025-11-10 11:01:45 +08:00

[CI]Add EPLB CI. (#3568)

2025-10-21 22:58:02 +08:00

[CI] Add unit test framework (#1201)

2025-06-16 18:32:28 +08:00

[feature] support pcp + mtp (in pd co-locate scenario) (#4098)

2025-11-12 17:22:21 +08:00

model_loader/netloader

Upgrade to new vllm commit (#3719)

2025-10-25 15:36:32 +08:00

[BugFix] Fixes Qwen3-Next enable nz accuracy problem (#4058)

2025-11-10 20:54:57 +08:00

[Bugfix] fix mtp profile run error where main model and mtp model use different quantization (#4102)

2025-11-13 11:02:31 +08:00

patch/worker/patch_common

[Refactor] refactor patch module (#3555)

2025-10-21 20:19:46 +08:00

[BugFix] Fixes Qwen3-Next enable nz accuracy problem (#4058)

2025-11-10 20:54:57 +08:00

[UT] Fix test_sample_recovered_tokens_pytorch_autoregressive (#3434)

2025-10-24 11:20:57 +08:00

[Test]Add ut test qwen3_moe and sfa (#4121)

2025-11-13 16:09:22 +08:00

Upgrade to 0.11.1 newest vllm commit (#3982)

2025-11-12 23:01:19 +08:00

__init__.py

[2/4][Refactor] Refactor torchair utils (#1892)

2025-07-21 19:43:30 +08:00

base.py

[Feature]: implement the fusion of allreduce and matmul in prefill phase when tp is enabled (#1926)

2025-07-28 15:13:37 +08:00

conftest.py

[1/N][CustomOp] Register activation customop instead of overwrite forward_oot (#1841)

2025-07-18 23:07:14 +08:00

test_ascend_config.py

oproj TP support acl graph (#4073)

2025-11-11 19:39:06 +08:00

test_envs.py

[Misc] Remove redundant imported envs, using envs_ascend instead (#2193)

2025-08-14 09:33:39 +08:00

test_platform.py

[Core] Restore scheduling logic under default configuration (#3967)

2025-11-10 17:48:56 +08:00

test_utils.py

[BugFix] Fixes Qwen3-Next enable nz accuracy problem (#4058)

2025-11-10 20:54:57 +08:00