xc-llm-ascend/ut at 53ecd89e8ff405302be040a76effa8c012cbaaeb

EngineX/xc-llm-ascend

zhangxinyuehfad a22b532d38 [Fixbug] Fix shape not match when sliding_window and dynamic batch_size (#2830)

### What this PR does / why we need it?
Fix a shape mismatch that occurs when testing the accuracy of LLM-Research/Phi-4-mini-instruct.

### Does this PR introduce _any_ user-facing change?

Before this fix, users could not set a dynamic batch_size or use lm_eval
to test accuracy with models that use sliding_window.

### How was this patch tested?
Accuracy of LLM-Research/Phi-4-mini-instruct is OK:
```
vllm (pretrained=LLM-Research/Phi-4-mini-instruct,max_model_len=4096,dtype=auto,tensor_parallel_size=1), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.8105|±  |0.0108|
|     |       |strict-match    |     5|exact_match|↑  |0.8097|±  |0.0108|
```
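For anyone re-running this check, the header line of the output above can be turned back into an lm_eval invocation. The sketch below only assembles and prints the command; the flag names assume the lm-evaluation-harness CLI, and the helper `build_lm_eval_command` is hypothetical, not part of this repository.

```python
import shlex

# Settings copied from the lm_eval header line in the results above.
model_args = {
    "pretrained": "LLM-Research/Phi-4-mini-instruct",
    "max_model_len": 4096,
    "dtype": "auto",
    "tensor_parallel_size": 1,
}


def build_lm_eval_command(model_args, task="gsm8k", num_fewshot=5, batch_size="auto"):
    """Assemble an argv list for lm_eval (hypothetical helper; flags are assumed)."""
    joined = ",".join(f"{key}={value}" for key, value in model_args.items())
    return [
        "lm_eval",
        "--model", "vllm",
        "--model_args", joined,
        "--tasks", task,
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
    ]


command = build_lm_eval_command(model_args)
# Print a shell-safe string; nothing is executed here.
print(shlex.join(command))
```

`batch_size="auto"` is the dynamic batch size setting that this fix is meant to unblock for sliding_window models.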


- vLLM version: v0.10.2
- vLLM main: 3c96e7b8a1

Signed-off-by: hfadzxy <starmoon_zhang@163.com>

2025-09-19 22:35:14 +08:00

[Fixbug] Fix shape not match when sliding_window and dynamic batch_size (#2830)

2025-09-19 22:35:14 +08:00

[Test]Add unit test for compilation/acl_graph.py (#3039)

2025-09-19 21:31:17 +08:00

main add ascend scheduler support multimodal (#2844)

2025-09-14 09:38:51 +08:00

device_allocator

add ut for device allocator/camem and mutistream/layers (#2037)

2025-07-31 19:17:27 +08:00

Dynamic Expert Load Balance with Zero-like-overhead (#2956)

2025-09-17 10:36:43 +08:00

[CI] Add unit test framework (#1201)

2025-06-16 18:32:28 +08:00

fix mooncake connector adxl hostname usage (#2824)

2025-09-13 14:38:48 +08:00

[3/N][Refactor][Quantization]remove packed_modules_mapping from models (#3021)

2025-09-19 20:50:14 +08:00

add ut for device allocator/camem and mutistream/layers (#2037)

2025-07-31 19:17:27 +08:00

[Feature] Support moe multi-stream for aclgraph. (#2946)

2025-09-19 11:06:45 +08:00

patch/worker/patch_common

[feat]: oproj tensor parallelism in pure DP and graph-mode scenarios. (#2167)

2025-09-07 10:31:32 +08:00

[3/N][Refactor][Quantization]remove packed_modules_mapping from models (#3021)

2025-09-19 20:50:14 +08:00

[main] add pd transfer for ascend scheduler (#2753)

2025-09-10 08:46:39 +08:00

[Feature] Support moe multi-stream for aclgraph. (#2946)

2025-09-19 11:06:45 +08:00

[BugFix] Async scheduling and PP compatibility with DP (#2796)

2025-09-19 11:29:50 +08:00

__init__.py

[2/4][Refactor] Refactor torchair utils (#1892)

2025-07-21 19:43:30 +08:00

base.py

[Feature]: implement the fusion of allreduce and matmul in prefill phase when tp is enabled (#1926)

2025-07-28 15:13:37 +08:00

conftest.py

[1/N][CustomOp] Register activation customop instead of overwrite forward_oot (#1841)

2025-07-18 23:07:14 +08:00

test_ascend_config.py

[Feature] Support moe multi-stream for aclgraph. (#2946)

2025-09-19 11:06:45 +08:00

test_envs.py

[Misc] Remove redundant imported envs, using envs_ascend instead (#2193)

2025-08-14 09:33:39 +08:00

test_platform.py

[refactor] refactor deepseek-related files (#2849)

2025-09-16 14:13:07 +08:00

test_utils.py

[CI/UT] Fix UTs on register customop and warm up model (#2862)

2025-09-11 11:30:16 +08:00