xc-llm-ascend/workflows at 9ff6b0b86204bc641e7c8d7d5cfd12e2309391ef

EngineX/xc-llm-ascend

xuyexiong 02c26dcfc7 [Feat] Supports Aclgraph for bge-m3 (#3171 )

### What this PR does / why we need it?
[Feat] Supports Aclgraph for bge-m3

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
```
pytest -s tests/e2e/singlecard/test_embedding.py
pytest -s tests/e2e/singlecard/test_embedding_aclgraph.py
```
To start an online server with a batch size (bs) of 10 and a sequence length of 8192 per request, we
set --max-num-batched-tokens to 8192*10 = 81920 so that the encoder input is not chunked:
```
vllm serve /home/data/bge-m3 --max_model_len 1024 --served-model-name "bge-m3" --task embed --host 0.0.0.0 --port 9095 --max-num-batched-tokens 81920 --compilation-config '{"cudagraph_capture_sizes":[8192, 10240, 20480, 40960, 81920]}'
```
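The token budget and capture sizes in the command above can be derived as in the following minimal sketch. The helper names `token_budget` and `compilation_config` are hypothetical and are not part of vLLM or vllm-ascend; only the `cudagraph_capture_sizes` key and the numeric values come from the command itself.

```python
import json


def token_budget(batch_size: int, seq_len: int) -> int:
    """Tokens needed so a full batch fits in one step without chunking."""
    return batch_size * seq_len


def compilation_config(capture_sizes: list[int]) -> str:
    """Build the JSON string passed to --compilation-config (hypothetical helper)."""
    return json.dumps({"cudagraph_capture_sizes": sorted(capture_sizes)})


# Values taken from the serve command: bs 10, sequence length 8192.
budget = token_budget(10, 8192)
config = compilation_config([8192, 10240, 20480, 40960, 81920])

print(budget)  # value for --max-num-batched-tokens
print(config)  # value for --compilation-config
```

In this sketch the largest capture size equals the token budget, which matches the values used in the command.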
With bs 10 and a sequence length of 8192, QPS improves from 85 to 104,
a gain of roughly 22%, as much of the host-bound overhead is eliminated.
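As a quick check of the reported gain, the relative improvement can be computed from the two QPS figures. The function name `relative_improvement` is illustrative only; the numbers 85 and 104 are the ones quoted in this commit message.

```python
def relative_improvement(before: float, after: float) -> float:
    """Fractional gain of `after` over `before`."""
    return (after - before) / before


# QPS figures quoted in the commit message.
gain = relative_improvement(85, 104)
print(f"{gain:.1%}")  # prints 22.4%
```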


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: xuyexiong <xuyexiong@huawei.com>
Co-authored-by: wangyongjun <1104133197@qq.com>

2025-10-14 23:07:45 +08:00

[Core] Init vllm-ascend (#3 )

2025-02-05 10:53:12 +08:00

_accuracy_test.yaml

Add models test and add several new models yaml (#3394 )

2025-10-12 17:27:50 +08:00

_e2e_nightly.yaml

Enable nightly test and add qwen3 32b test case (#3370 )

2025-10-12 15:46:28 +08:00

_e2e_test.yaml

[Feat] Supports Aclgraph for bge-m3 (#3171 )

2025-10-14 23:07:45 +08:00

accuracy_test.yaml

Add models test and add several new models yaml (#3394 )

2025-10-12 17:27:50 +08:00

format_pr_body.yaml

[CI] Update vLLM to v0.11.0 (#3315 )

2025-10-09 10:41:19 +08:00

image_310p_openeuler.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

image_310p_ubuntu.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

image_a3_openeuler.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

image_a3_ubuntu.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

image_openeuler.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

image_ubuntu.yml

Enable push trigger for image job (#2906 )

2025-09-13 12:31:36 +08:00

label_merge_conflict.yml

[CI] Do not drop ready label when PR is merge conflict (#3173 )

2025-09-25 18:45:19 +08:00

labeler.yml

Bump actions/labeler from 5 to 6 (#3086 )

2025-09-22 14:07:37 +08:00

multi_node_test.yaml

[1/N][CI] Add multi node test (#3359 )

2025-10-11 14:50:46 +08:00

nightly_benchmarks.yaml

[CI] Update vLLM to v0.11.0 (#3315 )

2025-10-09 10:41:19 +08:00

pre-commit.yml

[CI] Enable main based lint check and light ci matrix (#3079 )

2025-09-22 10:37:53 +08:00

release_code.yml

Bump actions/setup-python from 5.4.0 to 6.0.0 (#2926 )

2025-09-16 14:15:10 +08:00

release_whl.yml

Bump actions/setup-python from 5.4.0 to 6.0.0 (#2926 )

2025-09-16 14:15:10 +08:00

reminder_comment.yml

Bump actions/github-script from 7 to 8 (#2803 )

2025-09-08 14:53:26 +08:00

vllm_ascend_dist.yaml

[CI] Update vLLM to v0.11.0 (#3315 )

2025-10-09 10:41:19 +08:00

vllm_ascend_doctest.yaml

Refactor ci to reuse base workflow and re-enable ut coverage (#3064 )

2025-09-21 13:27:08 +08:00

vllm_ascend_test_310p.yaml

[CI] Update vLLM to v0.11.0 (#3315 )

2025-10-09 10:41:19 +08:00

vllm_ascend_test_full_vllm_main.yaml

Refactor ci to reuse base workflow and re-enable ut coverage (#3064 )

2025-09-21 13:27:08 +08:00

vllm_ascend_test_full.yaml

[CI] Make the test_pipeline_parallel run normally in full test (#3391 )

2025-10-12 15:43:13 +08:00

vllm_ascend_test_models.yaml

Add models test and add several new models yaml (#3394 )

2025-10-12 17:27:50 +08:00

vllm_ascend_test_nightly.yaml

Enable nightly test and add qwen3 32b test case (#3370 )

2025-10-12 15:46:28 +08:00

vllm_ascend_test_pd.yaml

Revert "Upgrade CANN version to 8.3.rc1.alpha001 (#2903 )" (#2909 )

2025-09-13 16:21:54 +08:00

vllm_ascend_test.yaml

[UT] fix skipped test_utils ut test. (#3422 )

2025-10-14 08:31:13 +08:00