Commit Graph

5 Commits

Author SHA1 Message Date
Pr0Wh1teGivee
d13fb0766e [Perf] add patch to optimize apply_topk_topp (#1732)
### What this PR does / why we need it?
Performance optimization for apply_top_k_top_p
### Does this PR introduce _any_ user-facing change?
Use VLLM_ASCEND_ENABLE_TOPK_TOPP_OPTIMIZATION to enable this feature
### How was this patch tested?
e2e & ut

















- vLLM version: v0.9.2
- vLLM main:
6a9e6b2abf

Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
2025-07-11 15:32:02 +08:00
wangxiyuan
830332ebfc Clean up v0.9.1 code (#1672)
vllm has released 0.9.2. This PR drop 0.9.1 support.

- vLLM version: v0.9.1
- vLLM main:
b942c094e3

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-09 08:52:24 +08:00
Yikun Jiang
0c1d239df4 Add unit test local cpu guide and enable base testcase (#1566)
### What this PR does / why we need it?
Use Base test and cleanup all manaul patch code
- Cleanup EPLB config to avoid tmp test file
- Use BaseTest with global cache
- Add license
- Add a doc to setup unit test in local env 

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-07-06 10:42:27 +08:00
wangxiyuan
a45dfde283 [CI] Fix FusedMoEConfig and input batch failure to recover CI (#1602)
Make CI happy

1.
c1909e7e8c
changed moeConfig init way
2.
48fb076cbc
changed input batch logic.

This PR address these change to vllm-ascend.

Closes: https://github.com/vllm-project/vllm-ascend/issues/1600

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-03 18:36:17 +08:00
wangxiyuan
5968dff4e0 [Build] Add build info (#1386)
Add static build_info py file to show soc and sleep mode info. It helps
to make the code clean and the error info will be more friendly for
users

This PR also added the unit test for vllm_ascend/utils.py

This PR also added the base test class for all ut in tests/ut/base.py

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-27 09:14:43 +08:00