4 Commits

Author SHA1 Message Date
Pr0Wh1teGivee
2fda60464c [Perf] Use fused ops npu_top_k_top_p (#1308)
### What this PR does / why we need it?
Use the fused op `torch_npu.npu_top_k_top_p(logits, p, k)` when both `p` and
`k` are not None; otherwise fall back to the original implementation. The
replacement takes effect automatically when `VLLM_ASCEND_ENABLE_TOPK_OPTIMIZE=1`.

This patch uses `npu_top_k_top_p`, which requires
torch_npu>=2.5.1.post1.dev20250619.
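
The dispatch rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `topk_optimize_enabled` and `apply_top_k_top_p` are hypothetical names, the two samplers are passed in as plain callables so the sketch runs without `torch_npu`, and treating only the exact value `"1"` as "enabled" is an assumption.

```python
import os
from typing import Any, Callable, Mapping, Optional

# Both samplers take (logits, p, k), matching the argument order
# torch_npu.npu_top_k_top_p(logits, p, k) given in the description.
Sampler = Callable[[Any, Optional[Any], Optional[Any]], Any]


def topk_optimize_enabled(env: Mapping[str, str] = os.environ) -> bool:
    # Assumption: the optimization is on only when the variable is exactly "1".
    return env.get("VLLM_ASCEND_ENABLE_TOPK_OPTIMIZE", "0") == "1"


def apply_top_k_top_p(
    logits: Any,
    p: Optional[Any],
    k: Optional[Any],
    fused_op: Sampler,
    fallback_op: Sampler,
    env: Mapping[str, str] = os.environ,
) -> Any:
    """Hypothetical dispatcher: use the fused op only when the flag is set
    and both p and k are provided; otherwise use the original path."""
    if topk_optimize_enabled(env) and p is not None and k is not None:
        return fused_op(logits, p, k)
    return fallback_op(logits, p, k)
```

In a real deployment `fused_op` would presumably wrap `torch_npu.npu_top_k_top_p`, and `fallback_op` would be the existing top-k/top-p implementation.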

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested with DeepSeek R1; unit tests passed.

Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
2025-06-25 20:59:06 +08:00
wangxiyuan
b350edae9a [UT] refactor test_expert_load_balancer and fix broken CI (#1293)
Refactor test_expert_load_balancer to follow the UT code style.

This PR also fixes the breakage caused by the breaking change in
https://github.com/vllm-project/vllm/pull/16188/files#diff-e2942ece30a5c580437694ffb964bfc664b510c59244c08e5921b8f5cefb4280

This is just a quick fix. We'll support embedding on V1 later

Closes: https://github.com/vllm-project/vllm-ascend/issues/1299

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-20 01:02:52 +08:00
songshanhu07
ebb2a70dbb static EPLB fix bug, add unit test (#1186)
### What this PR does / why we need it?
1. Add a static EPLB unit test.
2. Fix a bug: a Tensor cannot be used directly as the condition of an `if` statement.
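
The second item refers to a common pitfall: a multi-element tensor raises an error when evaluated for truthiness, so a presence check should compare against `None` instead. The sketch below illustrates the pattern only. `FakeTensor` is a stand-in class (not a real torch type) that mimics this behavior so the example runs without torch, and the function names are hypothetical rather than taken from the PR.

```python
class FakeTensor:
    """Stand-in for a multi-element tensor (hypothetical, not a torch class).

    Like a real multi-element tensor, it refuses to be converted to bool.
    """

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        raise RuntimeError("Boolean value of a multi-element tensor is ambiguous")


def has_map_buggy(expert_map):
    # Buggy pattern: `if expert_map` calls __bool__, which raises for tensors.
    if expert_map:
        return True
    return False


def has_map_fixed(expert_map):
    # Fixed pattern: only test whether a value was provided at all.
    return expert_map is not None
```

If the intent is to test the tensor's contents rather than its presence, an explicit reduction over the elements is the usual alternative.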
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
Run the unit test.

---------

Signed-off-by: songshanhu07 <1763685535@qq.com>
2025-06-18 19:46:56 +08:00
wangxiyuan
69b817ed65 [CI] Add unit test framework (#1201)
This PR adds a unit test framework to enable UT for vLLM Ascend. Unit
tests run on CPU machines. They run once the lint check has passed, the
same as the e2e tests.

For unit tests, this PR creates a new folder called `ut` under the `tests`
module. The layout of the test files in `ut` should mirror the code in
`vllm-ascend`, and each file name should start with the `test_` prefix. For
example, this PR adds `test_ascend_config.py` to test
`ascend_config.py`.

A new file `worker/test_worker_v1.py` is also added as a placeholder.
This file should hold the unit tests for `vllm-ascend/worker/worker_v1.py`.
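
The naming convention above can be expressed as a small path mapping. This is only a sketch of the convention as described here: `ut_path_for` is a hypothetical helper that does not exist in the repository, and it assumes the source path is given relative to the `vllm-ascend` package root.

```python
from pathlib import PurePosixPath

# Root of the unit tests: the `ut` folder under `tests`.
UT_ROOT = PurePosixPath("tests/ut")


def ut_path_for(source: str) -> PurePosixPath:
    """Map a source file (relative to the package root) to its unit test file.

    The directory structure is kept and the file name gets a `test_` prefix.
    """
    src = PurePosixPath(source)
    return UT_ROOT / src.with_name("test_" + src.name)


print(ut_path_for("ascend_config.py"))     # tests/ut/test_ascend_config.py
print(ut_path_for("worker/worker_v1.py"))  # tests/ut/worker/test_worker_v1.py
```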

Additionally, a new `fake_weight` folder is added. It contains the
config.json from `facebook/opt-125m`, so that the tests do not need to
access Hugging Face every time.

TODO:
We should add the remaining unit test files one by one in the future.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-16 18:32:28 +08:00