Commit Graph

11 Commits

Author SHA1 Message Date
Ronald
b69b04d3a9 implement model runner v2 basic framework (#5051)
### What this PR does / why we need it?
This PR aims to implement the basic framework of model runner v2 in
vllm-ascend; end-to-end functionality is not guaranteed by this PR.
 
### Does this PR introduce _any_ user-facing change?
Use envs.VLLM_USE_V2_MODEL_RUNNER to decide whether to use model_runner_v2.
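
The switch described above can be pictured with a minimal sketch. It assumes the variable is read from the process environment and that "1" or "true" enables it; the parsing rule, the helper names, and the returned labels are hypothetical and are not taken from vllm-ascend.

```python
import os


def use_v2_model_runner(environ=None):
    """Return True when VLLM_USE_V2_MODEL_RUNNER holds a truthy value.

    Assumption: "1" and "true" (case-insensitive) count as enabled.
    """
    env = os.environ if environ is None else environ
    raw = env.get("VLLM_USE_V2_MODEL_RUNNER", "0")
    return raw.strip().lower() in {"1", "true"}


def select_model_runner(environ=None):
    # Placeholder labels; the real runner classes are not shown in this commit message.
    return "model_runner_v2" if use_v2_model_runner(environ) else "model_runner_v1"
```

With this sketch, an unset variable keeps the existing runner, so the v2 path is opt-in.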

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: Ronald1995 <ronaldautomobile@163.com>
2025-12-18 15:51:54 +08:00
lilinsiman
31c94b7e7b [doc][main] Correct more doc mistakes (#4958)
### What this PR does / why we need it?
Correct more doc mistakes

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-12-13 18:36:58 +08:00
lilinsiman
fc818f1509 [doc][main] Correct mistakes in doc (#4945)
### What this PR does / why we need it?
Correct mistakes in doc

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-12-12 19:17:10 +08:00
zhangyiming
66b0781840 [E2E] Refactor the e2e testcases. (#4789)
### What this PR does / why we need it?
Refactor the e2e testcases.
- tests/e2e/multicard/test_weight_loader.py: Remove the unused code.
- tests/e2e/singlecard/multi-modal/test_internvl.py: Move to the accuracy
test.
- tests/e2e/singlecard/test_aclgraph.py: Rename the file.
- tests/e2e/singlecard/test_embedding_aclgraph.py: Combine with
tests/e2e/singlecard/test_bge_model.py.
- tests/e2e/singlecard/test_completion_with_prompt_embeds.py: Delete
eager mode and change the model to Qwen3-0.6B.
- tests/e2e/singlecard/test_quantization.py: Change the model to
Qwen3-0.6B-W8A8.
- tests/e2e/singlecard/test_vlm.py: Change the model to Qwen3-VL-8B.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: menogrey <1299267905@qq.com>
2025-12-11 10:15:00 +08:00
Tjh-UKN
00ea61ec88 [feature] vllm-ascend support msprobe (eager mode dump) (#4241)
### What this PR does / why we need it?
vllm-ascend needs to dump data during model execution to debug
precision problems. msprobe provides this capability, so it is
integrated into vllm-ascend to make debugging easier.

### Does this PR introduce _any_ user-facing change?
```
'dump_config': '/path/to/config.json'
```
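
The entry above is a single key/value pair. As a minimal sketch, assuming it is supplied inside a dictionary of extra options (the commit message does not say which option carries it), the JSON form could be built as follows; the helper name is hypothetical.

```python
import json


def build_dump_options(dump_config_path):
    """Return an options dict carrying the dump_config entry (hypothetical helper)."""
    return {"dump_config": dump_config_path}


# The path is the placeholder from the commit message, not a real file.
options = build_dump_options("/path/to/config.json")
serialized = json.dumps(options)
```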



- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: Tjh-UKN <2559659915@qq.com>
2025-11-24 21:58:31 +08:00
lilinsiman
adee9dd3b1 [Info][main] Correct the mistake in information documents (#4157)
### What this PR does / why we need it?
Correct the mistake in information documents

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
Unit tests (UT).

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-11-13 15:53:58 +08:00
thonean
e38fe92f40 [Misc][Doc] Add service profiling feature with user guide (#3756)
### What this PR does / why we need it?
To support the data collection capabilities of msServiceProfiler on
the vllm-ascend framework and to enable customization of data collection
points via a configuration file, a default profiling configuration has
been added to vllm-ascend, facilitating debugging and optimization for
developers and users.

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: minghangc <29514143@qq.com>
2025-11-12 09:07:14 +08:00
Crazyang
f06a6cad1b [Doc] Update the modelslim website from gitee to gitcode. (#3615)
### What this PR does / why we need it?

Because the ModelSlim code repository has migrated from gitee to
gitcode, all relevant links in the repository have been updated.

[migration
notice](https://gitee.com/ascend/msit/tree/master/.%E6%9C%AC%E9%A1%B9%E7%9B%AE%E5%B7%B2%E7%BB%8F%E6%AD%A3%E5%BC%8F%E8%BF%81%E7%A7%BB%E8%87%B3%20Gitcode%20%E5%B9%B3%E5%8F%B0)

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Crazyang <im.crazyang@gmail.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Co-authored-by: weichen <calvin_zhu0210@outlook.com>
2025-10-23 15:38:16 +08:00
whx
0a526768f5 [Feature] Support moe multi-stream for aclgraph. (#2946)
This PR puts the calculation of shared experts into a separate stream,
overlapping with the routed experts.
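
The overlap idea can be illustrated with a small conceptual sketch. It uses a worker thread as a stand-in for the second device stream; the actual change uses NPU streams under aclgraph, which are not shown here, and every function name and the toy arithmetic below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def shared_experts(x):
    # Stand-in for the shared-expert computation.
    return [2.0 * v for v in x]


def routed_experts(x):
    # Stand-in for the routed-expert computation.
    return [v + 1.0 for v in x]


def moe_forward(x):
    with ThreadPoolExecutor(max_workers=1) as side_stream:
        # Launch the shared experts on the "side stream".
        shared_future = side_stream.submit(shared_experts, x)
        # The routed experts run on the "main stream" in the meantime.
        routed_out = routed_experts(x)
        # Wait for the side stream before combining the two outputs.
        shared_out = shared_future.result()
    return [r + s for r, s in zip(routed_out, shared_out)]
```

The key point is the ordering: the independent work is launched first, and synchronization happens only where the two results are combined.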

- vLLM version: v0.10.2
- vLLM main:
fbd6523ac0

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-09-19 11:06:45 +08:00
LeeWenquan
f4e3d22432 Remove chunked_prefill_for_mla and fix ring_mla bug (#2781)
### What this PR does / why we need it?
Remove the chunked-prefill-for-MLA branch in MLA, and change the dtype of
prefill_mask to avoid an accuracy problem.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main:
ef7eefe17a

---------

Signed-off-by: SunnyLee219 <3294305115@qq.com>
2025-09-18 19:43:26 +08:00
aidoczh
c32eea96b7 [Doc]Add Chinese translation for documentation (#1870)
### What this PR does / why we need it?

This PR adds a complete Chinese translation for the documentation using
PO files and the gettext toolchain. The goal is to make the
documentation more accessible to Chinese-speaking users and help the
community grow.

### Does this PR introduce any user-facing change?

Yes. This PR introduces Chinese documentation, which users can access
alongside the original English documentation. No changes to the core
code or APIs.

### How was this patch tested?

The translated documentation was built locally using the standard
documentation build process (`make html` or `sphinx-build`). I checked
the generated HTML pages to ensure the Chinese content displays
correctly and matches the original structure. No code changes were made,
so no additional code tests are required.

vLLM version: v0.9.2  
vLLM main: vllm-project/vllm@5780121

---

Please review the translation and let me know if any improvements are
needed. I am happy to update the translation based on feedback.

- vLLM version: v0.9.2
- vLLM main:
7ba34b1241

---------

Signed-off-by: aidoczh <aidoczh@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-07-21 11:26:27 +08:00