Pleaplusone
1a1f9a6d89
port deepseekv2 and mtp to main branch ( #429 )
...
### What this PR does / why we need it?
This PR ports all the DeepSeek graph mode code and MTP code from v0.7.3
to the main branch.
---------
Signed-off-by: SidaoY <1024863041@qq.com>
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
Signed-off-by: mengwei805 <mengwei25@huawei.com>
Signed-off-by: libaokui <libaokui@huawei.com>
Signed-off-by: q00832892 <qiaoyang19@huawei.com>
Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
Co-authored-by: SidaoY <1024863041@qq.com>
Co-authored-by: linfeng-yuan <1102311262@qq.com>
Co-authored-by: Yizhou Liu <liuyizhou5@h-partners.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: libaokui <libaokui@huawei.com>
2025-04-19 17:38:18 +08:00
wangxiyuan
42c7fbb10e
[Misc] Fix import error and address nits to make CI happy ( #563 )
...
1. Add `vllm_version_is` function to check vllm version.
2. `ensure_kv_transfer_initialized` and `get_kv_transfer_group` were
moved to another location in the vLLM main branch by commit
3408e47159; this patch fixes the resulting import error.
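A version check of the kind item 1 describes could look roughly like the sketch below. It is not the code from this PR: the signature, the exact-match comparison, and the `current` override (added here only so the function can be exercised without vLLM installed) are all assumptions.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def vllm_version_is(target: str, current: Optional[str] = None) -> bool:
    """Return True if the vLLM version equals `target` exactly.

    `current` is a hypothetical override for testing; when omitted, the
    version is read from the installed package metadata.
    """
    if current is None:
        try:
            current = version("vllm")
        except PackageNotFoundError:
            # vLLM is not installed, so no version can match.
            return False
    return current == target
```

A caller could then branch on `vllm_version_is("0.8.4")` to pick the import path that matches the installed vLLM.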
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-18 12:23:32 +08:00
hfadzxy
9935d45728
[CI]Add model basic accuracy test(Qwen2.5-0.5B-Instruct) ( #460 )
...
### What this PR does / why we need it?
Add a basic model accuracy test (Qwen2.5-0.5B-Instruct).
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-04-17 14:59:56 +08:00
Huazhong Ji
c3d1a3782a
Add pyhccl ( #503 )
...
This is the first step toward supporting trl vllm serve on Ascend NPU
(https://github.com/vllm-project/vllm-ascend/issues/459).
This PR works properly only once
https://github.com/vllm-project/vllm/pull/16464 is merged into vLLM.
---------
Signed-off-by: hzji210@gmail.com <hzji210@gmail.com>
2025-04-17 14:57:52 +08:00
wangxiyuan
bbe7ccd366
[MISC] Add patch module ( #526 )
...
This PR adds a patch module for vLLM:
1. platform patch: the patch is registered when the platform is loaded.
2. worker patch: the patch is registered when the worker is started.
The details are:
1. patch_common: patch for both main and the 0.8.4 version
2. patch_main: patch for the main version
3. patch_0_8_4: patch for the 0.8.4 version
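The version split above can be pictured with the following minimal sketch. It is an illustration only: `select_patch_modules` is a hypothetical helper, and the ordering and the exact-match test on `"0.8.4"` are assumptions rather than the PR's actual dispatch logic.

```python
from typing import List


def select_patch_modules(vllm_version: str) -> List[str]:
    """Pick which patch sets apply to a given vLLM version (hypothetical)."""
    # patch_common applies to both main and 0.8.4.
    modules = ["patch_common"]
    if vllm_version == "0.8.4":
        modules.append("patch_0_8_4")
    else:
        # Any other version is treated as the main branch in this sketch.
        modules.append("patch_main")
    return modules
```

Under these assumptions, the same selection would run once when the platform is loaded and once when a worker starts, each time applying its own platform or worker patches.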
2025-04-16 09:28:58 +08:00
wangxiyuan
f6af1d2471
[MISC] fix logger ( #515 )
...
The logger in vllm-ascend doesn't work. This PR fixes the issue.
Fix: https://github.com/vllm-project/vllm-ascend/issues/431
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-15 10:18:05 +08:00
Tony
4c9d78a035
support multistep decode ( #299 )
...
Add multi-step scheduler support for vllm-ascend.
Signed-off-by: new-TonyWang <wangtonyyu222@gmail.com>
2025-03-11 19:20:06 +08:00
whx
8fc5dc966a
[Worker] Register mindie_turbo while initializing NPUWorker ( #13 )
...
Add `try_register_lib` and import mindie-turbo during initialization.
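An optional-import helper of this kind might be sketched as follows. The signature, the boolean return value, and the logging behavior are assumptions made for illustration; only the name `try_register_lib` and the `mindie_turbo` module name come from the commit.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_register_lib(lib_name: str, lib_info: str = "") -> bool:
    """Import an optional library; return False instead of raising on failure."""
    try:
        importlib.import_module(lib_name)
    except Exception:
        # The library is optional, so a missing or broken install is ignored.
        return False
    if lib_info:
        logger.info(lib_info)
    return True
```

A worker could call `try_register_lib("mindie_turbo")` during initialization and continue normally when the library is absent.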
---------
Signed-off-by: hw_whx <wanghexiang7@huawei.com>
Co-authored-by: hw_whx <wanghexiang7@huawei.com>
2025-02-07 16:47:17 +08:00