xc-llm-ascend

Author SHA1 Message Date


luomin2005

f41eeeb11e

Refactor the ops PyTorch adapter and clean up csrc/torch_binding.cpp (#6732)

### What this PR does / why we need it?
Refactor the ops PyTorch adapter and clean up csrc/torch_binding.cpp.
For more details, see
https://github.com/vllm-project/vllm-ascend/issues/6486

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Installed the new package to test the modification.


- vLLM version: v0.15.0
- vLLM main: `9562912cea`

---------

Signed-off-by: liziyu <liziyu16@huawei.com>
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Signed-off-by: luomin2005 <luomin2005@huawei.com>
Co-authored-by: liziyu <56102866+liziyu179@users.noreply.github.com>
Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>

2026-02-24 09:12:43 +08:00

yydyzr

ff3a50d011

[Model] GLM5 adaptation (#6642)

### What this PR does / why we need it?
GLM5 adaptation:
1. Use torch_npu.npu_lightning_indexer for GLM5.
2. Forbid the eagle proposer when fullgraph mode is enabled, because of bugs.
3. Add a quantization config for GLM5.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
By CI.

- vLLM main: `978a37c823`

---------

Signed-off-by: yydyzr <liuyuncong1@huawei.com>
Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
Co-authored-by: shenchuxiaofugui <1311027364@qq.com>

2026-02-11 22:22:22 +08:00

2 Commits