xc-llm-ascend

Author SHA1 Message Date

Author	SHA1	Message	Date
luomin2005	f41eeeb11e	Refactor the ops PyTorch adapter，cleanup for csrc/torch_binding.cpp (#6732 ) ### What this PR does / why we need it? Refactor the ops PyTorch adapter，cleanup for csrc/torch_binding.cpp, more details see https://github.com/vllm-project/vllm-ascend/issues/6486 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? install the new package to test the new modification, here is the result: - vLLM version: v0.15.0 - vLLM main: `9562912cea` --------- Signed-off-by: liziyu <liziyu16@huawei.com> Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com> Signed-off-by: luomin2005 <luomin2005@huawei.com> Co-authored-by: liziyu <56102866+liziyu179@users.noreply.github.com> Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>	2026-02-24 09:12:43 +08:00
Li Wang	c26ad78f86	[CI][lint] Add rule `codespell` back (#6236 ) ### What this PR does / why we need it? After removing codepsell a while, we discovered that typo had a problem correctly recognizing certain misspelled words, so I suggested adding it back. - vLLM version: v0.14.1 - vLLM main: `d68209402d` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2026-01-26 14:12:33 +08:00
Song Mingyang	18b90b501d	[kernel] add AscendC op: lightning_indexer and sparse_flash_attention (#4625 ) ### What this PR does / why we need it? Provide high-performance AscendC operators lightning_indexer and sparse_flash_attention to boost the execution performance of the DeepSeek v3.2 model. Meanwhile, adapt the two AscendC operators to vllm-ascend framework. ### Does this PR introduce _any_ user-facing change? No (only underlying operator optimizations, with no user-facing changes) ### How was this patch tested? - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: MingYang119 <songmingyang@huawei.com>	2025-12-03 09:53:10 +08:00

luomin2005

f41eeeb11e

Refactor the ops PyTorch adapter，cleanup for csrc/torch_binding.cpp (#6732 )

### What this PR does / why we need it?
Refactor the ops PyTorch adapter，cleanup for csrc/torch_binding.cpp,
more details see
https://github.com/vllm-project/vllm-ascend/issues/6486

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
install the new package to test the new modification, here is the
result:


- vLLM version: v0.15.0
- vLLM main:
9562912cea

---------

Signed-off-by: liziyu <liziyu16@huawei.com>
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Signed-off-by: luomin2005 <luomin2005@huawei.com>
Co-authored-by: liziyu <56102866+liziyu179@users.noreply.github.com>
Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>

2026-02-24 09:12:43 +08:00

Li Wang

c26ad78f86

[CI][lint] Add rule codespell back (#6236 )

### What this PR does / why we need it?
After removing codepsell a while, we discovered that typo had a problem
correctly recognizing certain misspelled words, so I suggested adding it
back.

- vLLM version: v0.14.1
- vLLM main:
d68209402d

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2026-01-26 14:12:33 +08:00

Song Mingyang

18b90b501d

[kernel] add AscendC op: lightning_indexer and sparse_flash_attention (#4625 )

### What this PR does / why we need it?
Provide high-performance AscendC operators lightning_indexer and
sparse_flash_attention to boost the execution performance of the
DeepSeek v3.2 model. Meanwhile, adapt the two AscendC operators to
vllm-ascend framework.

### Does this PR introduce _any_ user-facing change?
No (only underlying operator optimizations, with no user-facing changes)

### How was this patch tested?

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: MingYang119 <songmingyang@huawei.com>

2025-12-03 09:53:10 +08:00

3 Commits