xc-llm-ascend

Files

Wan_Danfeng 5cf9ff18e9 [Performance]: Custom AscendC Kernel of Multi-Step Prepare Input (#814)

### What this PR does / why we need it?

- Following https://github.com/vllm-project/vllm-ascend/issues/807,
this PR adds a custom AscendC kernel for multi-step input preparation.
- It also fixes a bug in multi_step_runner.py that occurs when
multi-step is used on the V0 Engine.


### Does this PR introduce _any_ user-facing change?

No user-facing change.


### How was this patch tested?
We added a unit test file and an offline inference example to test the
custom AscendC kernel. See test/ops/test_multi_step.py and
examples/offline_multi_step.py.

---------

Signed-off-by: wan_danfeng <wonderful199082@126.com>

2025-05-20 09:31:30 +08:00

disaggregated_prefill

[CI] add codespell CI and fix format.sh (#827)

2025-05-12 22:04:48 +08:00

dp_offline

[MISC] Clean up torch_npu (#688)

2025-04-29 18:03:38 +08:00

offline_disaggregated_prefill_npu.py

[Feature] Add PD separation feature (#432)

2025-04-15 15:11:35 +08:00

offline_distributed_inference_npu.py

[CI] Add model basic accuracy test (Qwen2.5-0.5B-Instruct) (#460)