xc-llm-ascend/vllm_ascend at 40bd6024856b340dfb0ad80f101d96f670c72991

EngineX/xc-llm-ascend

Jade Zheng 40bd602485 [Feature] Use reshape_and_cache fused op (#706)

Replace the torch function with the reshape_and_cache fused op for better
performance. The `reshape_and_cache` function was not working because it
expected a torch.int32 tensor, but a torch.int64 tensor was provided.

Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

2025-04-28 21:54:42 +08:00

..

[Feature] Use reshape_and_cache fused op (#706)

2025-04-28 21:54:42 +08:00

[BUGFIX] main-sd-bugfix && [UT] add mtp UT (#593)

2025-04-21 19:25:51 +08:00

device_allocator

catch ImportError when C code not compiled (#575)

2025-04-18 18:11:49 +08:00

[BUGFIX] main-sd-bugfix && [UT] add mtp UT (#593)

2025-04-21 19:25:51 +08:00

[Bugfix] fix import error (#600)

2025-04-22 08:57:25 +08:00

[MTP] follow custom deepseek modeling changes to support graph mode (#636)

2025-04-28 21:18:53 +08:00

support aclgraph (#426)

2025-04-23 20:56:24 +08:00

[MTP] follow custom deepseek modeling changes to support graph mode (#636)

2025-04-28 21:18:53 +08:00

support deepseek quant & mix-parallel with graphmode (#585)

2025-04-23 16:23:25 +08:00

[MTP] follow custom deepseek modeling changes to support graph mode (#636)

2025-04-28 21:18:53 +08:00

__init__.py

[Bugfix] Fix triton placeholder patch period (#704)

2025-04-28 18:52:03 +08:00

envs.py

[MISC] Make vllm version configurable (#651)

2025-04-28 14:19:06 +08:00

platform.py

[V1] Make V1 engine backward compatible (#637)

2025-04-24 17:20:11 +08:00

utils.py

[MISC] Make vllm version configurable (#651)

2025-04-28 14:19:06 +08:00