xc-llm-ascend

wangxiyuan f811a24bf0 Remove VLLM_USE_V1 (#4086)

Drop VLLM_USE_V1 usage. This environment variable has already been removed from vLLM.

- vLLM version: v0.11.0
- vLLM main: 83f478bb19

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2025-11-11 15:43:39 +08:00

models

Remove VLLM_USE_V1 (#4086)

2025-11-11 15:43:39 +08:00

ops

[Bugfix] [MoE] fix error in deepseek when using allgather (#3824)

2025-10-29 14:51:39 +08:00

quantization

[Feat][quantization] Support new version w4a8 dynamic quantization for Linear layers (#3311)

2025-10-21 20:18:39 +08:00

__init__.py

[1/4][Refactor] Refactor torchair worker (#1885)

2025-07-21 11:50:46 +08:00

torchair_attention.py

[main] remove dbo code (#3712)

2025-10-25 15:53:01 +08:00

torchair_mla.py

[BugFix] Improve the performance of prefixcache features (#4022)

2025-11-08 18:45:31 +08:00

torchair_model_runner.py

support qwen3-next full_decode_only mode. (#3949)

2025-11-05 08:46:05 +08:00

torchair_mtp_proposer.py

[FEAT] Refactor spec decode to support efficient padded speculation (#3528)

2025-10-30 16:53:05 +08:00

torchair_sfa.py

[main] remove dbo code (#3712)

2025-10-25 15:53:01 +08:00

torchair_worker.py

[CI] Upgrade vllm to newest commit (#3182)

2025-09-26 06:18:15 +08:00

utils.py

[BugFix] deepseek torchair adapt for torch_npu version (#3862)

2025-10-29 22:39:34 +08:00