xc-llm-ascend

Files

yydyzr ff3a50d011 [Model] GLM5 adaptation (#6642)

### What this PR does / why we need it?
GLM5 adaptation
1. Use torch_npu.npu_lightning_indexer for GLM5
2. Forbid the eagle proposer when fullgraph mode is enabled, because of bugs
3. Add quantization config for GLM5
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
By CI.
- vLLM main: 978a37c823

---------

Signed-off-by: yydyzr <liuyuncong1@huawei.com>
Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
Co-authored-by: shenchuxiaofugui <1311027364@qq.com>

2026-02-11 22:22:22 +08:00

methods

[main][Quant] Remove unused rotation functions and parameters from W4A4 LAOS quantization (#6648)

2026-02-11 16:38:45 +08:00

__init__.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)

2026-02-06 14:56:53 +08:00

compressed_tensors_config.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)

2026-02-06 14:56:53 +08:00

method_adapters.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)

2026-02-06 14:56:53 +08:00

modelslim_config.py

[Model] GLM5 adaptation (#6642)

2026-02-11 22:22:22 +08:00