xc-llm-ascend

Files

realliujiaxu 778cb72556 fix bug when rotary_dim is not 128 (#2847)

### What this PR does / why we need it?
`torch_npu.npu_apply_rotary_pos_emb` only supports `head_size` and
`rotary_dim` equal to 128. An error occurs when running GLM.
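
The constraint described above can be illustrated with a small sketch. This is not the code from the PR: `can_use_fused_rope`, `select_rope_path`, and `rope_reference` are hypothetical names, the constant 128 is taken from the description, and the fallback shown is a plain-Python, neox-style rotary embedding (with an assumed base of 10000) that rotates only the first `rotary_dim` entries of one head vector.

```python
import math

# Size the fused kernel is said to support (from the PR description).
FUSED_KERNEL_DIM = 128


def can_use_fused_rope(head_size: int, rotary_dim: int) -> bool:
    """Hypothetical guard: the fused path is valid only for 128/128."""
    return head_size == FUSED_KERNEL_DIM and rotary_dim == FUSED_KERNEL_DIM


def select_rope_path(head_size: int, rotary_dim: int) -> str:
    """Return which (illustrative) code path a caller would take."""
    return "fused" if can_use_fused_rope(head_size, rotary_dim) else "fallback"


def rope_reference(vec, position, rotary_dim, base=10000.0):
    """Neox-style rotary embedding on a single head vector.

    Only the first `rotary_dim` entries are rotated; the rest pass
    through unchanged (partial rotary). `base` is an assumed default.
    """
    half = rotary_dim // 2
    out = list(vec)
    for i in range(half):
        angle = position * base ** (-2.0 * i / rotary_dim)
        cos, sin = math.cos(angle), math.sin(angle)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * cos - x2 * sin
        out[i + half] = x2 * cos + x1 * sin
    return out


# A 128/128 model may take the fused path; anything else falls back.
print(select_rope_path(128, 128))
print(select_rope_path(128, 64))
```

In this sketch the guard sits in front of the kernel call, so a model whose `rotary_dim` differs from 128 never reaches the fused operator and is handled by the generic path instead.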

### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

- vLLM version: main
- vLLM main: 404c85ca72

Signed-off-by: realliujiaxu <realliujiaxu@163.com>

2025-09-12 09:49:36 +08:00

moe

Refactor tensor_parallel and comm_utils (#2814)

2025-09-11 21:26:36 +08:00

__init__.py

[main] flashcomm_v1 optim in Qwen Dense Models (#2802)

2025-09-08 22:52:24 +08:00

activation.py

[main] mlp weight prefetch in Qwen Dense Models (#2816)

2025-09-11 21:20:09 +08:00

attention.py

Disaggregate prefill for kv cache register style (#950)