xc-llm-ascend/ops at 24328aaf005f210a47d32e1bb140b7f93f824fe9

EngineX/xc-llm-ascend

Li Wang 58adf7c8ac [Bugfix] Correctly handle the output shape in multimodal attention (#5443)

### What this PR does / why we need it?
Fixes https://github.com/vllm-project/vllm-ascend/issues/5297. In the
`AscendMMEncoderAttention` forward pass, the output shape should stay
consistent with the input.

- vLLM version: release/v0.13.0
- vLLM main: 81786c8774

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2025-12-27 18:42:46 +08:00

..

Revert "[feat] enable hierarchical mc2 ops on A2 by default (#5300)" (#5434)

2025-12-27 17:06:58 +08:00

rollback causal_conv1d_fn to torch ops & update qwen3Next doc (#5391)

2025-12-26 19:57:38 +08:00

__init__.py

[Fusion] [Graph] Add qknorm rope fusion operator (#4711)

2025-12-17 08:53:44 +08:00

activation.py

[refact] unified soc_version code (#4359)

2025-11-26 14:28:55 +08:00

expert_load_balancer.py

eplb redundant expert bugfix (#4291)

2025-11-21 14:24:35 +08:00

layernorm.py

[Graph][Fusion] Add new pattern for AddRmsnormQuant with SP. (#5077)

2025-12-18 20:25:44 +08:00

linear_op.py

Remove VLLM_ASCEND_ENABLE_DENSE_OPTIMIZE (#5272)

2025-12-25 11:09:56 +08:00

linear.py

[bugfix] fix Error 'ValueError: Duplicate layer name' (#5280)

2025-12-25 10:43:24 +08:00

mla.py

upgrade vLLM to main (#4608)

2025-12-02 22:10:52 +08:00

mm_encoder_attention.py

[Bugfix] Correctly handle the output shape in multimodal attention (#5443)

2025-12-27 18:42:46 +08:00

register_custom_ops.py

[refactor] Remove unnecessary attributes from set_ascend_forward_context (#5204)

2025-12-23 08:49:52 +08:00

rotary_embedding.py

[CustomOp] Register AscendApplyRotaryEmb CustomOp and remove related patch (#4667)

2025-12-23 10:04:37 +08:00

shared_weight_layer.py

[Feat] Flashcomm2 use o_shared linear (#4188)

2025-12-11 12:43:04 +08:00

vocab_parallel_embedding.py

[Feat] Add custom Embedding tensor model parallel (#2616)

2025-12-12 14:41:20 +08:00

weight_prefetch.py

Update torch-npu version to 2.7.1 (#3896)

2025-10-31 17:16:31 +08:00