xc-llm-ascend

Files

xuyexiong ae758dda05 [Bugfix] Fix mtp torchair in pd Disaggregation scenario (#2951)

### What this PR does / why we need it?
1. Following up on #2509, fix MTP with torchair in the PD disaggregation scenario.
2. Fix an MLA bug in the speculative decoding scenario, where `num_decodes != num_decode_tokens`.
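
The mismatch in item 2 can be illustrated with a minimal sketch. It assumes each decode request carries one regular token plus a fixed number of draft tokens; the helper `count_decode_tokens` and the parameter `num_spec_tokens` are hypothetical names for illustration, not taken from `mla_v1.py`.

```python
def count_decode_tokens(num_decodes: int, num_spec_tokens: int) -> int:
    """Return the total decode tokens in a batch.

    Assumption: every decode request contributes one regular token
    plus `num_spec_tokens` draft tokens to be verified.
    """
    return num_decodes * (1 + num_spec_tokens)


num_decodes = 4

# Without speculative decoding the two counts coincide.
assert count_decode_tokens(num_decodes, num_spec_tokens=0) == num_decodes

# With one draft token per request they diverge, so code that sizes
# per-token buffers by num_decodes would be too small.
assert count_decode_tokens(num_decodes, num_spec_tokens=1) != num_decodes
```

Under this assumption, anything indexed per token needs to be sized by `num_decode_tokens`, while anything indexed per request stays sized by `num_decodes`.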


### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main: 5206ab20ba

Signed-off-by: xuyexiong <xuyexiong@huawei.com>

2025-09-17 09:07:58 +08:00

__init__.py

[Core] Make V1 work and enable V1 engine test (#389)

2025-03-28 19:34:23 +08:00

attention_mask.py

[main][bugfix] Fix bugs and refactor cached mask generation logic (#2442)

2025-08-27 12:07:29 +08:00

attention_v1.py

[New model] Qwen3-next support (#2917)

2025-09-16 01:17:42 +08:00

mla_v1.py

[Bugfix] Fix mtp torchair in pd Disaggregation scenario (#2951)

2025-09-17 09:07:58 +08:00

utils.py

[New model] Qwen3-next support (#2917)

2025-09-16 01:17:42 +08:00