xc-llm-ascend

Files

Nagisa125 6764777f00 [Bugfix] Fix MTP support for lmhead_tensor_parallel_size (#3915)

### What this PR does / why we need it?
Fix an inference hang that occurs when MTP is enabled and
lmhead_tensor_parallel_size is set to 16.

- vLLM version: v0.11.0
- vLLM main: 83f478bb19

Signed-off-by: wyh145 <1987244901@qq.com>

2025-10-31 10:30:28 +08:00

__init__.py

[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)

2025-07-19 09:42:32 +08:00

block_table.py

[Bugfix] Fix zero attention output in qwen3-next (#3572)

2025-10-25 09:47:03 +08:00

model_runner_v1.py

[Bugfix] Fix MTP support for lmhead_tensor_parallel_size (#3915)

2025-10-31 10:30:28 +08:00

npu_input_batch.py

[feature] Prompt Embeddings Support for v1 Engine (#3026)

2025-10-30 17:15:57 +08:00

worker_v1.py

Upgrade to new vllm commit (#3719)

2025-10-25 15:36:32 +08:00