xc-llm-ascend/vllm_ascend at b390e0ef78afb31d5ee26fdd26e4f33e89d645b3

EngineX/xc-llm-ascend

Jingchun Gao b390e0ef78 [Bugfix] Fix PP+PCP and PP+flashcomm1 bugs (#5416)

- Fixed the computation of the final hidden_states when pipeline parallelism (PP)
and prefill context parallelism (PCP) are enabled at the same time. hidden_states
are required, and have the right tensor type, only on the last PP rank.
- Fixed the shape of intermediate_tensors in the dummy_run when pipeline
parallelism and flashcomm1 are both enabled. The intermediate_tensors should be
divided by tp_size; otherwise the MoE layers run into problems.
- Fixed the shape of self.intermediate_tensors so that it leaves sufficient
space for slicing.
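
The first two fixes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the function names are hypothetical, and the ceiling division is an assumption about how token counts that do not divide evenly by tp_size are handled.

```python
def is_last_pp_rank(pp_rank: int, pp_size: int) -> bool:
    """Only the last pipeline-parallel rank produces the final hidden_states."""
    return pp_rank == pp_size - 1


def intermediate_tensor_rows(num_tokens: int, tp_size: int, flashcomm1: bool) -> int:
    """Hypothetical helper: rows to allocate for intermediate_tensors on one rank.

    With flashcomm1 enabled, the token dimension is assumed to be split
    across tp_size ranks, so each rank holds roughly num_tokens / tp_size rows.
    """
    if tp_size <= 0:
        raise ValueError("tp_size must be positive")
    if not flashcomm1:
        return num_tokens
    # Assumption: round up so a non-divisible token count still fits.
    return (num_tokens + tp_size - 1) // tp_size
```

For example, with 10 tokens and tp_size of 4, this sketch allocates 3 rows per rank when flashcomm1 is on and 10 rows when it is off.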

- vLLM version: release/v0.13.0
- vLLM main: 81786c8774

---------

Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>

2026-01-26 16:53:07 +08:00

[Doc] 310P Documents update (#6246)

2026-01-26 14:33:21 +08:00

_cann_ops_custom

[Kernel] add custom op GmmSwigluQuantWeightNzTensorList (#3804)

2025-11-28 18:06:39 +08:00

Reapply "[Refactor] Unify full-graph parameter update logic (#6041)" (#6227) (#6231)

2026-01-26 09:04:54 +08:00

Reapply "[Refactor] Unify full-graph parameter update logic (#6041)" (#6227) (#6231)

2026-01-26 09:04:54 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #3) (#5978)

2026-01-24 22:10:18 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #2) (#5977)

2026-01-19 08:59:46 +08:00

device_allocator

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #2) (#5977)

2026-01-19 08:59:46 +08:00

[Feature] Mooncake connector get remote ptp size (#5822)

2026-01-26 14:28:33 +08:00

[EPLB][Bugfix] EPLB support fp/bf16 (#5531)

2026-01-26 14:28:16 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #5) (#5996)

2026-01-24 22:45:38 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #5) (#5996)

2026-01-24 22:45:38 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #6) (#6001)

2026-01-24 22:08:33 +08:00

[EPLB][Bugfix] EPLB support fp/bf16 (#5531)

2026-01-26 14:28:16 +08:00

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #6) (#6001)

2026-01-24 22:08:33 +08:00

[Refactor] Quantization Module Refactor (#5738)

2026-01-23 14:13:47 +08:00

[ops] support advanced apply_top_k_top_p without top_k constraint (#6098)

2026-01-26 09:08:42 +08:00

Reapply "[Refactor] Unify full-graph parameter update logic (#6041)" (#6227) (#6231)

2026-01-26 09:04:54 +08:00

[Bugfix] Fix PP+PCP and PP+flashcomm1 bugs (#5416)

2026-01-26 16:53:07 +08:00

[CI] optimize lint term (#5986)

2026-01-22 15:46:59 +08:00

__init__.py

[Lint]Style: Convert vllm-ascend/compilation to ruff format (#5912)

2026-01-16 20:57:46 +08:00

ascend_config.py

[Fusion] change fusion env variable (#6201)

2026-01-24 22:49:33 +08:00

ascend_forward_context.py

[Bugfix] Avoided a bug of drafter when dp and sp are enabled (#6226)

2026-01-25 17:45:29 +08:00

batch_invariant.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #2) (#5977)

2026-01-19 08:59:46 +08:00

cpu_binding.py

[Lint]Style: Convert vllm-ascend/compilation to ruff format (#5912)

2026-01-16 20:57:46 +08:00

envs.py

Default enable MLAPO (#5952)

2026-01-22 09:26:39 +08:00

flash_common3_context.py

[Lint]Style: Convert vllm-ascend/compilation to ruff format (#5912)

2026-01-16 20:57:46 +08:00

meta_registration.py

[Lint]Style: Convert vllm-ascend/compilation to ruff format (#5912)

2026-01-16 20:57:46 +08:00

platform.py

[Bugfix] Add defensive check for multimodal_config (#6230)

2026-01-25 17:39:19 +08:00

profiling_config.py

[Lint]Style: Convert vllm-ascend/compilation to ruff format (#5912)

2026-01-16 20:57:46 +08:00

utils.py

[310P]: refactoring for 310p kvcache and some ops class (#6117)

2026-01-24 20:34:29 +08:00