xc-llm-ascend/vllm_ascend at d9ee491f7083075f44c2ac750ee42639636f005b

EngineX/xc-llm-ascend

Angazenn d9ee491f70 [BugFix]Move to_list in foward_v1 with FIA earlier to build (#3185 )

### What this PR does / why we need it?
The current FIA implementation calls `to_list` on `actual_seq_lengths_q`
and `seq_lens` inside `forward_v1`, which costs extra time. These
conversions can instead be done earlier, in the `build` step of the
attention metadata.
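
The idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the PR: `FakeTensor`, `AttentionMetadata`, `build`, `forward_layer`, and `run_model` are hypothetical stand-ins, and the sketch assumes that `build` runs once per step while the forward function runs once per attention layer.

```python
from dataclasses import dataclass


class FakeTensor:
    """Hypothetical stand-in for a device tensor; counts host conversions."""

    conversions = 0  # total number of tolist() calls across all instances

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        # On real hardware this step would copy data from device to host.
        FakeTensor.conversions += 1
        return list(self._values)


@dataclass
class AttentionMetadata:
    """Hypothetical metadata object that carries pre-converted lists."""

    seq_lens: FakeTensor
    actual_seq_lengths_q: FakeTensor
    seq_lens_list: list
    actual_seq_lengths_q_list: list


def build(seq_lens, actual_seq_lengths_q):
    # Convert once, while the metadata is being built.
    return AttentionMetadata(
        seq_lens=seq_lens,
        actual_seq_lengths_q=actual_seq_lengths_q,
        seq_lens_list=seq_lens.tolist(),
        actual_seq_lengths_q_list=actual_seq_lengths_q.tolist(),
    )


def forward_layer(metadata):
    # Reuse the cached Python lists; no conversion happens here.
    return sum(metadata.seq_lens_list), metadata.actual_seq_lengths_q_list[-1]


def run_model(metadata, num_layers):
    # Assumed call pattern: one forward call per attention layer.
    return [forward_layer(metadata) for _ in range(num_layers)]


metadata = build(FakeTensor([4, 7, 2]), FakeTensor([1, 2, 3]))
outputs = run_model(metadata, num_layers=32)
```

Under these assumptions the conversion count stays at two regardless of the number of layers, whereas converting inside the forward function would scale with the layer count.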

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Angazenn <supperccell@163.com>

2025-10-17 11:19:41 +08:00

[BugFix]Move to_list in foward_v1 with FIA earlier to build (#3185 )

2025-10-17 11:19:41 +08:00

[Feat]Qwen3 Moe supports npu_add_rms_norm_quant op by default, update op with bias, resolve conflict with weight prefetch (#3465 )

2025-10-17 09:30:51 +08:00

[BugFix] Fix ascend scheduler assert error (#3191 )

2025-09-28 18:22:08 +08:00

device_allocator

[Misc]Clean up useless import from vllm (#2049 )

2025-07-28 16:01:59 +08:00

[BugFix]GPQA Accuracy Issue Bugfix (#3476 )

2025-10-15 23:28:17 +08:00

[EPLB]Record expert map without dynamic eplb. (#3409 )

2025-10-15 14:21:15 +08:00

[Bugfix][LoRA] Fix forward error and shape mismatch when using LoRA (#3153 )

2025-09-28 17:30:50 +08:00

[BugFix] fix qwenVL quant assertion error (#3466 )

2025-10-16 17:08:00 +08:00

[Quickfix] update CachedRequestState as NewRequestData changed (#2367 )

2025-08-15 07:35:27 +08:00

[Feat]Qwen3 Moe supports npu_add_rms_norm_quant op by default, update op with bias, resolve conflict with weight prefetch (#3465 )

2025-10-17 09:30:51 +08:00

[Refactor] Adapt deepseek-v3.2 to vllm 0.11.0 (#3432 )

2025-10-15 17:48:58 +08:00

[BugFix] fix qwenVL quant assertion error (#3466 )

2025-10-16 17:08:00 +08:00

Drop 0.10.2 (#3284 )

2025-10-09 10:28:38 +08:00

[Refactor] Adapt deepseek-v3.2 to vllm 0.11.0 (#3432 )

2025-10-15 17:48:58 +08:00

Revert "[BUGFIX] Mtp torchair pd fix (#3449 )" (#3500 )

2025-10-17 09:42:48 +08:00

[Fix] Clears unused slot mappings and fix accuracy issue with MLA models when enabling FULL_DECODE_ONLY (#3482 )

2025-10-16 19:43:09 +08:00

__init__.py

[Refactor] Adapt deepseek-v3.2 to vllm 0.11.0 (#3432 )

2025-10-15 17:48:58 +08:00

ascend_config.py

[Refactor] Adapt deepseek-v3.2 to vllm 0.11.0 (#3432 )

2025-10-15 17:48:58 +08:00

ascend_forward_context.py

[Feat]Qwen3 Moe supports npu_add_rms_norm_quant op by default, update op with bias, resolve conflict with weight prefetch (#3465 )

2025-10-17 09:30:51 +08:00

envs.py

[Feat] Flash comm allgher ep (#3334 )

2025-10-15 19:36:32 +08:00

meta_registration.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786 )

2025-09-13 11:58:52 +08:00

platform.py

[Feat] Flash comm allgher ep (#3334 )

2025-10-15 19:36:32 +08:00

utils.py

[Feat]Qwen3 Moe supports npu_add_rms_norm_quant op by default, update op with bias, resolve conflict with weight prefetch (#3465 )

2025-10-17 09:30:51 +08:00