xc-llm-ascend/vllm_ascend at a7b40b09ebed0ec5c771a3883a6c9526a1bffac8

EngineX/xc-llm-ascend

Wang Yixuan a7b40b09eb [BugFix]fix deepseek torchair recompile (#3678)

### What this PR does / why we need it?
PR #3624 fixed the precision of deepseek torchair but did not account for
a limitation of torch compile, which caused recompilation. This PR
fixes that problem.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: hust17yixuan <303660421@qq.com>

2025-10-23 22:53:01 +08:00

[Fix] Fixes attribute error in MLA implementation (#3618)

2025-10-23 09:12:50 +08:00

[Feat]Make full graph mode compatible with MTP (#3276)

2025-10-17 20:19:56 +08:00

[Feat] Dynamic Batch Feature (#3490)

2025-10-22 14:13:32 +08:00

device_allocator

[Misc]Clean up useless import from vllm (#2049)

2025-07-28 16:01:59 +08:00

[Bugfix] fix delay free prefill req & D node support prefix cache (#3607)

2025-10-23 20:39:14 +08:00

[CI]Add EPLB CI. (#3568)

2025-10-21 22:58:02 +08:00

[Feat] add native kvcache offload (#3433)

2025-10-22 14:15:49 +08:00

[Bugfix][LoRA] Fix forward error and shape mismatch when using LoRA (#3153)

2025-09-28 17:30:50 +08:00

[Misc] Add a model loader that utilizes HCCL for weight loading (#2888)

2025-10-23 15:56:07 +08:00

Revert "[Feat] Shared expert dp for deepseek and deepseek_mtp (#3495)" (#3586)

2025-10-21 22:24:30 +08:00

[Quickfix] update CachedRequestState as NewRequestData changed (#2367)

2025-08-15 07:35:27 +08:00

[main][refactor] refactor SequenceRowParallelOp forward (#3616)

2025-10-23 14:41:15 +08:00

[BugFix][main] Fix quantization related mtp bug with patch (#3620)

2025-10-23 09:54:31 +08:00

[main][bugfix] Add 'layer_type' param to get_pergroup_param() for compatibility (#3682)

2025-10-23 21:26:33 +08:00

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00

unify logic between aclgraph and torchair (#3560)

2025-10-22 21:52:57 +08:00

[BugFix]fix deepseek torchair recompile (#3678)

2025-10-23 22:53:01 +08:00

[Structured Output] Replace apply_grammar_bitmask() method with that in vllm to avoid maintenance (#2524)

2025-10-23 17:26:27 +08:00

__init__.py

[Misc] Add a model loader that utilizes HCCL for weight loading (#2888)

2025-10-23 15:56:07 +08:00

ascend_config.py

[Feat] Dynamic Batch Feature (#3490)

2025-10-22 14:13:32 +08:00

ascend_forward_context.py

[Bugfix] fix logging and d2h bug for flash comm1 (#3505)

2025-10-17 21:13:41 +08:00

cpu_binding.py

[main] support cpu binding (#3546)

2025-10-21 09:17:03 +08:00

envs.py

[Feat] Flash comm allgher ep (#3334)

2025-10-15 19:36:32 +08:00

meta_registration.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786)

2025-09-13 11:58:52 +08:00

platform.py

[Feat] Dynamic Batch Feature (#3490)

2025-10-22 14:13:32 +08:00

utils.py

[Bugfix][MTP] Fix performance degradation when mtp>1 (#3597)

2025-10-22 22:04:43 +08:00