xc-llm-ascend/vllm_ascend at e04a5e3dd3d41a2f0757ae0eb3b6ba17cee965c5

EngineX/xc-llm-ascend

Jade Zheng e04a5e3dd3 [Bugfix] Fix race condition in d2h transfer (#3372 )

### What this PR does / why we need it?

Non-blocking device-to-host (d2h) transfers can corrupt data in later
steps. The CPU tensor is read right after the transfer is launched, before
the copy is guaranteed to have completed, so it may still hold stale or
partially written values. This problem was observed in the A3 environment
during `profile_run`.
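
The race described above can be illustrated with a small, standard-library-only sketch. It is an analogy, not the code changed by this PR: `AsyncCopy`, `read_without_sync`, and `read_with_sync` are hypothetical names, and a worker thread with an artificial delay stands in for the asynchronous d2h copy. In PyTorch terms, the comparable pattern is a `tensor.to("cpu", non_blocking=True)` call whose result is read before the device or stream has been synchronized.

```python
import threading
import time


class AsyncCopy:
    """Toy stand-in for a non-blocking device-to-host copy (hypothetical helper)."""

    def __init__(self, device_data, delay_s=0.05):
        # Pre-allocated host buffer, playing the role of the CPU tensor.
        self.host_buffer = [0] * len(device_data)
        self._done = threading.Event()
        worker = threading.Thread(
            target=self._run, args=(list(device_data), delay_s), daemon=True
        )
        worker.start()  # returns immediately, like a non-blocking transfer

    def _run(self, device_data, delay_s):
        time.sleep(delay_s)  # simulated transfer latency
        self.host_buffer[:] = device_data
        self._done.set()

    def synchronize(self):
        """Block until the copy has landed in host_buffer."""
        self._done.wait()


def read_without_sync(device_data):
    copy = AsyncCopy(device_data)
    # Racy read: the copy is most likely still in flight, so this snapshot
    # may contain the buffer's old contents instead of device_data.
    snapshot = list(copy.host_buffer)
    copy.synchronize()  # drain the worker so the demo exits cleanly
    return snapshot


def read_with_sync(device_data):
    copy = AsyncCopy(device_data)
    copy.synchronize()  # wait for completion first, then read
    return list(copy.host_buffer)


if __name__ == "__main__":
    print("without sync:", read_without_sync([1, 2, 3]))
    print("with sync:   ", read_with_sync([1, 2, 3]))
```

The only difference between the two readers is whether the host buffer is touched before or after the wait, which is the ordering problem the commit message describes. Making the transfer blocking removes the window entirely, at the cost of stalling the host until the copy finishes.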

### How was this patch tested?
CI passed.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

2025-10-20 18:24:21 +08:00

[Model][1/N] Delete deepseek v2/v3 modeling codes. (#3189 )

2025-10-20 15:31:34 +08:00

[Feat]Make full graph mode compatible with MTP (#3276 )

2025-10-17 20:19:56 +08:00

[Bugfix] Route requests requiring KVC recomputation from the decode instance to the P instance (#3448 )

2025-10-18 15:56:44 +08:00

device_allocator

[Misc]Clean up useless import from vllm (#2049 )

2025-07-28 16:01:59 +08:00

bugfix for mooncake (#3535 )

2025-10-19 17:06:05 +08:00

[BugFix]Support redundant experts in EPLB (#3473 )

2025-10-18 00:09:16 +08:00

[Bugfix][LoRA] Fix forward error and shape mismatch when using LoRA (#3153 )

2025-09-28 17:30:50 +08:00

[Model][1/N] Delete deepseek v2/v3 modeling codes. (#3189 )

2025-10-20 15:31:34 +08:00

[Quickfix] update CachedRequestState as NewRequestData changed (#2367 )

2025-08-15 07:35:27 +08:00

[Model][1/N] Delete deepseek v2/v3 modeling codes. (#3189 )

2025-10-20 15:31:34 +08:00

[main][bugfix] bugfix for minicpm models (#3527 )

2025-10-19 11:00:55 +08:00

[Model][1/N] Delete deepseek v2/v3 modeling codes. (#3189 )

2025-10-20 15:31:34 +08:00

Drop 0.10.2 (#3284 )

2025-10-09 10:28:38 +08:00

[Feat]mtp aclgraph support (#3244 )

2025-10-17 18:14:49 +08:00

[Bugfix] Fix race condition in d2h transfer (#3372 )

2025-10-20 18:24:21 +08:00

[BugFix][HybridKV] Update the check logic of reinitializing inputbatch (#3540 )

2025-10-20 15:29:48 +08:00

__init__.py

[Refactor] Adapt deepseek-v3.2 to vllm 0.11.0 (#3432 )

2025-10-15 17:48:58 +08:00

ascend_config.py

[Bugfix] Route requests requiring KVC recomputation from the decode instance to the P instance (#3448 )

2025-10-18 15:56:44 +08:00

ascend_forward_context.py

[Bugfix] fix logging and d2h bug for flash comm1 (#3505 )

2025-10-17 21:13:41 +08:00

envs.py

[Feat] Flash comm allgather ep (#3334 )

2025-10-15 19:36:32 +08:00

meta_registration.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786 )

2025-09-13 11:58:52 +08:00

platform.py

[Bugfix] Route requests requiring KVC recomputation from the decode instance to the P instance (#3448 )

2025-10-18 15:56:44 +08:00

utils.py

Add mrope op fusion (#3509 )

2025-10-18 18:08:24 +08:00