### What this PR does / why we need it?
The speculative inference acceptance rate decreased after vLLM was upgraded to v0.15.0. This PR resolves that issue.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit tests and test cases.
- vLLM version: v0.15.0
- vLLM main:
d7e17aaacd
---------
Signed-off-by: lilinsiman <lilinsiman@gmail.com>