[Refactor]Refactor sampler (#2050)

Refactor Sampler implementation from patch way to inherit from vLLM
Sampler interface.

Next step: Make the op `TopKTopPSampler` in vLLM support custom ops
register mechanism

- vLLM version: v0.10.0
- vLLM main:
61a6905ab0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
This commit is contained in:
wangxiyuan
2025-07-30 08:47:22 +08:00
committed by GitHub
parent b6a7f07c70
commit 9b67c87b14
8 changed files with 108 additions and 150 deletions

View File

@@ -88,21 +88,7 @@
# Future Plan:
# Remove this patch once pytorch 2.7.0 is supported for vllm ascend.
#
# ** File: worker/patch_common/patch_sampler.py **
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 1. `vllm.v1.sample.sampler.Sampler.apply_top_k_top_p`
# Why:
# We need to use the patched `apply_top_k_top_p` in `sample`.
# The mainly reason to overwrite `apply_top_k_top_p` is
# to improve performance.
# How
# Re-implementation the `apply_top_k_top_p` function by pytorch
# Related PR (if no, explain why):
# - https://github.com/vllm-project/vllm-ascend/pull/1732
# Future Plan:
# Revert it when the ascend scatter performance improves.
#
# ** File: worker/patch_common/patch_sampler.py **
# ** File: worker/patch_0_10_0/patch_sampler_gather_logprobs.py **
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 1. `vllm.v1.sample.sampler.Sampler.gather_logprobs`
# Why: