[main][feature] Support quarot for eagle3 without embedding (#7038)

### What this PR does / why we need it?
If an `eagle3` draft model without `embed_tokens` is used with a `quarot`
target model, the acceptance rate drops.
This PR fixes that.
The related vLLM PR is https://github.com/vllm-project/vllm/pull/36225.
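
A rough illustration of why the acceptance rate can drop, assuming the draft model reuses the target's embedding table and that `quarot` has folded an orthogonal rotation into that table. The matrix `Q`, the vector `embedding`, and `draft_weight` are hypothetical values chosen for the sketch. It is not the code in this PR and only shows the mismatch and one way to undo it:

```python
# Normalized 4x4 Hadamard matrix (assumed rotation): orthogonal, so Q @ Q^T = I.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]
Q = [[v / 2.0 for v in row] for row in H4]


def matvec(m, x):
    """Return m @ x for a list-of-lists matrix and a list vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical embedding of one token, as the draft model saw it during training.
embedding = [0.5, -1.0, 2.0, 0.25]

# Hypothetical first projection of the draft model, trained on unrotated inputs.
draft_weight = [
    [0.1, 0.2, -0.3, 0.4],
    [-0.5, 0.6, 0.7, -0.8],
]

# The rotated target stores the embedding in the rotated basis (Q^T applied).
rotated_embedding = matvec(transpose(Q), embedding)

# What the draft model expects to compute.
expected = matvec(draft_weight, embedding)

# What it computes if it consumes the rotated embedding directly.
mismatched = matvec(draft_weight, rotated_embedding)

# Applying Q undoes the rotation, because Q @ Q^T = I.
restored = matvec(draft_weight, matvec(Q, rotated_embedding))

print("mismatch error:", max_abs_diff(mismatched, expected))
print("restored error:", max_abs_diff(restored, expected))
```

In this sketch the draft's first projection produces different activations on the rotated input, which would plausibly lower the number of accepted draft tokens. Rotating the input back (or, equivalently, folding `Q` into the draft weights) recovers the expected activations.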

- vLLM main: 4034c3d32e

Signed-off-by: drslark <slarksblood@qq.com>
drslark
2026-03-09 10:43:06 +08:00
committed by GitHub
parent 737dfcf638
commit 6a7115fa0d
5 changed files with 148 additions and 82 deletions

@@ -35,4 +35,4 @@ import vllm_ascend.patch.worker.patch_huanyuan_vl # noqa
import vllm_ascend.patch.worker.patch_routed_experts_capturer # noqa
import vllm_ascend.patch.worker.patch_npugraph_ex_triton # noqa
import vllm_ascend.patch.worker.patch_kimi_k25 # noqa
-import vllm_ascend.patch.worker.patch_qwen3_quarot # noqa
+import vllm_ascend.patch.worker.patch_draft_quarot # noqa