lilinsiman 52863c4165 [Refactor][EAGLE] 2/N: load model and generate token (#5437)

### What this PR does / why we need it?
1. Refactor the eagle and mtp functions `load_model` and `generate_token_ids`
2. Remove redundant code in the mtp and eagle files
3. Refactor the UT of these files

Part 2/N of the refactor and merge of mtp and eagle.
Related RFC: https://github.com/vllm-project/vllm-ascend/issues/5467

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
UT and tests

- vLLM version: release/v0.13.0
- vLLM main: 81786c8774

---------

Signed-off-by: lilinsiman <lilinsiman@gmail.com>

2026-01-05 14:07:54 +08:00

# Feature Guide

This section provides a detailed usage guide of vLLM Ascend features.

:::{toctree}
:caption: Feature Guide
:maxdepth: 1
graph_mode
quantization
sleep_mode
structured_output
lora
eplb_swift_balancer
netloader
Multi_Token_Prediction
dynamic_batch
kv_pool
external_dp
large_scale_ep
ucm_deployment
Fine_grained_TP
speculative_decoding
context_parallel
:::
