[Refactor][EAGLE] 2/N: load model and generate token (#5437)
### What this PR does / why we need it?
1. Refactor the EAGLE and MTP functions `load_model` and `generate_token_ids`.
2. Remove redundant code in the MTP and EAGLE files.
3. Refactor the unit tests (UT) for these files.
This is part 2/N of the effort to refactor and merge MTP and EAGLE.
Related RFC: https://github.com/vllm-project/vllm-ascend/issues/5467
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit tests (UT) and other tests.
- vLLM version: release/v0.13.0
- vLLM main: 81786c8774
---------
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
@@ -12,6 +12,7 @@ structured_output
lora
eplb_swift_balancer
netloader
Multi_Token_Prediction
dynamic_batch
kv_pool
external_dp