### What this PR does / why we need it?
1. Refactor the eagle and mtp functions `load_model` and `generate_token_ids`
2. Remove redundant code in the mtp and eagle files
3. Refactor the corresponding unit tests (UT)
2/N of Refactor and merge mtp and eagle
Related RFC: https://github.com/vllm-project/vllm-ascend/issues/5467
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
UT and tests
- vLLM version: release/v0.13.0
- vLLM main: 81786c8774
---------
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
# Feature Guide

This section provides an overview of the features implemented in vLLM Ascend. Developers can refer to this guide to understand how vLLM Ascend works.

:::{toctree}
:caption: Feature Guide
:maxdepth: 1
patch
ModelRunner_prepare_inputs
disaggregated_prefill
eplb_swift_balancer.md
ACL_Graph
KV_Cache_Pool_Guide
add_custom_aclnn_op
context_parallel
quantization
:::