### What this PR does / why we need it?
This PR makes the following modifications:
1.delete the `user_guide/feature_guide/quantization-llm-compressor.md`
and merge it into `user_guide/feature_guide/quantization.md`.
2.update the content of `user_guide/feature_guide/quantization.md`.
3.add guidance `developer_guide/feature_guide/quantization.md' on the
adaptation of quantization algorithms and quantized models.
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main:
7157596103
---------
Signed-off-by: IncSec <1790766300@qq.com>
Signed-off-by: InSec <1790766300@qq.com>
403 B
403 B
Feature Guide
This section provides an overview of the features implemented in vLLM Ascend. Developers can refer to this guide to understand how vLLM Ascend works.
:::{toctree} :caption: Feature Guide :maxdepth: 1 patch ModelRunner_prepare_inputs disaggregated_prefill eplb_swift_balancer.md Multi_Token_Prediction ACL_Graph KV_Cache_Pool_Guide add_custom_aclnn_op context_parallel quantization :::