[Doc][Misc] Correcting the document and uploading the model deployment template (#8287)

### What this PR does / why we need it?

Correcting the document and uploading the model deployment template

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
2026-04-15 16:03:11 +08:00
parent 147b589f62
commit 95726d20eb
31 changed files with 536 additions and 308 deletions
--- a/docs/source/developer_guide/Design_Documents/quantization.md
+++ b/docs/source/developer_guide/Design_Documents/quantization.md
@@ -54,7 +54,7 @@ Based on the above content, we present a brief description of the adaptation pro
 - **Step 2: Registration**. Use the `@register_scheme` decorator in `vllm_ascend/quantization/methods/registry.py` to register your quantization scheme class.

 ```python
-from vllm_ascend.quantization.methods import register_scheme, AscendLinearScheme
+from vllm_ascend.quantization.methods import register_scheme, AscendLinearScheme, AscendMoEScheme

 @register_scheme("W4A8_DYNAMIC", "linear")
 class AscendW4A8DynamicLinearMethod(AscendLinearScheme):