zzhxxx 17f2eead99 [Doc] Add the user_guide doc file regarding fine-grained TP. (#5084)

### What this PR does / why we need it?
Add a user guide for the **Fine-Grained Tensor Parallelism** feature.
It documents usage, supported components (`embedding`, `lm_head`, `o_proj`,
`mlp`/`dense_ffn`), model compatibility, and deployment guidelines.

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: zzhx1 <zzh_201018@outlook.com>
Signed-off-by: chenxiao <Jaychou1620@Gmail.com>
Signed-off-by: 秋刀鱼 <jaychou1620@Gmail.com>
Co-authored-by: chenxiao <Jaychou1620@Gmail.com>
Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com>

2025-12-19 16:37:25 +08:00

Feature Guide

This section provides detailed usage guides for vLLM Ascend features.

:::{toctree}
:caption: Feature Guide
:maxdepth: 1
graph_mode
quantization
quantization-llm-compressor
sleep_mode
structured_output
lora
eplb_swift_balancer
netloader
dynamic_batch
kv_pool
external_dp
large_scale_ep
ucm_deployment
Fine_grained_TP
speculative_decoding
:::
