[Doc] Add reinstall instructions doc (#1303)
Add a new FAQ: if users reinstall vllm-ascend with pip, the `build` folder should be removed first.
---------
Signed-off-by: rjg-lyh <1318825571@qq.com>
Signed-off-by: weiguihua <weiguihua2@huawei.com>
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
@@ -114,7 +114,7 @@ In scenarios where NPUs have limited HBM (High Bandwidth Memory) capacity, dynam
- **Configure `PYTORCH_NPU_ALLOC_CONF`**: Set this environment variable to optimize NPU memory management. For example, `export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True` enables the virtual memory feature, which mitigates memory fragmentation caused by frequent dynamic memory size adjustments at runtime. For more details, see [PYTORCH_NPU_ALLOC_CONF](https://www.hiascend.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html).
### 15. Failed to enable NPU graph mode when running DeepSeek?
### 16. Failed to enable NPU graph mode when running DeepSeek?
You may encounter the following error when running DeepSeek with NPU graph mode enabled. When both MLA and graph mode are enabled, the number of queries per KV head must be one of {32, 64, 128}. **DeepSeek-V2-Lite is therefore not supported**, as it has only 16 attention heads. NPU graph mode support for DeepSeek-V2-Lite is planned for the future.

If you're using DeepSeek-V3 or DeepSeek-R1, make sure that after the tensor parallel split, `num_heads / num_kv_heads` is in {32, 64, 128}.
@@ -123,3 +123,6 @@ And if you're using DeepSeek-V3 or DeepSeek-R1, please make sure after the tenso
[rank0]: RuntimeError: EZ9999: Inner Error!
[rank0]: EZ9999: [PID: 62938] 2025-05-27-06:52:12.455.807 numHeads / numKvHeads = 8, MLA only support {32, 64, 128}.[FUNC:CheckMlaAttrs][FILE:incre_flash_attention_tiling_check.cc][LINE:1218]
```
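The head-ratio constraint can be checked before launching. The following is a minimal sketch, not part of vllm-ascend: the function names are hypothetical, and it assumes that attention heads are split evenly across tensor-parallel ranks and that KV heads are replicated when there are fewer of them than ranks. Only the allowed set {32, 64, 128} comes from the text and the error log.

```python
# Allowed per-rank ratios, as quoted in the FAQ text and the error log.
ALLOWED_RATIOS = {32, 64, 128}


def heads_ratio_after_tp(num_heads: int, num_kv_heads: int, tp_size: int) -> int:
    """Return num_heads / num_kv_heads on one rank after the tensor parallel split.

    Assumption (not confirmed by the source): query heads are divided evenly
    across ranks, and each rank keeps at least one KV head.
    """
    if tp_size <= 0 or num_heads % tp_size != 0:
        raise ValueError("num_heads must be divisible by a positive tp_size")
    heads_per_rank = num_heads // tp_size
    kv_heads_per_rank = max(1, num_kv_heads // tp_size)
    if heads_per_rank % kv_heads_per_rank != 0:
        raise ValueError("per-rank heads must be a multiple of per-rank KV heads")
    return heads_per_rank // kv_heads_per_rank


def graph_mode_supported(num_heads: int, num_kv_heads: int, tp_size: int) -> bool:
    """True when the per-rank ratio falls in the allowed set."""
    return heads_ratio_after_tp(num_heads, num_kv_heads, tp_size) in ALLOWED_RATIOS


# Illustrative values only: a 16-head model with one KV head and tp_size=2
# gives a ratio of 8, which is outside the allowed set.
print(graph_mode_supported(16, 1, 2))
# A 128-head configuration with tp_size=4 gives 32, which is allowed.
print(graph_mode_supported(128, 1, 4))
```

Under these assumptions, raising the tensor parallel size lowers the per-rank ratio, so a configuration that passes at a small `tp_size` may fail at a larger one.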
### 17. Failed to reinstall vllm-ascend from source after uninstalling vllm-ascend?
You may encounter a C compilation failure when reinstalling vllm-ascend from source with pip. If the installation fails, either install with `python setup.py install` instead, or run `python setup.py clean` to clear the cache before retrying.
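The commit description also notes that the `build` folder should be removed before reinstalling with pip. As a rough sketch of that step, the helper below deletes a `build` directory under a given source checkout. The function name and the `source_root` parameter are hypothetical, and the sketch assumes the stale artifacts live in `<source_root>/build`.

```python
import shutil
from pathlib import Path


def remove_build_dir(source_root: str) -> bool:
    """Delete <source_root>/build if it exists.

    Returns True when a directory was removed, False when there was nothing to do.
    """
    build_dir = Path(source_root) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False
```

Calling `remove_build_dir(".")` from the root of the vllm-ascend checkout before running pip again would cover the case described in the commit message.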