[Info][main] Corrected the errors in the information (#4055)
### What this PR does / why we need it?
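Corrects three wording errors in the documentation: "you runs" becomes "you run", "`python` built" becomes "`python` build", and the expansion of Tcmalloc changes from "Thread Counting Malloc" to "Thread Caching Malloc".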
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
UT (unit tests)
- vLLM version: v0.11.0
- vLLM main: 83f478bb19
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
@@ -51,7 +51,7 @@ From the workflow perspective, we can see how the final test script is executed,
 # - no headless(have api server)
 decoder_host_index: [1]

-# Add each node's vllm serve cli command just like you runs locally
+# Add each node's vllm serve cli command just like you run locally
 deployment:
 -
 server_cmd: >

@@ -70,7 +70,7 @@ Make sure your vLLM and vllm-ascend are installed after your python configuratio

 #### 1.1. Install optimized `python`

-Python supports **LTO** and **PGO** optimization starting from version `3.6` and above, which can be enabled at compile time. And we have offered optimized `python` packages directly to users for the sake of convenience. You can also reproduce the `python` built following this [tutorial](https://www.hiascend.com/document/detail/zh/Pytorch/600/ptmoddevg/trainingmigrguide/performance_tuning_0063.html) according to your specific scenarios.
+Python supports **LTO** and **PGO** optimization starting from version `3.6` and above, which can be enabled at compile time. And we have offered optimized `python` packages directly to users for the sake of convenience. You can also reproduce the `python` build following this [tutorial](https://www.hiascend.com/document/detail/zh/Pytorch/600/ptmoddevg/trainingmigrguide/performance_tuning_0063.html) according to your specific scenarios.

 ```{code-block} bash
 :substitutions:
@@ -116,7 +116,7 @@ export LD_PRELOAD=/usr/lib/"$(uname -i)"-linux-gnu/libjemalloc.so.2 $LD_PRELOAD

 #### 2.2. Tcmalloc

-**Tcmalloc (Thread Counting Malloc)** is a universal memory allocator that improves overall performance while ensuring low latency by introducing a multi-level cache structure, reducing mutex competition and optimizing large object processing flow. Find more details [here](https://www.hiascend.com/document/detail/zh/Pytorch/700/ptmoddevg/trainingmigrguide/performance_tuning_0068.html).
+**Tcmalloc (Thread Caching Malloc)** is a universal memory allocator that improves overall performance while ensuring low latency by introducing a multi-level cache structure, reducing mutex competition and optimizing large object processing flow. Find more details [here](https://www.hiascend.com/document/detail/zh/Pytorch/700/ptmoddevg/trainingmigrguide/performance_tuning_0068.html).

 ```{code-block} bash
 :substitutions:
