[Doc][Misc] Correcting the document and uploading the model deployment template (#8287)
### What this PR does / why we need it?

Correcting the document and uploading the model deployment template

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
@@ -259,7 +259,7 @@ The performance of `torch_npu.npu_fused_infer_attention_score` in small batch sc
 ```bash
 bash tools/install_flash_infer_attention_score_ops_a2.sh
-## change to run the following instruction if you're using A3 machine
+# change to run the following instruction if you're using A3 machine
 # bash tools/install_flash_infer_attention_score_ops_a3.sh
 ```