[doc] Add engine section in backend.md (#1656)
@@ -93,6 +93,15 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct
```
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --tp 4 --nccl-init sgl-dev-0:50000 --nnodes 2 --node-rank 1
```

### SRT Engine: Direct Inference Without HTTP
SGLang provides a direct inference engine **without an HTTP server**. This can be used for:
1. **Offline Batch Inference**
2. **Building Custom Servers**
We provide usage examples [here](https://github.com/sgl-project/sglang/tree/main/examples/runtime/engine).
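The offline batch inference pattern can be sketched as below. This is a minimal sketch, not the definitive API: the `sgl.Engine` constructor, `generate`, and `shutdown` calls follow the linked engine examples, and the model path and sampling parameters are illustrative. Running it requires an installed `sglang` with a GPU and the model weights available.

```python
# Minimal offline batch inference sketch -- launches the engine
# in-process, with no HTTP server involved (assumes sglang is
# installed and a GPU is available; API per the linked examples).
import sglang as sgl


def main():
    llm = sgl.Engine(model_path="meta-llama/Meta-Llama-3-8B-Instruct")

    prompts = [
        "Hello, my name is",
        "The capital of France is",
    ]
    sampling_params = {"temperature": 0.8, "top_p": 0.95}

    # One batched call; outputs come back in prompt order.
    outputs = llm.generate(prompts, sampling_params)
    for prompt, output in zip(prompts, outputs):
        print(f"Prompt: {prompt!r}\nGenerated: {output['text']!r}\n")

    # Release GPU memory when done.
    llm.shutdown()


if __name__ == "__main__":
    main()
```

The same `Engine` object can be embedded in a custom server (e.g. behind your own FastAPI routes), which is the second use case listed above.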
### Supported Models
**Generative Models**