[doc] Add engine section in backend.md (#1656)
@@ -93,6 +93,15 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --tp 4 --nccl-init sgl-dev-0:50000 --nnodes 2 --node-rank 1
```
### SRT Engine: Direct Inference Without HTTP
SGLang also provides a direct inference engine **without an HTTP server**. This is useful for:
1. **Offline Batch Inference**
2. **Building Custom Servers**
We provide usage examples [here](https://github.com/sgl-project/sglang/tree/main/examples/runtime/engine).
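The offline batch-inference pattern from those examples can be sketched as follows. This is a minimal sketch, not the full API surface: it assumes an installed `sglang` with a GPU available, and the exact argument names of `sgl.Engine` and `generate` may differ across versions, so consult the linked examples for the authoritative usage.

```python
# Sketch: offline batch inference with the SGLang engine.
# No HTTP server is started; the engine is used directly in-process.
# Assumes a GPU machine with sglang installed; argument names may
# vary by version -- see examples/runtime/engine for current usage.
import sglang as sgl

if __name__ == "__main__":
    # Construct the engine directly from a model path.
    llm = sgl.Engine(model_path="meta-llama/Meta-Llama-3-8B-Instruct")

    prompts = [
        "Hello, my name is",
        "The capital of France is",
    ]
    sampling_params = {"temperature": 0.8, "max_new_tokens": 64}

    # One batched call; one output per prompt.
    outputs = llm.generate(prompts, sampling_params)
    for prompt, output in zip(prompts, outputs):
        print(prompt, "->", output["text"])

    llm.shutdown()
```

The same engine object can be embedded in a custom server (the second use case above) by calling `generate` from your own request handlers instead of a script.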
### Supported Models
**Generative Models**