From c3f2fc5a7a152a3679e753dcd023f38ef2458676 Mon Sep 17 00:00:00 2001
From: Byron Hsu <byronhsu1230@gmail.com>
Date: Sun, 13 Oct 2024 00:33:58 -0700
Subject: [PATCH] [doc] Add engine section in backend.md (#1656)

---
 docs/en/backend.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/docs/en/backend.md b/docs/en/backend.md
index 020848ba7..d19eb062a 100644
--- a/docs/en/backend.md
+++ b/docs/en/backend.md
@@ -93,6 +93,15 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct
 python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --tp 4 --nccl-init sgl-dev-0:50000 --nnodes 2 --node-rank 1
 ```
 
+### SRT Engine: Direct Inference Without HTTP
+
+SGLang provides a direct inference engine that runs **without an HTTP server**. The engine can be used for:
+
+1. **Offline Batch Inference**
+2. **Building Custom Servers**
+
+Usage examples are available in [`examples/runtime/engine`](https://github.com/sgl-project/sglang/tree/main/examples/runtime/engine).
+
 ### Supported Models
 
 **Generative Models**