SGLang Router

SGLang Router is a standalone module, implemented in Rust, that provides data parallelism across SGLang instances.

User docs

See https://docs.sglang.ai/router/router.html

Developer docs

Prerequisites

  • Rust and Cargo installed
# Install rustup (Rust installer and version manager)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Follow the installation prompts, then reload your shell
source $HOME/.cargo/env

# Verify installation
rustc --version
cargo --version
  • Python with pip installed

Build Process

1. Build Rust Project

$ cargo build

2. Build Python Binding

Option A: Build and Install Wheel
  1. Build the wheel package:
$ pip install setuptools-rust wheel build
$ python -m build
  2. Install the generated wheel:
$ pip install <path-to-wheel>

To build and reinstall with a single command after every change:

$ python -m build && pip install --force-reinstall dist/*.whl

Option B: Development Mode

For development, you can install the package in editable mode:

$ pip install -e .

Warning: An editable install of the Python binding can run slower than an installed wheel. To test performance, build a fresh wheel after every update.

Note: When modifying Rust code, you must rebuild the wheel for changes to take effect.

Logging

The SGL Router includes structured logging with console output by default. To enable log files:

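# Enable file logging when creating a router
# (import path assumes the installed sglang_router package exports Router)
from sglang_router import Router

router = Router(
    worker_urls=["http://worker1:8000", "http://worker2:8000"],
    log_dir="./logs",  # Daily log files will be created here
)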

Use the --verbose flag with the CLI for more detailed logs.

Kubernetes Service Discovery

SGL Router supports automatic service discovery for worker nodes in Kubernetes environments. When enabled, the router will automatically:

  • Discover and add worker pods with matching labels
  • Remove unhealthy or deleted worker pods
  • Dynamically adjust the worker pool based on pod health and availability

Command Line Usage

python -m sglang_router.launch_router \
    --service-discovery \
    --selector app=sglang-worker role=inference \
    --service-discovery-port 8000 \
    --service-discovery-namespace default

Service Discovery Arguments

  • --service-discovery: Enable Kubernetes service discovery feature
  • --selector: One or more label key-value pairs for pod selection (format: key1=value1 key2=value2)
  • --service-discovery-port: Port to use when generating worker URLs (default: 80)
  • --service-discovery-namespace: Optional. Kubernetes namespace to watch for pods. If not provided, watches all namespaces (requires cluster-wide permissions)
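
To illustrate how the --selector and --service-discovery-port arguments relate to the discovered workers, here is a minimal Python sketch. It is not the router's actual Rust implementation. The helper names (parse_selector, matches, worker_url) and the sample pod data are hypothetical. The sketch assumes that a pod must carry every selector pair to match, as with standard Kubernetes equality-based label selectors, and that worker URLs use plain http.

```python
def parse_selector(pairs):
    """Turn ["app=sglang-worker", "role=inference"] into a label dict."""
    selector = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        selector[key] = value
    return selector


def matches(pod_labels, selector):
    """Assumption: a pod matches only if it has every selector label with the same value."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def worker_url(pod_ip, port=80):
    """Build a worker URL from a pod IP and the discovery port (http scheme is an assumption)."""
    return f"http://{pod_ip}:{port}"


# Hypothetical pods, shaped loosely like what a pod watch would report.
pods = [
    {"ip": "10.0.0.5", "labels": {"app": "sglang-worker", "role": "inference"}},
    {"ip": "10.0.0.6", "labels": {"app": "sglang-worker", "role": "training"}},
    {"ip": "10.0.0.7", "labels": {"app": "other"}},
]

selector = parse_selector(["app=sglang-worker", "role=inference"])
urls = [worker_url(pod["ip"], 8000) for pod in pods if matches(pod["labels"], selector)]
print(urls)
```

Only the first pod carries both labels, so it is the only one that would be added to the worker pool under these assumptions.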

RBAC Requirements

When using service discovery, you must configure proper Kubernetes RBAC permissions:

  • If using namespace-scoped discovery (with --service-discovery-namespace): Set up a ServiceAccount, Role, and RoleBinding

  • If watching all namespaces (without specifying namespace): Set up a ServiceAccount, ClusterRole, and ClusterRoleBinding with permissions to list/watch pods at the cluster level
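
As a rough sketch of the two cases above, the following Python helper generates the role and binding manifests as JSON, which kubectl apply -f accepts alongside YAML. The function name, the resource name sglang-router-pod-reader, and the ServiceAccount name sglang-router are hypothetical placeholders. The verbs are limited to the list/watch permissions mentioned above, so adjust them if your deployment needs more.

```python
import json


def pod_reader_manifests(
    namespace=None,
    name="sglang-router-pod-reader",      # hypothetical resource name
    service_account="sglang-router",      # hypothetical ServiceAccount name
    service_account_namespace="default",  # namespace where the ServiceAccount lives
):
    """Return (role, binding) manifests granting list/watch on pods.

    With a namespace, a Role and RoleBinding are produced; without one,
    a ClusterRole and ClusterRoleBinding are produced instead.
    """
    namespaced = namespace is not None
    role_kind = "Role" if namespaced else "ClusterRole"
    binding_kind = "RoleBinding" if namespaced else "ClusterRoleBinding"

    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": role_kind,
        "metadata": {"name": name},
        # Pods belong to the core API group, written as the empty string.
        "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["list", "watch"]}],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": binding_kind,
        "metadata": {"name": name},
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": service_account_namespace,
            }
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": role_kind,
            "name": name,
        },
    }
    if namespaced:
        role["metadata"]["namespace"] = namespace
        binding["metadata"]["namespace"] = namespace
    return role, binding


# Namespace-scoped discovery, matching --service-discovery-namespace default
role, binding = pod_reader_manifests(namespace="default")
print(json.dumps(role, indent=2))
print(json.dumps(binding, indent=2))
```

Calling pod_reader_manifests() without a namespace yields the cluster-wide variant for watching all namespaces.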

Troubleshooting

  1. If rust-analyzer is not working in VS Code, set rust-analyzer.linkedProjects to a list containing the absolute path of Cargo.toml in your repo. For example:
{
  "rust-analyzer.linkedProjects": ["/workspaces/sglang/sgl-router/Cargo.toml"]
}

CI/CD Setup

The continuous integration pipeline consists of three main steps:

1. Build Wheels

  • Uses cibuildwheel to create manylinux x86_64 packages
  • Compatible with major Linux distributions (Ubuntu, CentOS, etc.)
  • Additional configurations can be added to support other OS/architectures
  • Reference: cibuildwheel documentation

2. Build Source Distribution

  • Creates a source distribution containing the raw, unbuilt code
  • Enables pip to build the package from source when prebuilt wheels are unavailable

3. Publish to PyPI

  • Uploads both wheels and source distribution to PyPI

The CI configuration is based on the tiktoken workflow.