
# Code Structure

- `lang`: The frontend language.
- `srt`: The backend engine for running local models (SRT = SGLang Runtime).
- `test`: The test utilities.
- `api.py`: The public APIs.
- `bench_offline_throughput.py`: Benchmark the throughput in the offline mode.
- `bench_one_batch.py`: Benchmark the latency of running a single static batch without a server.
- `bench_one_batch_server.py`: Benchmark the latency of running a single batch with a server.
- `bench_serving.py`: Benchmark online serving with dynamic requests.
- `check_env.py`: Check the environment variables and dependencies.
- `global_config.py`: The global configs and constants.
- `launch_server.py`: The entry point for launching the local server.
- `llama3_eval.py`: Evaluate Llama 3 using the Meta Llama dataset.
- `utils.py`: Common utilities.
- `version.py`: Version info.
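
The benchmark, environment-check, and server scripts listed above live inside the `sglang` package, so they can be run as modules with `python -m sglang.<name>`. The sketch below only builds the command lines and does not execute anything. It assumes `sglang` is installed in the current Python environment; the helper name `module_command` is hypothetical and is not part of the package, and the arguments each script requires (for example, a model path) are not shown.

```python
import sys

# Script names taken from the list above, without the .py suffix.
SCRIPTS = [
    "bench_offline_throughput",
    "bench_one_batch",
    "bench_one_batch_server",
    "bench_serving",
    "check_env",
    "launch_server",
]


def module_command(script, *args):
    """Build an argv list that runs a listed script as `python -m sglang.<script>`.

    Hypothetical helper for illustration; it is not part of sglang.
    """
    if script not in SCRIPTS:
        raise ValueError(f"unknown script: {script}")
    return [sys.executable, "-m", f"sglang.{script}", *args]


if __name__ == "__main__":
    # Print the commands only; pass a result to subprocess.run to execute it.
    for name in SCRIPTS:
        print(" ".join(module_command(name)))
```

Returning an argv list rather than a single string keeps the result safe to hand to `subprocess.run` without shell quoting.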