Sync from v0.13

2026-01-19 10:38:50 +08:00
parent b2ef04d792
commit 5aef6c175a
3714 changed files with 854317 additions and 89342 deletions
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -1,8 +1,20 @@
-# Benchmarking vLLM
+# Benchmarks

-## Downloading the ShareGPT dataset
+This directory used to contain vLLM's benchmark scripts and utilities for performance testing and evaluation.

-You can download the dataset by running:
-```bash
-wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
-```
+## Contents
+
+- **Serving benchmarks**: Scripts for testing online inference performance (latency, throughput)
+- **Throughput benchmarks**: Scripts for testing offline batch inference performance
+- **Specialized benchmarks**: Tools for testing specific features like structured output, prefix caching, long document QA, request prioritization, and multi-modal inference
+- **Dataset utilities**: Framework for loading and sampling from various benchmark datasets (ShareGPT, HuggingFace datasets, synthetic data, etc.)
+
+## Usage
+
+For detailed usage instructions, examples, and dataset information, see the [Benchmark CLI documentation](https://docs.vllm.ai/en/latest/contributing/benchmarks.html#benchmark-cli).
+
+For the full CLI reference, see:
+
+- <https://docs.vllm.ai/en/latest/cli/bench/latency.html>
+- <https://docs.vllm.ai/en/latest/cli/bench/serve.html>
+- <https://docs.vllm.ai/en/latest/cli/bench/throughput.html>