adapt to sglang v0.5.2rc1 on dcu

2025-09-04 15:56:33 +08:00
commit 909abb58f5
2320 changed files with 489411 additions and 0 deletions
--- a/benchmark/line_retrieval/README.md
+++ b/benchmark/line_retrieval/README.md
@@ -0,0 +1,37 @@
+## Download data
+
+```
+wget https://raw.githubusercontent.com/merrymercy/merrymercy.github.io/master/files/random_words.json
+python3 gen_data.py --number 1000
+```
+
+## Run benchmark
+
+### Benchmark sglang
+```
+python3 -m sglang.launch_server --model-path codellama/CodeLlama-7b-hf --port 30000
+```
+
+```
+python3 bench_sglang.py --src-index 600 --num-q 50 --parallel 1
+```
+
+
+###
+
+```
+# original
+Accuracy: 0.940, latency: 332.83 s
+
+# parallel encoding (no_adjust, offset = 1000)
+Accuracy: 0.760, latency: 238.46 s
+
+# parallel encoding (no_adjust, offset = 3000)
+Accuracy: 0.760, latency: 238.46 s
+
+# parallel encoding (no_adjust, offset = 0)
+Accuracy: 0.520, latency: 238.46 s
+
+# parallel encoding (adjust_cache)
+Accuracy: 0.460, latency: 257.66 s
+```