release initial code
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
37
benchmark/line_retrieval/README.md
Normal file
@@ -0,0 +1,37 @@
## Download data

```
wget https://raw.githubusercontent.com/merrymercy/merrymercy.github.io/master/files/random_words.json
python3 gen_data.py --number 1000
```

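For readers unfamiliar with this kind of benchmark, here is a toy sketch of what a line-retrieval test case generally looks like: many labeled lines, one question about a single line, and an exact-match score. This is not the code in `gen_data.py`. The prompt wording, the `make_case` and `accuracy` helpers, and the stand-in word list are all assumptions made for illustration.

```python
import random


def make_case(words, num_lines, seed=0):
    """Build one toy line-retrieval prompt and its expected answer (hypothetical format)."""
    rng = random.Random(seed)
    # Distinct labels guarantee that the question has exactly one answer.
    labels = rng.sample(words, num_lines)
    values = [rng.randint(1, 50000) for _ in labels]
    lines = [
        f"Line {label}: the value is <{value}>."
        for label, value in zip(labels, values)
    ]
    target = rng.randrange(num_lines)
    question = f"What is the value in line {labels[target]}?"
    return "\n".join(lines + [question]), values[target]


def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the expected answers."""
    correct = sum(str(p).strip() == str(a) for p, a in zip(predictions, answers))
    return correct / len(answers)


# Stand-in word list; the real benchmark uses the downloaded random_words.json.
words = ["apple", "brook", "cedar", "delta", "ember"]
prompt, answer = make_case(words, num_lines=3)
print(prompt)
print("expected answer:", answer)
```

The longer the list of lines, the harder it is for the model to retrieve the right value, which is presumably why the accuracy figure is reported alongside latency.
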
## Run benchmark

### Benchmark sglang
```
python3 -m sglang.launch_server --model-path codellama/CodeLlama-7b-hf --port 30000
```

```
python3 bench_sglang.py --src-index 600 --num-q 50 --parallel 1
```

### Results

```
# original
Accuracy: 0.940, latency: 332.83 s

# parallel encoding (no_adjust, offset = 1000)
Accuracy: 0.760, latency: 238.46 s

# parallel encoding (no_adjust, offset = 3000)
Accuracy: 0.760, latency: 238.46 s

# parallel encoding (no_adjust, offset = 0)
Accuracy: 0.520, latency: 238.46 s

# parallel encoding (adjust_cache)
Accuracy: 0.460, latency: 257.66 s
```
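
To make the trade-off in these numbers easier to read, the following sketch computes each configuration's speedup and accuracy drop relative to the `original` run. The accuracy and latency values are copied from the results listed in this README. The `compare_to_baseline` helper and the dictionary layout are only illustrative and are not part of the benchmark scripts.

```python
# (accuracy, latency in seconds), copied from the results in this README.
results = {
    "original": (0.940, 332.83),
    "parallel encoding (no_adjust, offset = 1000)": (0.760, 238.46),
    "parallel encoding (no_adjust, offset = 3000)": (0.760, 238.46),
    "parallel encoding (no_adjust, offset = 0)": (0.520, 238.46),
    "parallel encoding (adjust_cache)": (0.460, 257.66),
}


def compare_to_baseline(results, baseline="original"):
    """Return speedup and accuracy drop of every run relative to the baseline."""
    base_acc, base_lat = results[baseline]
    return {
        name: {
            "speedup": base_lat / lat,        # > 1 means faster than baseline
            "accuracy_drop": base_acc - acc,  # > 0 means less accurate
        }
        for name, (acc, lat) in results.items()
    }


summary = compare_to_baseline(results)
for name, row in summary.items():
    print(f"{name}: {row['speedup']:.2f}x speedup, "
          f"{row['accuracy_drop']:.3f} accuracy drop")
```

Read this way, the `no_adjust` runs are about 1.40x faster than the original and `adjust_cache` is about 1.29x faster, but every parallel-encoding variant loses at least 0.18 accuracy.
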