Add an example of serving DeepSeek-V3 with int8 quantization in SGLang. (#4177)

Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
This commit is contained in:
lukec
2025-03-07 17:56:09 +08:00
committed by GitHub
parent 7e3bb52705
commit ffa1b3e318
2 changed files with 22 additions and 0 deletions


@@ -17,6 +17,7 @@ SGLang is recognized as one of the top engines for [DeepSeek model inference](ht
| | 4 x 8 x A100/A800 |
| **Quantized weights (AWQ)** | 8 x H100/800/20 |
| | 8 x A100/A800 |
| **Quantized weights (int8)** | 16 x A100/A800 |
<style>
.md-typeset__table {
@@ -54,6 +55,7 @@ Detailed commands for reference:
- [2 x 8 x H200](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3#example-serving-with-two-h208-nodes)
- [4 x 8 x A100](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3#example-serving-with-four-a1008-nodes)
- [8 x A100 (AWQ)](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3#example-serving-with-8-a100a800-with-awq-quantization)
- [16 x A100 (int8)](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3#example-serving-with-16-a100a800-with-int8-quantization)
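
For the int8 configuration above, the launch looks roughly like the following sketch. It assumes two 8 x A100 nodes reachable at a shared init address; the model path is a placeholder (substitute your own int8 checkpoint), and the exact flags should be checked against the linked benchmark page.

```shell
# Node 0 (rank 0). Replace 10.0.0.1:5000 and the model path with your
# own values; /path/to/deepseek-v3-int8 is a placeholder, not a release.
python3 -m sglang.launch_server \
  --model-path /path/to/deepseek-v3-int8 \
  --quantization w8a8_int8 \
  --tp 16 \
  --dist-init-addr 10.0.0.1:5000 \
  --nnodes 2 \
  --node-rank 0 \
  --trust-remote-code

# Node 1 (rank 1): identical command except for --node-rank.
python3 -m sglang.launch_server \
  --model-path /path/to/deepseek-v3-int8 \
  --quantization w8a8_int8 \
  --tp 16 \
  --dist-init-addr 10.0.0.1:5000 \
  --nnodes 2 \
  --node-rank 1 \
  --trust-remote-code
```

Tensor parallelism (`--tp 16`) spans both nodes, so both commands must use the same `--dist-init-addr` and `--nnodes` values.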
### Download Weights