xc-llm-ascend/benchmarks/tests/serving-tests.json

[
  {
    "test_name": "serving_llama8B_tp1",
    "qps_list": [
      1,
      4,
      16,
      "inf"
    ],
    "server_parameters": {
      "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
      "tensor_parallel_size": 1,
      "swap_space": 16,
      "disable_log_stats": "",
      "disable_log_requests": "",
      "load_format": "dummy"
    },
    "client_parameters": {
      "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
      "backend": "vllm",
      "dataset_name": "sharegpt",
      "dataset_path": "./ShareGPT_V3_unfiltered_cleaned_split.json",
      "num_prompts": 200
    }
  },
  {
    "test_name": "serving_qwen2_5_7B_tp1",
    "qps_list": [
      1,
      4,
      16,
      "inf"
    ],
    "server_parameters": {
      "model": "Qwen/Qwen2.5-7B-Instruct",
      "tensor_parallel_size": 1,
      "swap_space": 16,
      "disable_log_stats": "",
      "disable_log_requests": "",
      "load_format": "dummy"
    },
    "client_parameters": {
      "model": "Qwen/Qwen2.5-7B-Instruct",
      "backend": "vllm",
      "dataset_name": "sharegpt",
      "dataset_path": "./ShareGPT_V3_unfiltered_cleaned_split.json",
      "num_prompts": 200
    }
  }
]
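The following is a minimal sketch of how a benchmark runner might consume an entry like the ones above. It is not the repository's actual script. It assumes that each key under `server_parameters` and `client_parameters` maps to a command-line flag by replacing underscores with dashes, that an empty-string value (as in `disable_log_stats`) marks a switch passed without a value, and that each `qps_list` item is passed to the client as `--request-rate`. The helper names `to_cli_args` and `expand_test` and the `_qps_` run-name suffix are hypothetical.

```python
import json


def to_cli_args(params: dict) -> list:
    """Flatten a parameter mapping into command-line tokens.

    Assumption: "tensor_parallel_size" becomes "--tensor-parallel-size",
    and an empty-string value means a bare switch with no argument.
    """
    args = []
    for key, value in params.items():
        args.append("--" + key.replace("_", "-"))
        if value != "":
            args.append(str(value))
    return args


def expand_test(test: dict) -> list:
    """Produce one run description per entry in the test's qps_list."""
    server_args = to_cli_args(test["server_parameters"])
    runs = []
    for qps in test["qps_list"]:
        # Assumption: the request rate is passed as "--request-rate";
        # the string "inf" is forwarded unchanged.
        client_args = to_cli_args(test["client_parameters"])
        client_args += ["--request-rate", str(qps)]
        runs.append(
            {
                # Hypothetical naming scheme for the individual runs.
                "name": f"{test['test_name']}_qps_{qps}",
                "server_args": server_args,
                "client_args": client_args,
            }
        )
    return runs


# A trimmed copy of the first test entry, inlined so the sketch is self-contained.
EXAMPLE = json.loads(
    """
    {
      "test_name": "serving_llama8B_tp1",
      "qps_list": [1, 4, 16, "inf"],
      "server_parameters": {
        "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
        "tensor_parallel_size": 1,
        "disable_log_stats": "",
        "load_format": "dummy"
      },
      "client_parameters": {
        "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
        "backend": "vllm",
        "num_prompts": 200
      }
    }
    """
)

if __name__ == "__main__":
    for run in expand_test(EXAMPLE):
        print(run["name"], " ".join(run["client_args"]))
```

Under these assumptions each test expands into one server launch and four client runs, one per request rate, so the two tests in this file would yield eight measurements in total.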
Commit history for this file (from the blame view):

2025-03-21 15:54:34 +08:00  [Doc]Add benchmark scripts (#74)
    Adds benchmark scripts for NPU so that developers can run performance tests on their own machines with one line of code. Introduces the serving_llama8B_tp1 test.
    Signed-off-by: wangli <wangli858794774@gmail.com>

2025-04-24 14:48:24 +08:00  [Benchmark] Download model from modelscope (#634)
    Running the benchmark scripts now downloads the model from ModelScope. Changes the two "model" entries of the serving_llama8B_tp1 test.
    Signed-off-by: wangli <wangli858794774@gmail.com>

2025-05-10 09:47:42 +08:00  [Benchmarks] Add qwen2.5-7b test (#763)
    Adds the serving_qwen2_5_7B_tp1 test and makes the documentation more developer-friendly.
    Signed-off-by: xuedinge233 <damow890@gmail.com>
    Co-authored-by: xuedinge233 <damow890@gmail.com>