xc-llm-ascend/benchmarks/tests/latency-tests.json

[
  {
    "test_name": "latency_llama8B_tp1",
    "parameters": {
      "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
      "tensor_parallel_size": 1,
      "load_format": "dummy",
      "num_iters_warmup": 5,
      "num_iters": 15
    }
  }
]
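
A minimal sketch of how an entry like the one above could be turned into command-line arguments for a latency benchmark. The source does not show how the benchmark scripts consume this file, so the mapping used here is an assumption (each key under `parameters` becomes a `--flag` with underscores replaced by hyphens), and `params_to_args` is a hypothetical helper for illustration.

```python
import json

# The test definition from latency-tests.json, inlined so the sketch is self-contained.
CONFIG_TEXT = """
[
  {
    "test_name": "latency_llama8B_tp1",
    "parameters": {
      "model": "LLM-Research/Meta-Llama-3.1-8B-Instruct",
      "tensor_parallel_size": 1,
      "load_format": "dummy",
      "num_iters_warmup": 5,
      "num_iters": 15
    }
  }
]
"""


def params_to_args(parameters: dict) -> list[str]:
    """Hypothetical helper: map {"some_key": value} to ["--some-key", "value"].

    Assumption: the benchmark entry point accepts hyphenated flags named
    after the JSON keys. Boolean values are treated as bare switches.
    """
    args: list[str] = []
    for key, value in parameters.items():
        flag = "--" + key.replace("_", "-")
        if isinstance(value, bool):
            # Emit the switch only when it is enabled.
            if value:
                args.append(flag)
            continue
        args.extend([flag, str(value)])
    return args


tests = json.loads(CONFIG_TEXT)
for test in tests:
    # Print one argument line per test case defined in the file.
    print(test["test_name"], " ".join(params_to_args(test["parameters"])))
```

Under this assumption, `"load_format": "dummy"` would skip loading real weights, and `num_iters_warmup` / `num_iters` would set 5 warm-up iterations followed by 15 measured iterations.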

History (from the blame view):

2025-03-21 15:54:34 +08:00, [Doc]Add benchmark scripts (#74), Signed-off-by: wangli <wangli858794774@gmail.com>. Adds benchmark scripts for NPU so that developers can run performance tests on their own machines with one line of code. This is the last commit to touch every line of the file except the "model" line.

2025-04-24 14:48:24 +08:00, [Benchmark] Download model from modelscope (#634), Signed-off-by: wangli <wangli858794774@gmail.com>. Running the benchmark scripts now downloads the model from ModelScope. This is the last commit to touch the "model" line.