forked from EngineX-Cambricon/enginex-mlu370-vllm
add ops
51
torch_mlu_ops-v1.3.2/benchmarks/README.md
Normal file
@@ -0,0 +1,51 @@
## How to use the benchmark test scripts

The Torch-MLU-Ops benchmark test scripts give users a convenient entry point for measuring operator performance.
Run the following command to see what each parameter means.

```bash
# Show help for a benchmark script
python3 benchmark_xxx.py --help
```
The parameters are as follows:

`options`:
- `-h, --help`: show this help message and exit
- `--repeat_times REPEAT_TIMES`: repeat times for testing
- `--csv`: write the report data to csv
- `-o O`: specify the output folder name under --csv mode

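The option list above has the shape of what Python's `argparse` prints for a parser along the following lines. This is only a minimal sketch: the actual benchmark scripts may define their options differently, and the `int` type and the default values used here are assumptions, not taken from the scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(description="operator benchmark (sketch)")
    # Assumption: the repeat count is an integer; the default of 1 is made up.
    parser.add_argument("--repeat_times", type=int, default=1,
                        help="repeat times for testing")
    parser.add_argument("--csv", action="store_true",
                        help="write the report data to csv")
    # Assumption: the default output folder is made up for this sketch.
    parser.add_argument("-o", type=str, default="./",
                        help="specify the output folder name under --csv mode")
    return parser


# Parse the same arguments as the example command in this README.
args = build_parser().parse_args(["--repeat_times", "10", "--csv", "-o", "./active/"])
print(args.repeat_times, args.csv, args.o)
```

With a parser like this, `--csv` is a plain on/off switch and the value passed to `-o` is stored as `args.o`.
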
```bash
# Example test command
python3 benchmark_active.py --repeat_times 10 --csv -o './active/'
```
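After a run with `--csv`, the report can be inspected from Python. The sketch below assumes the script writes one or more `.csv` files into the folder passed to `-o`. The file names and column layout are not documented in this README, so the helper (`list_report_headers`, a hypothetical name) only lists each file together with its header row.

```python
import csv
from pathlib import Path


def list_report_headers(folder: str) -> dict[str, list[str]]:
    """Map each .csv file name in `folder` to its first (header) row."""
    headers = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as f:
            # next(..., []) yields an empty list when the file has no rows.
            headers[path.name] = next(csv.reader(f), [])
    return headers


if __name__ == "__main__":
    # Folder name taken from the example command above.
    for name, header in list_report_headers("./active/").items():
        print(name, header)
```
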
The following operators are supported:

| op_name |
| ---------------------------------|
| active |
| apply_rotary |
| attention_project |
| ffn |
| flash_attn |
| fused_layer_norm |
| fused_moe |
| fused_norm_attention_project |
| fused_norm_residual_ffn |
| fused_rms_norm |
| group_gemm |
| matmul |
| offline_quant_to_linear_cache |
| per_token_smooth_quantize |
| preload |
| quantize |
| reshape_linear_cache |
| quant_to_linear_cache |
| reshape_paged_cache |
| single_query_cached_kv_attn |
| smooth_quant_matmul |
| weight_only_quant_matmul |
| moe_gen_idx |
| moe_expand_input |
| moe_softmax_topk |
| moe_combine_result |
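
To benchmark several operators in one go, the example command can be generated per operator. This sketch assumes that every operator in the table has a script named `benchmark_<op_name>.py`, which is inferred from `benchmark_active.py` and not confirmed here. `build_command` and `run_all` are hypothetical helpers, `sys.executable` stands in for `python3`, and operators whose script is not found are skipped.

```python
import subprocess
import sys
from pathlib import Path


def build_command(op_name: str, repeat_times: int = 10) -> list[str]:
    """Build the example command for one operator.

    Assumption: the script is named benchmark_<op_name>.py.
    """
    return [
        sys.executable, f"benchmark_{op_name}.py",
        "--repeat_times", str(repeat_times),
        "--csv", "-o", f"./{op_name}/",
    ]


def run_all(op_names: list[str], bench_dir: str = ".") -> list[str]:
    """Run each benchmark whose script exists; return the ops that were run."""
    ran = []
    for op in op_names:
        cmd = build_command(op)
        if not (Path(bench_dir) / cmd[1]).is_file():
            print(f"skipping {op}: {cmd[1]} not found")
            continue
        # check=True raises CalledProcessError if a benchmark exits non-zero.
        subprocess.run(cmd, cwd=bench_dir, check=True)
        ran.append(op)
    return ran


if __name__ == "__main__":
    run_all(["active", "matmul", "fused_rms_norm"])
```

Each operator gets its own output folder, mirroring the `-o './active/'` argument of the example command.
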