### What this PR does / why we need it?
This pr supports w8a8 on 300I Duo platform. The main change is to use
`npu_quant_grouped_matmul_dequant` to replace `npu_grouped_matmul`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
offline inference on 310p runs normally.
---------
Signed-off-by: angazenn <zengyanjia@huawei.com>
Signed-off-by: tianyitang <tangtianyi4@huawei.com>
Co-authored-by: angazenn <zengyanjia@huawei.com>
Co-authored-by: tianyitang <tangtianyi4@huawei.com>
Add static build_info py file to show soc and sleep mode info. It helps
to make the code clean and the error info will be more friendly for
users
This PR also added the unit test for vllm_ascend/utils.py
This PR also added the base test class for all ut in tests/ut/base.py
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>