Commit Graph

5 Commits

Author SHA1 Message Date
22dimensions
c464c32b81 add doc for offline quantization inference (#1009)
add example for offline inference with quantized model

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2025-05-29 17:32:42 +08:00
22dimensions
d5401a08be [DOC] update modelslim version (#908)
1. update modelslim version to fix deepseek related issues
2. add note for "--quantization ascend"

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2025-05-21 09:12:02 +08:00
22dimensions
a8730e7a3c [Doc] update quantization docs with QwQ-32B-W8A8 example (#835)
1. replace deepseek-v2-lite model with more practical model QwQ 32B
2. fix some incorrect commands
3. replace modelslim version with a more formal tag

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2025-05-17 15:25:17 +08:00
wangxiyuan
6193ba679b [CI] add codespell CI and fix format.sh (#827)
1. Fix format check error to make format.sh work
2. Add codespell check CI
3. Add the missing required package for vllm-ascend.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-05-12 22:04:48 +08:00
Li Wang
d0a0c81ced [Doc] Add deepseek-v2-lite w8a8 quantization tutorial (#630)
### What this PR does / why we need it?
Add deepseek-v2-lite w8a8 quantization tutorial

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-04-28 17:14:26 +08:00