[doc] update quantization guide doc (#88)

Li Wei
2026-01-07 15:39:51 +08:00
committed by GitHub
parent eb40e8a07a
commit c403d921ff
2 changed files with 52 additions and 21 deletions


@@ -2,14 +2,14 @@
 ## Generative Models
-| Model | Support | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
-| :------------ | :------------ | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
-| Qwen3 | ✅ | | ✅ | ✅ | | ✅ | ✅ |
-| Qwen3-Moe | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| Qwen3-Next | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Model | Support | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
+| :------------ | :------ | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
+| Qwen3 | ✅ | | ✅ | ✅ | | ✅ | ✅ |
+| Qwen3-Moe | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Qwen3-Next | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
 | Deepseek v3.2 | ✅ | ✅ | | ✅ | | ✅ | ✅ |
 ## Multimodal Language Models
-| Model | Support | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
-| :----------- | :------------ | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
-| Qwen3-VL | ✅ | | | ✅ | | ✅ | ✅ |
+| Model | Support | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
+| :------- | :------ | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
+| Qwen3-VL | ✅ | | | ✅ | | ✅ | ✅ |