[Feature] support compressed-tensors w4a16 quantization (#154)

- Native INT4 Kimi model inference is supported

Signed-off-by: Li Wei <liwei.109@outlook.com>
Author: Li Wei <liwei.109@outlook.com>
Date: 2026-01-27 19:56:22 +08:00 (committed via GitHub)
Parent: 0711c1abfa
Commit: 71bd70ad6c
9 changed files with 369 additions and 28 deletions
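
For context, below is a minimal sketch of running a compressed-tensors w4a16 checkpoint through vLLM's offline API. The model path is hypothetical, and vLLM normally auto-detects the quantization scheme from the checkpoint's quantization_config, so the explicit quantization argument is usually optional.

# Minimal sketch: offline inference with a compressed-tensors w4a16 checkpoint.
# The model path is hypothetical; the quantization method is typically
# detected automatically from the checkpoint's quantization_config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/models/Kimi-K2-w4a16",      # hypothetical local checkpoint path
    quantization="compressed-tensors",  # explicit here; usually auto-detected
)
outputs = llm.generate(
    ["Hello, world"],
    SamplingParams(temperature=0.0, max_tokens=32),
)
print(outputs[0].outputs[0].text)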

@@ -149,6 +149,14 @@ By utilizing the vLLM Kunlun plugin, popular open-source models, including Trans
     <td class="status-support"></td>
     <td></td>
   </tr>
+  <tr>
+    <td class="model-name">Kimi-K2</td>
+    <td class="status-support"></td>
+    <td class="status-support"></td>
+    <td></td>
+    <td class="status-support"></td>
+    <td></td>
+  </tr>
 </tbody>
 </table>