[Feature] support compressed-tensors w4a16 quantization (#154)

- Native INT4 Kimi model inference is supported

Signed-off-by: Li Wei <liwei.109@outlook.com>
Author: Li Wei <liwei.109@outlook.com>
Date: 2026-01-27 19:56:22 +08:00 (committed via GitHub)
Parent: 0711c1abfa
Commit: 71bd70ad6c
9 changed files with 369 additions and 28 deletions
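
For context, below is a minimal sketch of running a compressed-tensors w4a16 checkpoint through vLLM's offline API. The model path is hypothetical, and vLLM normally auto-detects the quantization scheme from the checkpoint's quantization_config, so the explicit quantization argument is usually optional.

# Minimal sketch: offline inference with a compressed-tensors w4a16 checkpoint.
# The model path is hypothetical; the quantization method is typically
# detected automatically from the checkpoint's quantization_config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/models/Kimi-K2-w4a16",      # hypothetical local checkpoint path
    quantization="compressed-tensors",  # explicit here; usually auto-detected
)
outputs = llm.generate(
    ["Hello, world"],
    SamplingParams(temperature=0.0, max_tokens=32),
)
print(outputs[0].outputs[0].text)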

@@ -149,6 +149,14 @@ By utilizing the vLLM Kunlun plugin, popular open-source models, including Trans
     <td class="status-support"></td>
     <td></td>
   </tr>
+  <tr>
+    <td class="model-name">Kimi-K2</td>
+    <td class="status-support"></td>
+    <td class="status-support"></td>
+    <td></td>
+    <td class="status-support"></td>
+    <td></td>
+  </tr>
 </tbody>
 </table>