[Feature] support compressed-tensors w4a16 quantization (#154)
- Native int4 Kimi model inference is supported.

Signed-off-by: Li Wei <liwei.109@outlook.com>
@@ -149,6 +149,14 @@ By utilizing the vLLM Kunlun plugin, popular open-source models, including Trans
 <td class="status-support">✅</td>
 <td></td>
 </tr>
+<tr>
+<td class="model-name">Kimi-K2</td>
+<td class="status-support">✅</td>
+<td class="status-support">✅</td>
+<td></td>
+<td class="status-support">✅</td>
+<td></td>
+</tr>
 </tbody>
 </table>
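
For context, w4a16 means 4-bit integer weights with 16-bit activations, stored in the compressed-tensors checkpoint format. Below is a minimal sketch of running inference on such a checkpoint through vLLM's offline Python API, assuming the Kunlun plugin follows upstream vLLM conventions; the model path is hypothetical, and the quantization scheme is normally auto-detected from the checkpoint's config rather than passed explicitly:

    from vllm import LLM, SamplingParams

    # Hypothetical path to a Kimi-K2 checkpoint quantized to w4a16
    # (int4 weights, 16-bit activations) in compressed-tensors format.
    llm = LLM(
        model="path/to/Kimi-K2-w4a16",
        quantization="compressed-tensors",  # usually inferred from the checkpoint config
        trust_remote_code=True,             # Kimi models ship custom modeling code
    )

    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate(["Explain w4a16 quantization in one sentence."], params)
    print(outputs[0].outputs[0].text)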