From d9ad42a174d3160857a4585fc4ba42a542688243 Mon Sep 17 00:00:00 2001
From: Xinyu Dong
Date: Sun, 15 Feb 2026 13:12:17 +0800
Subject: [PATCH] [Docs] Fix quantization support description in README (#208)

Updated quantization support description from FP8 to INT8.

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 859a75a..8e7c0df 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ This plugin provides a hardware-pluggable interface that decouples the integrati
 
 - **Seamless Plugin Integration** — Works as a standard vLLM platform plugin via Python entry points, no need to modify vLLM source code
 - **Broad Model Support** — Supports 15+ mainstream LLMs including Qwen, Llama, DeepSeek, Kimi-K2, and multimodal models
-- **Quantization Support** — FP8 and other quantization methods for MoE and dense models
+- **Quantization Support** — INT8 and other quantization methods for MoE and dense models
 - **LoRA Fine-Tuning** — LoRA adapter support for Qwen series models
 - **Piecewise Kunlun Graph** — Hardware-accelerated graph optimization for high-performance inference
 - **FlashMLA Attention** — Optimized multi-head latent attention for DeepSeek MLA architectures