forked from EngineX-Cambricon/enginex-mlu370-vllm
update readme
@@ -4,7 +4,7 @@
## Version History
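**v0.0.6.2** · 2026-02-11 · Llama4 model support, including sigmoid routing MoE, QK Norm, and alternating dense/MoE layers; because of MLU370 (capability=3) limitations, MoE now runs in dense mode to resolve graph capture compatibility (⚠️ compute cost increases; DeepSeek V2/V3 are unaffected)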
**v0.0.6.2** · 2026-02-11 · Llama4 model support, including sigmoid routing MoE, QK Norm, and alternating dense/MoE layers; because of MLU370 (capability=3) limitations, MoE now runs in dense mode to resolve graph capture compatibility
**v0.0.6.1** · 2026-02-11 · DeepSeek V3 MTP speculative decoding: adds a new MTP draft model that reuses DeepseekV2DecoderLayer, and automatically detects and enables MTP speculative decoding
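
Note on the "dense mode" trade-off in v0.0.6.2: the sketch below is only an illustration, not this repository's implementation. It assumes that "dense mode" means running every expert on every token and masking unselected experts with a zero routing weight, so the amount of work no longer depends on the routing result (which is what tends to make graph capture easier). The names `route`, `expert`, and `moe` are hypothetical, and the toy expert stands in for a real MLP.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def route(logits, top_k):
    """Pick the top_k experts by router logit; weight each by sigmoid(logit)."""
    order = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
    chosen = set(order[:top_k])
    weights = [sigmoid(l) if e in chosen else 0.0 for e, l in enumerate(logits)]
    return chosen, weights

def expert(e, x):
    """Toy stand-in for expert e's MLP (assumption): scales the token by e + 1."""
    return [(e + 1) * v for v in x]

def moe(tokens, router_logits, top_k, dense):
    """Return (outputs, expert_calls).

    dense=False: run only the selected experts; work depends on the routing.
    dense=True:  run every expert on every token; zero weights mask the
                 unselected ones, so the work per token is fixed.
    """
    outputs, calls = [], 0
    for x, logits in zip(tokens, router_logits):
        chosen, weights = route(logits, top_k)
        y = [0.0] * len(x)
        for e, w in enumerate(weights):
            if not dense and e not in chosen:
                continue  # sparse path skips unselected experts
            calls += 1
            y = [a + w * b for a, b in zip(y, expert(e, x))]
        outputs.append(y)
    return outputs, calls

# Two tokens, four experts, top-1 routing.
tokens = [[1.0, 2.0], [0.5, -1.0]]
router_logits = [[0.1, 2.0, -1.0, 0.3], [1.5, -0.2, 0.0, 0.7]]

sparse_out, sparse_calls = moe(tokens, router_logits, top_k=1, dense=False)
dense_out, dense_calls = moe(tokens, router_logits, top_k=1, dense=True)
print(sparse_calls, dense_calls)
```

Both paths produce the same outputs, but in this toy setup the dense path evaluates all four experts per token instead of one. This is the kind of extra compute the ⚠️ note refers to.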