enginex-vllm-bi100-qwen36

Author	SHA1	Message	Date
Lu Xinlong	b5806731e0	some op overhead optimization	2026-06-19 11:19:39 +08:00
Lu Xinlong	50e3a05fb0	fix incorrect MoE step to ensure decoding speed	2026-06-12 11:44:50 +08:00
Lu Xinlong	629f878c28	initial commit for qwen3.6-moe adaptation	2026-06-12 10:10:49 +08:00
Lu Xinlong	2d1ef50992	chunked prefill support and memory opts	2026-06-05 16:03:34 +08:00
Lu Xinlong	8c047a70ea	some modifications to ensure 50K context input	2026-06-04 17:56:29 +08:00
Lu Xinlong	3ef8227384	initial version of adding chunked attention, ensuring 20K context	2026-05-29 16:49:33 +08:00
Lu Xinlong	0e89906481	Qwen3.6-27B iluvatar bi-v100 adaptation	2026-05-21 16:37:24 +08:00