enginex-vllm-bi100-qwen36

Author	SHA1	Message	Date
Lu Xinlong	365da18436	Add reasoning parser mechanism + qwen3 parser + bugfixes	2026-06-10 18:22:29 +08:00
Lu Xinlong	4ab36b51d5	Add qwen3_coder tool calling parser	2026-06-10 14:38:54 +08:00
Lu Xinlong	d972854fb7	Fix completion token statistics bug when input context is large	2026-06-08 15:04:34 +08:00
Lu Xinlong	c2de1c83b0	Utilize chunked prefill + K-tiling techniques to ensure 100K context	2026-06-05 17:00:41 +08:00
Lu Xinlong	2d1ef50992	Chunked prefill support and memory optimizations	2026-06-05 16:03:34 +08:00
Lu Xinlong	8c047a70ea	some modifications to ensure 50K context input	2026-06-04 17:56:29 +08:00
Lu Xinlong	1c33ef1355	add paged_attn	2026-05-29 16:53:39 +08:00
Lu Xinlong	3ef8227384	initial version of adding chunked attention, ensuring 20K context	2026-05-29 16:49:33 +08:00
Lu Xinlong	0e89906481	Qwen3.6-27B Iluvatar BI-V100 adaptation	2026-05-21 16:37:24 +08:00