enginex-vllm-bi100-qwen36

Author	SHA1	Message	Date
Lu Xinlong	d972854fb7	fix completion token statistic bug when input context is large	2026-06-08 15:04:34 +08:00
Lu Xinlong	c2de1c83b0	Utilize chunked prefill + K-tiling techniques to ensure 100K context	2026-06-05 17:00:41 +08:00
Lu Xinlong	2d1ef50992	chunked prefill support and memory opts	2026-06-05 16:03:34 +08:00
Lu Xinlong	8c047a70ea	some modifications to ensure 50K context input	2026-06-04 17:56:29 +08:00
Lu Xinlong	1c33ef1355	add paged_attn	2026-05-29 16:53:39 +08:00
Lu Xinlong	3ef8227384	initial version of adding chunked attention, ensuring 20K context	2026-05-29 16:49:33 +08:00
Lu Xinlong	0e89906481	Qwen3.6-27B iluvatar bi-v100 adaptation	2026-05-21 16:37:24 +08:00
Li Jiashu	fad74b701b	Update to new version of base image	2025-10-24 15:45:06 +08:00
Zhang Hao	ee04aead1e	add dataset and more models	2025-10-17 16:52:12 +08:00
Li Jiashu	8f07ba339a	Update README	2025-08-29 15:40:07 +08:00
zhousha	37e89f390e	update Dockerfile images	2025-08-25 14:19:36 +08:00
Li Jiashu	99fb9f5cb0	First commit	2025-08-05 19:02:46 +08:00
lumian	9efe891f99	Add README.md	2025-08-04 16:57:34 +08:00