sglang

Author	SHA1	Message	Date
blzheng	d1d4074c4e	[CPU] Add gelu_and_mul kernel in sgl-kernel and add ut (#9300)	2025-09-08 23:23:13 -07:00
Cao E	7577f0e40f	Add graph runner support with torch compile on CPU (#7843)	2025-09-07 21:33:58 -07:00
Ma Mingfei	5ad296bda1	Optimize prefill performance on cpu backend (#8750)	2025-08-28 17:21:55 -07:00
Chunyuan WU	08f8f49016	[CPU][sgl-kernel] biased_grouped_topk: fix correction_bias dtype to float32 (#8212) Co-authored-by: jianan-gu <jianan.gu@intel.com> Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>	2025-08-04 18:28:31 -07:00
YanbingJiang	1fe691a429	Fix FP8 block quantization when N or K is not multiples of 128 (#8648)	2025-08-01 15:57:19 -07:00
Chunyuan WU	ac80f4da57	[CPU] [FP8] set SGLANG_CPU_FP8_CVT_FTZ in CMakeLists.txt (#7885)	2025-07-09 01:53:53 -07:00
Chunyuan WU	128f16a817	[CPU]convert topk_weights to fp32 for INT8 and FP8 paths (for llama4) and fix LmHead weight pack (#7818)	2025-07-08 19:27:24 -07:00
Chunyuan WU	36cc3ffdc7	[CPU] [sgl-kernel] set dispatch key of initialize to CatchAll (#7734)	2025-07-02 22:39:24 -07:00
YanbingJiang	b044400dd3	Support non-contiguous query input for extend/decode attention (#7462)	2025-07-02 19:59:45 -07:00
Chunyuan WU	6005eceee3	[CPU] remove process_group from inputs of shm_allreduce and shm_allgather (#7486)	2025-06-30 21:54:11 -07:00
Chunyuan WU	c5131f7a2f	[CPU] add c++ kernel to bind CPU cores and memory node (#7524)	2025-06-29 19:45:25 -07:00
Chunyuan WU	7eb47b0f3d	[CPU] [BF16] Call fused_experts_cpu, weight_packed_linear and bmm_cpu kernel in DeepSeek model (#6641) Co-authored-by: Thien Tran <gau.nernst@yahoo.com.sg>	2025-06-25 01:43:33 -07:00
YanbingJiang	fcde67b016	CPU: map changes from developing branch in sgl-kernel (#6833) Co-authored-by: mingfeima <mingfei.ma@intel.com>	2025-06-10 01:08:15 -07:00
jianan-gu	ff00895c46	Add CPU optimized kernels for topk and rope fusions (#6456)	2025-06-02 17:37:34 -07:00
Chunyuan WU	3ded6235c9	Add fp8 fused_experts kernel for CPU in sgl-kernel and add UT (#6404)	2025-05-23 02:01:55 -07:00
blzheng	4ba1eea83f	Add fp8 qkv_proj_with_rope kernel for CPU in sgl-kernel and add UT (#6493)	2025-05-23 00:14:46 -07:00
blzheng	cfe48c5902	[CPU] Fix build issue (#6419)	2025-05-21 11:17:10 -07:00
YanbingJiang	32cc66efa5	Update extend/decode attention kernel for CPU in sgl-kernel and add UTs (#6405) Co-authored-by: mingfeima <mingfei.ma@intel.com>	2025-05-19 21:23:17 -07:00
Chunyuan WU	5dd62c3a6f	Add fp8 shared_expert kernel for CPU in sgl-kernel and add UT (#6339) Co-authored-by: Jiang, Yanbing <yanbing.jiang@intel.com> Co-authored-by: mingfeima <mingfei.ma@intel.com>	2025-05-18 12:42:15 -07:00
Chunyuan WU	fb4959b2c5	Add fp8 gemm kernel for CPU in sgl-kernel and add gemm UT (#6216) Co-authored-by: YanbingJiang <yanbing.jiang@intel.com> Co-authored-by: mingfeima <mingfei.ma@intel.com>	2025-05-15 09:10:40 -07:00
blzheng	0f75b907c6	[CPU] Add CMakeLists.txt for sgl-kernel (#6115)	2025-05-13 15:30:37 -07:00
applesaucethebun	2ce8793519	Add typo checker in pre-commit (#6179) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-11 12:55:00 +08:00
Ma Mingfei	a73c4df438	Add optimized native kernels in sgl-kernel (#5150) Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com> Co-authored-by: YanbingJiang <yanbing.jiang@intel.com> Co-authored-by: blzheng <beilei.zheng@intel.com>	2025-04-08 09:37:46 -07:00

23 Commits