Commit Graph

13 Commits

Author SHA1 Message Date
Chunyuan WU
36cc3ffdc7 [CPU] [sgl-kernel] set dispatch key of initialize to CatchAll (#7734) 2025-07-02 22:39:24 -07:00
Chunyuan WU
6005eceee3 [CPU] remove process_group from inputs of shm_allreduce and shm_allgather (#7486) 2025-06-30 21:54:11 -07:00
Chunyuan WU
c5131f7a2f [CPU] add c++ kernel to bind CPU cores and memory node (#7524) 2025-06-29 19:45:25 -07:00
YanbingJiang
fcde67b016 CPU: map changes from developing branch in sgl-kernel (#6833) 2025-06-10 01:08:15 -07:00
Co-authored-by: mingfeima <mingfei.ma@intel.com>
jianan-gu
ff00895c46 Add CPU optimized kernels for topk and rope fusions (#6456) 2025-06-02 17:37:34 -07:00
Chunyuan WU
3ded6235c9 Add fp8 fused_experts kernel for CPU in sgl-kernel and add UT (#6404) 2025-05-23 02:01:55 -07:00
blzheng
4ba1eea83f Add fp8 qkv_proj_with_rope kernel for CPU in sgl-kernel and add UT (#6493) 2025-05-23 00:14:46 -07:00
blzheng
cfe48c5902 [CPU] Fix build issue (#6419) 2025-05-21 11:17:10 -07:00
YanbingJiang
32cc66efa5 Update extend/decode attention kernel for CPU in sgl-kernel and add UTs (#6405) 2025-05-19 21:23:17 -07:00
Co-authored-by: mingfeima <mingfei.ma@intel.com>
Chunyuan WU
5dd62c3a6f Add fp8 shared_expert kernel for CPU in sgl-kernel and add UT (#6339) 2025-05-18 12:42:15 -07:00
Co-authored-by: Jiang, Yanbing <yanbing.jiang@intel.com>
Co-authored-by: mingfeima <mingfei.ma@intel.com>
Chunyuan WU
fb4959b2c5 Add fp8 gemm kernel for CPU in sgl-kernel and add gemm UT (#6216) 2025-05-15 09:10:40 -07:00
Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>
Co-authored-by: mingfeima <mingfei.ma@intel.com>
blzheng
0f75b907c6 [CPU] Add CMakeLists.txt for sgl-kernel (#6115) 2025-05-13 15:30:37 -07:00
Ma Mingfei
a73c4df438 Add optimized native kernels in sgl-kernel (#5150) 2025-04-08 09:37:46 -07:00
Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>
Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>
Co-authored-by: blzheng <beilei.zheng@intel.com>