| Author | Commit | Message | Date |
|---|---|---|---|
| YanbingJiang | 1fe691a429 | Fix FP8 block quantization when N or K is not multiples of 128 (#8648) | 2025-08-01 15:57:19 -07:00 |
| YanbingJiang | 0e05fe8cf4 | Update seed in CPU UTs to avoid flaky failure with single test (#7544) | 2025-06-25 21:25:50 -07:00 |
| Chunyuan WU | 9179ea1595 | add seed in CPU UTs to avoid flaky failure (#7333) | 2025-06-18 19:12:14 -07:00 |
| YanbingJiang | 094c116f7d | Update python API of activation, topk, norm and rope and remove vllm dependency (#6614); Co-authored-by: Wu, Chunyuan <chunyuan.wu@intel.com>; Co-authored-by: jianan-gu <jianan.gu@intel.com>; Co-authored-by: sdp <sdp@gnr799219.jf.intel.com> | 2025-06-17 22:11:50 -07:00 |
| YanbingJiang | fcde67b016 | CPU: map changes from developing branch in sgl-kernel (#6833); Co-authored-by: mingfeima <mingfei.ma@intel.com> | 2025-06-10 01:08:15 -07:00 |
| Chunyuan WU | 3ded6235c9 | Add fp8 fused_experts kernel for CPU in sgl-kernel and add UT (#6404) | 2025-05-23 02:01:55 -07:00 |