maxiao | 852a49c5cc | adapt to dsv32 on dcu | 2025-09-30 18:37:31 +08:00
Lianmin Zheng | e290303ea1 | [Auto Sync] Update elementwise.py (20250923) (#10823); Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>; Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com> | 2025-09-23 13:50:22 -07:00
Lianmin Zheng | 86d10d220f | Update grok.py and tiktoken tokenizer (#9532) | 2025-08-23 05:40:18 -07:00
Lianmin Zheng | 22352d47a9 | Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632); Co-authored-by: Kan Wu <wukanustc@gmail.com> | 2025-06-29 23:16:19 -07:00
kk | 5a144a8ab9 | Fix run time error in ROCm platform (#5147); Co-authored-by: wunhuang <wunhuang@amd.com>; Co-authored-by: root <root@dell300x-pla-t10-17.pla.dcgpu> | 2025-04-07 22:49:40 -07:00
Lianmin Zheng | c6d7f8d370 | Add some fused elementwise kernels for grok-1 (#4398); Co-authored-by: dhou-xai <dhou@x.ai>; Co-authored-by: Hanming Lu <69857889+hanming-lu@users.noreply.github.com> | 2025-03-13 13:39:10 -07:00