ll819214 | 506a2d5934 | npu fused op (#7386); Co-authored-by: Li Junwen <lijunwen13@hisilicon.com> | 2025-06-25 01:54:20 -07:00 |
YanbingJiang | 094c116f7d | Update python API of activation, topk, norm and rope and remove vllm dependency (#6614); Co-authored-by: Wu, Chunyuan <chunyuan.wu@intel.com>; Co-authored-by: jianan-gu <jianan.gu@intel.com>; Co-authored-by: sdp <sdp@gnr799219.jf.intel.com> | 2025-06-17 22:11:50 -07:00 |
Yijie Zhu | a39d928782 | support qwen2 running on ascend npu device (#7022); Co-authored-by: 刁莹煜 <diaoyingyu1@hisilicon.com> | 2025-06-17 11:24:10 -07:00 |
woodx | e30ef368ab | Feat/support rerank (#6058) | 2025-06-16 10:50:01 -07:00 |
JieXin Liang | 97cb762bb6 | [misc] remove is_cuda_available (#5319) | 2025-04-20 18:16:51 -07:00 |
Lianmin Zheng | 177320a582 | Clean up imports (#5467) | 2025-04-16 15:26:49 -07:00 |
Yineng Zhang | 65b7c9b78f | cleanup deps 2/n (#4464) | 2025-03-15 23:06:17 -07:00 |
Xiuyu Li | 9545bfb28a | fix: support gelu_new activation function in gpt2 (#3712) | 2025-03-04 04:09:52 -08:00 |
Yineng Zhang | 8db776f049 | support QuickGELU (#3250) | 2025-02-01 19:31:47 +08:00 |
Yineng Zhang | 4eb4b401cc | update and simplify CustomOp (#3249) | 2025-02-01 18:56:44 +08:00 |
Yineng Zhang | 2f79f58873 | feat: use sgl-kernel 0.0.3 in sglang (#3179) | 2025-01-27 21:39:52 +08:00 |
Yineng Zhang | 5dc54f1a62 | feat: remove vllm distributed (#2907); Co-authored-by: Zhangyi <1109276519@qq.com> | 2025-01-17 22:31:51 +08:00 |
Xuehai Pan | 62a4a339eb | docs: fix module docstrings and copyright headers (#2077) | 2024-11-22 22:16:53 +08:00 |
Yineng Zhang | 766192610e | feat: update torch 2.5.1 (#2069) | 2024-11-18 21:29:13 +08:00 |
Lianmin Zheng | c1f401fc58 | Revert "chore: update torch v2.5.1" (#2063) | 2024-11-17 15:29:38 -08:00 |
Yineng Zhang | 3b878863f7 | chore: update torch v2.5.1 (#1849) | 2024-11-18 00:06:00 +08:00 |
Lianmin Zheng | ebbc42d989 | Optimize broadcast & Reorg code (#1598) | 2024-10-07 13:19:23 -07:00 |
Lianmin Zheng | 6a5b352aaf | Use is_flashinfer_available to replace is_hip for flashinfer check (#1596); Co-authored-by: Zhang Liangang <liangang.zhang@intel.com> | 2024-10-06 22:54:05 -07:00 |
Yineng Zhang | b4408b0d16 | feat: update linear deps 1/N (#1305) | 2024-09-19 20:53:11 +08:00 |
HAI | aa2750beb3 | [Bugfix] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1419) (#1453) | 2024-09-18 02:01:35 -07:00 |
HAI | 3a6e04185b | [Feature, Hardware] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1420) | 2024-09-17 07:43:52 +00:00 |
Yineng Zhang | c411f32e1c | feat: replace GeluAndMul (#1234) | 2024-08-28 14:07:02 +00:00 |
Yineng Zhang | 198974cd1a | feat: support sm75 with FlashInfer v0.1.6 (#1233) | 2024-08-28 18:39:12 +10:00 |
Yineng Zhang | 3602692c7c | feat: replace get_act_fn for gpt_bigcode (#1231) | 2024-08-27 21:15:31 +10:00 |
Yineng Zhang | c9064e6fd9 | feat: use gelu_tanh_and_mul (#1193) | 2024-08-24 01:58:16 -07:00 |
Yineng Zhang | 1fb9459908 | fix: custom op fallback forward native when lower sm80 (#1177) | 2024-08-21 14:26:35 -07:00 |
Lianmin Zheng | a59636bb5e | Update grok 1 model (#1095) | 2024-08-14 04:40:44 -07:00 |
Lianmin Zheng | fb1f28cbbb | Clean up the comments and names under python/sglang/srt/layers (#1047) | 2024-08-12 05:54:37 +00:00 |
Yineng Zhang | c245b78973 | hotfix: add CustomOp abstraction (#1027) | 2024-08-11 02:45:59 -07:00 |
Yineng Zhang | 94752ac811 | feat: use FlashInfer rmsnorm and silu (#907) | 2024-08-11 14:57:13 +10:00 |