Lianmin Zheng | 00c7e6368b | Release v0.3.3.post1 (#1636) | 2024-10-11 07:56:16 -07:00 |
Janumala Akhilendra | 81c3327402 | Added a "Back To Top" Button (#1633) | 2024-10-11 06:25:30 -07:00 |
Lianmin Zheng | 5476ccad8f | Update README.md | 2024-10-11 01:59:49 -07:00 |
Lianmin Zheng | b040ed71f7 | Update README.md (#1629) | 2024-10-11 01:58:25 -07:00 |
Kushal Agrawal | c9e6658699 | Update README.md (#1625) | 2024-10-11 01:57:42 -07:00 |
Lianmin Zheng | 7b69d91b4f | Release v0.3.3 (#1605) | 2024-10-08 12:58:41 -07:00 |
Lianmin Zheng | f7cce751f9 | Update README.md (#1591) | 2024-10-06 15:14:29 -07:00 |
Ying Sheng | 1c1bdc7699 | [Event] Update README.md (#1572) | 2024-10-05 11:16:47 -07:00 |
Ikko Eltociear Ashimine | f8fb4ce9b0 | chore: update README.md (#1580) | 2024-10-05 11:05:57 -07:00 |
Theresa Barton | 2c7d0a5b8b | [Fix] Fix all the Huggingface paths (#1553) | 2024-10-02 10:12:07 -07:00 |
Lianmin Zheng | 048685430d | Improve process creation (#1534) | 2024-09-29 02:36:12 -07:00 |
Lianmin Zheng | 4e4459b91f | Multiple minor fixes (#1530) | 2024-09-28 14:43:35 -07:00 |
Kylin | f42e9bfb52 | [bugfix] Add modelscope package to avoid docker image without modelscope (#1520) | 2024-09-28 12:43:22 -07:00 |
Ying Sheng | b1e330bcb0 | [Event] Update meeting link (#1529) | 2024-09-27 13:30:04 -07:00 |
Ying Sheng | 37c5899fc2 | Release v0.3.2 (#1512) | 2024-09-25 14:17:09 +08:00 |
TianyiQ | 3c93187caf | Add support for tie_word_embeddings when loading weights + support for SmolLM (#1508) | 2024-09-24 21:50:20 -07:00 |
Lianmin Zheng | 167591e864 | Better unit tests for adding a new model (#1488) | 2024-09-22 01:50:37 -07:00 |
Yineng Zhang | 82136eb0b5 | chore: bump v0.3.1.post3 (#1483) | 2024-09-21 11:17:45 +08:00 |
Niklas Muennighoff | 014982b5e0 | Add OLMoE (#1476) | 2024-09-20 10:32:49 +08:00 |
Lianmin Zheng | 5ce55aee15 | Release v0.3.1.post2 (#1470) | 2024-09-19 02:03:38 -07:00 |
Lianmin Zheng | 2d346a57c2 | Fix padding in the cuda graph (#1469) | 2024-09-19 01:52:15 -07:00 |
Ying Sheng | 8f527e2940 | [Event] Add public meeting invite to README (#1458) | 2024-09-18 23:53:22 +08:00 |
Ke Bao | c6b6d2e71b | Enable MLA by default (#1447) | 2024-09-17 11:42:48 +00:00 |
Lianmin Zheng | 90a26be31c | Release 0.3.1.post1 (#1445) | 2024-09-17 01:47:31 -07:00 |
Lianmin Zheng | e79f6cd73d | Release v0.3.1 (#1430) | 2024-09-15 23:03:16 +09:00 |
Lianmin Zheng | 9463bc1385 | Enable torch.compile for triton backend (#1422) | 2024-09-14 15:38:37 -07:00 |
hxer7963 | c33d82a211 | Add Support for XVERSE Models (Dense and MoE) to sglang (#1397); Co-authored-by: will he <hexin@xverse.cn>; Co-authored-by: root <root@localhost.localdomain>; Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2024-09-12 01:47:52 -07:00 |
William | 2a71be5e25 | Fix README format (#1399) | 2024-09-11 23:46:51 -07:00 |
Vectory | 224200e3c2 | BaiChuan2 Model (#1367); Co-authored-by: wanpenghan <wanpenghan@sohu-inc.com> | 2024-09-11 03:55:24 -07:00 |
Lianmin Zheng | 46094e0c1b | Deprecate --disable-flashinfer and introduce --attention-backend (#1380) | 2024-09-10 17:11:16 -07:00 |
William | e72275cf7f | Support MiniCPM3 (#1371) | 2024-09-10 19:57:52 +10:00 |
Lianmin Zheng | 8d1095dbf0 | [Docs] Improve documentations (#1368) | 2024-09-09 20:48:28 -07:00 |
Yineng Zhang | 5ab9418f5b | [Doc] update news (#1327) | 2024-09-04 04:21:21 -07:00 |
Yineng Zhang | a63c8275c6 | chore: bump v0.3.0 (#1320) | 2024-09-04 04:32:18 +08:00 |
Lianmin Zheng | c500f96bb1 | Update README.md for llava-onevision instructions (#1313) | 2024-09-03 01:43:08 -07:00 |
Lianmin Zheng | f64eae3a29 | [Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308) | 2024-09-02 21:44:45 -07:00 |
Lianmin Zheng | 9999442756 | Release v0.2.15 (#1295) | 2024-09-01 22:22:38 -07:00 |
Byron Hsu | 4a9f8ea43b | [doc] Fix more broken links (#1294) | 2024-09-01 14:46:36 -07:00 |
Byron Hsu | 6cc9c52521 | [doc] fix quick start link (#1282) | 2024-08-31 22:54:34 -07:00 |
Lianmin Zheng | 79ece2c51f | Report median instead of mean in bench_latency.py (#1269) | 2024-08-30 06:05:01 -07:00 |
김종곤 | 55f5976b42 | Update README.md - Supported Models add Exaone 3.0 (#1267); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-08-30 18:49:07 +10:00 |
Yineng Zhang | 13ac95b894 | chore: bump v0.2.14.post2 (#1250) | 2024-08-28 18:46:33 +00:00 |
Lianmin Zheng | bf53bf5142 | [Fix] Fix llava on multi images (#1247) | 2024-08-28 06:33:05 -07:00 |
Yineng Zhang | f25f4dfde5 | hotfix: revert sampler CUDA Graph (#1242) | 2024-08-28 21:16:47 +10:00 |
Lianmin Zheng | 184ae1c683 | Update README.md (#1239) | 2024-08-28 02:15:52 -07:00 |
Yineng Zhang | 198974cd1a | feat: support sm75 with FlashInfer v0.1.6 (#1233) | 2024-08-28 18:39:12 +10:00 |
Dr. Artificial曾小健 | c8a9e79186 | Fix readme (#1236) | 2024-08-27 23:51:41 -07:00 |
Yineng Zhang | c5fe11a8e1 | chore: bump v0.2.14 (#1155) | 2024-08-27 00:28:24 +10:00 |
Chayenne | 30b4f771b0 | Support Alibaba-NLP/gte-Qwen2-7B-instruct embedding Model (#1186); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-08-25 10:29:12 -07:00 |
Lianmin Zheng | b20daf982a | Update README.md (#1198) | 2024-08-24 14:50:05 -07:00 |