| Commit | Message | Date |
|---|---|---|
| c84151eef9 | fix issues | 2026-06-26 12:55:02 +08:00 |
| b5806731e0 | some op overhead optimization | 2026-06-19 11:19:39 +08:00 |
| 50e3a05fb0 | fix incorrect MoE step to ensure decoding speed | 2026-06-12 11:44:50 +08:00 |
| 629f878c28 | initial commit for qwen3.6-moe adaptation | 2026-06-12 10:10:49 +08:00 |
| 2d1ef50992 | chunked prefill support and memory opts | 2026-06-05 16:03:34 +08:00 |
| 8c047a70ea | some modifications to ensure 50K context input | 2026-06-04 17:56:29 +08:00 |
| 3ef8227384 | initial version of adding chunked attention, ensuring 20K context | 2026-05-29 16:49:33 +08:00 |
| 0e89906481 | Qwen3.6-27B iluvatar bi-v100 adaptation | 2026-05-21 16:37:24 +08:00 |