David Zhao
79c1160b07
cuda: refactored ssm_scan and use CUB (#13291)
* cuda: refactored ssm_scan to use CUB
* fixed compilation error when not using CUB
* assign L to constant and use size_t instead of int
* deduplicated functions
* change min blocks per mp to 1
* Use CUB load and store warp transpose
* suppress clang warning
2025-08-09 20:29:43 +02:00