When running MAT_MUL, divide the work into small chunks and have the threads execute the chunks.
Currently, if one thread stalls, it delays the final result. Testing shows that with this change, threads finish within a few nanoseconds of each other instead of being spread out over 1-2 ms. The total time is also shorter and more consistent.
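
To illustrate the idea, here is a minimal sketch of chunked work distribution. It is not the code from this commit. The commit message does not say how chunks are handed out, so the sketch assumes a shared atomic counter that each thread increments to claim its next chunk. The names `matmul_job`, `matmul_worker`, `matmul_chunked`, `CHUNK_ROWS`, and `MAX_THREADS` are hypothetical, and the naive row-major multiply stands in for the real kernel.

```c
#include <pthread.h>
#include <stdatomic.h>

#define CHUNK_ROWS  4   /* hypothetical chunk size: output rows per chunk */
#define MAX_THREADS 16  /* hypothetical upper bound on worker threads */

typedef struct {
    const float *a;         /* m x k, row-major */
    const float *b;         /* k x n, row-major */
    float       *c;         /* m x n, row-major */
    int          m, k, n;
    atomic_int   next_chunk; /* index of the next unclaimed chunk */
} matmul_job;

/* Each thread repeatedly claims the next chunk until none remain.
 * A stalled thread simply claims fewer chunks; the others pick up the rest. */
static void *matmul_worker(void *arg) {
    matmul_job *job = arg;
    const int n_chunks = (job->m + CHUNK_ROWS - 1) / CHUNK_ROWS;

    for (;;) {
        const int chunk = atomic_fetch_add(&job->next_chunk, 1);
        if (chunk >= n_chunks) {
            break;
        }
        const int row_begin = chunk * CHUNK_ROWS;
        int row_end = row_begin + CHUNK_ROWS;
        if (row_end > job->m) {
            row_end = job->m;
        }
        for (int i = row_begin; i < row_end; i++) {
            for (int j = 0; j < job->n; j++) {
                float sum = 0.0f;
                for (int p = 0; p < job->k; p++) {
                    sum += job->a[i * job->k + p] * job->b[p * job->n + j];
                }
                job->c[i * job->n + j] = sum;
            }
        }
    }
    return NULL;
}

/* Computes c = a * b using up to n_threads threads (the caller counts as one).
 * Returns the number of threads that actually took part. */
static int matmul_chunked(const float *a, const float *b, float *c,
                          int m, int k, int n, int n_threads) {
    matmul_job job = { .a = a, .b = b, .c = c, .m = m, .k = k, .n = n };
    atomic_init(&job.next_chunk, 0);

    if (n_threads < 1) n_threads = 1;
    if (n_threads > MAX_THREADS) n_threads = MAX_THREADS;

    pthread_t threads[MAX_THREADS];
    int created = 0;
    for (int t = 0; t < n_threads - 1; t++) {
        if (pthread_create(&threads[created], NULL, matmul_worker, &job) != 0) {
            break; /* fall back to fewer threads; the result is still complete */
        }
        created++;
    }

    matmul_worker(&job); /* the calling thread also claims chunks */

    for (int t = 0; t < created; t++) {
        pthread_join(threads[t], NULL);
    }
    return created + 1;
}
```

With a static split, each thread would own a fixed range of rows and a stalled thread would hold up the whole result. Here the remaining chunks go to whichever threads are free, so the finish times differ by at most roughly one chunk of work. Smaller chunks tighten that bound at the cost of more atomic increments.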