Common Implementation for MatMul and MatMulTran for both aligned and unaligned arrays #1218
Before
After
// update with changes
@tannergooding are the performance tests fine?
test/Microsoft.ML.CpuMath.PerformanceTests/AvxPerformanceTests.cs
test/Microsoft.ML.CpuMath.PerformanceTests/SsePerformanceTests.cs
@eerhardt @tannergooding can you take another look?
@tannergooding @eerhardt any more feedback?
Overall, LGTM
Fixes #1245
For inputs that are not naturally aligned (the alignment is not a multiple of 4 bytes), the common implementation performs unaligned loads exclusively.
For all other inputs, it performs at most two unaligned loads (one each for any leading/trailing unaligned elements), and all other loads are aligned.
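A rough sketch of the load-counting rule described above, assuming 4-byte floats and 16-byte SSE vectors (pass 32 for AVX). `LoadPlan` and `Count` are hypothetical names for illustration, not code from this PR, and the sketch only counts loads rather than performing them. It also assumes a partial trailing vector costs one load.

```csharp
using System;

// Hypothetical helper, not part of Microsoft.ML.CpuMath.
public static class LoadPlan
{
    private const int FloatBytes = sizeof(float); // 4 bytes per element

    // Returns how many unaligned and aligned vector loads a pass over
    // `length` floats starting at `address` would need.
    public static (int UnalignedLoads, int AlignedLoads) Count(
        ulong address, int length, int vectorBytes = 16)
    {
        int lanes = vectorBytes / FloatBytes;
        int misalignment = (int)(address % (ulong)vectorBytes);

        if (misalignment % FloatBytes != 0)
        {
            // Not naturally aligned: no element ever starts on a vector
            // boundary, so every load has to be unaligned.
            return ((length + lanes - 1) / lanes, 0);
        }

        // Elements that sit before the first vector-aligned address.
        int leading = misalignment == 0 ? 0 : (vectorBytes - misalignment) / FloatBytes;
        leading = Math.Min(leading, length);

        int remaining = length - leading;
        int aligned = remaining / lanes;   // full vectors, loaded aligned
        int trailing = remaining % lanes;  // leftover elements at the end

        // One unaligned load for the leading part, one for the trailing part.
        int unaligned = (leading > 0 ? 1 : 0) + (trailing > 0 ? 1 : 0);
        return (unaligned, aligned);
    }
}
```

Under these assumptions, `LoadPlan.Count(0x1004, 16)` gives two unaligned loads and three aligned loads, while `LoadPlan.Count(0x1002, 16)` gives four unaligned loads and no aligned ones.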
cc @tannergooding @eerhardt @danmosemsft @TomFinley