SYCL: Add fp16 type support to unary op kernels #12788
I don't see any unit test (UT) cases enabled for FP16.
I think there are UT cases in test-backend-ops, which I disabled at the time (in #12201).
OK, I suggest enabling them and using them to test this PR.
Already enabled and tested.
It seems that in actual inference of an fp16 model (gemma 2 2B F16 in this case), the intermediate hidden embeddings are converted to fp32.
So there is no way to test the numerical stability of the fp16 operations other than test-backend-ops.
I am marking this PR "ready for review" for now to get some comments from others.
It's OK!
Thank you!
Thank you! Let's wait for others' comments before we merge it.
9553c5b to fc8d0a6
Either we do this by checking if the build is compiled with GGML_SYCL_F16 compile flag and disable it in device_supports_op function or we add info about current hardware features and check using a function.
Did we reach a conclusion on this question? I think we should not instantiate fp16 kernels if the user does not provide GGML_SYCL_F16.
eed23cd to 8398060
There are probably better ways to do this.
Need to disable fp16 support on devices that do not support fp16 in hardware.
Either we do this by checking whether the build was compiled with the GGML_SYCL_F16 compile flag and disabling it in the device_supports_op function, or we add info about current hardware features and check using a function.
Needs proper testing.
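To make the two options concrete, here is a rough sketch of how such a gate might look. It is not the actual ggml-sycl code: `tensor_type`, `device_info`, `built_with_f16` and `unary_op_supported` are hypothetical names made up for illustration, and only the GGML_SYCL_F16 macro comes from this discussion. In a real SYCL build the `has_fp16` field could presumably be filled from `dev.has(sycl::aspect::fp16)`.

```cpp
// Hypothetical stand-ins for illustration only; these are not the real ggml types.
enum class tensor_type { f32, f16 };

struct device_info {
    // Assumption: filled once at device init, e.g. from dev.has(sycl::aspect::fp16).
    bool has_fp16;
};

// Option 1: compile-time gate on the GGML_SYCL_F16 build flag.
constexpr bool built_with_f16() {
#ifdef GGML_SYCL_F16
    return true;
#else
    return false;
#endif
}

// Option 2: runtime gate on the hardware feature recorded for the device.
inline bool device_has_f16(const device_info & dev) {
    return dev.has_fp16;
}

// A supports-op style check that applies both gates to fp16 tensors
// and leaves fp32 untouched.
inline bool unary_op_supported(tensor_type type, const device_info & dev) {
    if (type == tensor_type::f32) {
        return true;
    }
    return built_with_f16() && device_has_f16(dev);
}
```

With a check like this, test-backend-ops would skip the fp16 unary cases on a build without GGML_SYCL_F16 or on a device without hardware fp16, rather than running kernels that cannot work there. Whether both gates are wanted, or only one of them, is the open question above.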