-
Notifications
You must be signed in to change notification settings - Fork 770
[SYCL] Add Clang support for FPGA loop fusion function attributes #2877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adds support for FPGA function attributes loop_fuse and loop_fuse_independent. [[intel::loop_fuse(N)]] is a strong request, to the extent possible, to fuse loops within the function, that are contained in at most N-1 other loops within the function. If the optional parameter N is omitted, it is a strong request, to the extent possible, to fuse loops within the function that are not contained in any other loop within the function. [[intel::loop_fuse_independent(N)]] is used to guarantee that fusion safety analysis can ignore negative-distance dependences between these loops. FrontEnd Specifications: The attributes take one optional parameter, a constant integral expression between 0 and 1024*1024. The paramter may be a template parameter. The same function definition can have atmost one of these two attributes. The attributes can be applied explictly to kernel. However, attributes should not be propagated to callers i.e it should not be propagated from device functions to kernel. LLVM IR is function metadata as follows: define i32 @foo() !loop_fuse !0 !0 = !{i32 N, i32 D} where N is the value specified by the optional attribute argument. If the optional argument is omitted, N is set to 1. D is equal to 0 for [[intel::loop_fuse]] and 1 for [[intel::loop_fuse_independent]]. Signed-off-by: Elizabeth Andrews <[email protected]>
I was just told this attribute should be disabled for host. I will be pushing a new patch with host support removed. I apologize for the confusion. |
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
// Handle optional attribute argument. | ||
if (Attr.isArgExpr(0)) | ||
// Attribute argument specified. | ||
S.AddOneConstantValueAttr<AttrType>(D, Attr, Attr.getArgAsExpr(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pointing out where the implementation for that is! FWIW, I would have concerns about upstreaming that function. Nothing for you to solve as part of this review though.
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
Attr)) | ||
return nullptr; | ||
|
||
if (D->hasAttr<SYCLIntelLoopFuseAttr>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usual pattern is: the mergeFooAttr()
function checks all of the merging constraints and diagnoses problematic situations like mutually exclusive attributes, duplicate attributes with different arguments, etc, and the handleFooAttr()
function checks all of the basic constraints and diagnoses problematic situations like incorrect number of attribute arguments, wrong argument type, etc. and if everything looks okay, calls mergeFooAttr()
and only adds the resulting attribute from the merge function if it's nonnull.
The availability attribute is an example of one that has somewhat complicated checking in both handle
and merge
. What you have in this patch is somewhat close, but the merge function is not diagnosing the duplicate attribute before dropping it, and the handle function is doing a bit of manual work rather than calling the merge function.
Signed-off-by: Elizabeth Andrews <[email protected]>
I am curious about this D parameter between 0 and 1024*1024... |
1. Changed attribute definitions to define only 1 semantic attribute 2. Added a couple of tests 3. Minor refactoring/code formatting Signed-off-by: Elizabeth Andrews <[email protected]>
N indicates nesting. So it needs to be 0 or more. The upper limit is just an arbitrary high value. |
Signed-off-by: Elizabeth Andrews <[email protected]>
Signed-off-by: Elizabeth Andrews <[email protected]>
Signed-off-by: Elizabeth Andrews <[email protected]>
during host compilation. Signed-off-by: Elizabeth Andrews <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM aside from some minor nits with comments.
Are the buildbots down? @romanovvlad |
Signed-off-by: Elizabeth Andrews <[email protected]>
26bbc04
Thanks for the reviews everyone! The final commit just adds missing fullstops. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (thanks to a thorough review by @AaronBallman), I just had one question below.
They should be restored now. |
Hmm... That's strange... The test passed locally for me even after I used -triple spir64-unknown-windows-sycldevice |
Signed-off-by: Elizabeth Andrews <[email protected]>
702bed5
I think the test fail might have been a name mangling issue. I don't have a Windows workspace at the moment to confirm and I was unable to reproduce the issue on Linux even after using windows triple. Hopefully the latest commit fixes the test failure. If It doesn't I guess I'll make a Windows workspace to try and reproduce the fail. |
Semantically we can translate `OpIAddCarry` and `OpISubBorrow` back into `@llvm.uadd.with.overflow` and `@llvm.usub.with.overflow` respectively. It require small transformation as there is a difference between return types in SPIR-V and LLVM IR - e.g., SPIR-V instructions return {i32, i32} struct, but LLVM intrinsics return {i32, i1}. Original commit: KhronosGroup/SPIRV-LLVM-Translator@44bd69ee70b79ed
This patch adds support for FPGA function attributes loop_fuse and
loop_fuse_independent. [[intel::loop_fuse(N)]] is a strong request,
to the extent possible, to fuse loops within the function, that are
contained in at most N-1 other loops within the function. If the
optional parameter N is omitted, it is a strong request, to the extent
possible, to fuse loops within the function that are not contained in
any other loop within the function. [[intel::loop_fuse_independent(N)]]
is used to guarantee that fusion safety analysis can ignore
negative-distance dependences between these loops.
FrontEnd Specifications:
The attributes take one optional parameter, a constant integral
expression between 0 and 1024*1024. The paramter may be a template
parameter.
The same function definition can have atmost one of these two attributes.
The attributes can be applied explictly to kernel. However, attributes
should not be propagated to callers i.e it should not be propagated from
device functions to kernel.
LLVM IR is function metadata as follows:
define i32 @foo() !loop_fuse !0
!0 = !{i32 N, i32 D}
where N is the value specified by the optional attribute argument. If
the optional argument is omitted, N is set to 1. D is equal to 0 for
[[intel::loop_fuse]] and 1 for [[intel::loop_fuse_independent]].
Signed-off-by: Elizabeth Andrews [email protected]