-
Notifications
You must be signed in to change notification settings - Fork 769
WIP on 'visitor' model for the SemaSYCL refactor #1495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Erich Keane <[email protected]>
Signed-off-by: Erich Keane <[email protected]>
clang/lib/Sema/SemaSYCL.cpp
Outdated
else if (ArgTy->isScalarType()) | ||
KF_FOR_EACH(handleScalarType); | ||
else | ||
llvm_unreachable("Unsupported kernel parameter type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This almost definitely shouldn't be here. I should add a "handleOtherType" that is only implemented by the diagnostics, which simply diagnoses with an error. Everyone else can ignore it.
Just adding myself to reviewers to not miss what is going on. |
clang/lib/Sema/SemaSYCL.cpp
Outdated
void handleSyclAccessorType(const FieldDecl *, QualType){} | ||
void handleSyclSamplerType(const FieldDecl *, QualType){} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think handling for accessor and sampler types will be identical. Do we really need to split it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could swear I saw a spot where we handled them separately (but I don't have the terminal open, but I want to say the Integration Header?). I would think we can keep them split for the near future, and combine them if we find they aren't particularly useful to be separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. The handling in integration header can be different for accessors and samplers, but It should be identical for generation of OpenCL kernel body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that matches with my understanding as well! I presume the OpenCL Kernel body/decl will likely implement theirs in a function that this just calls. Maybe I'll make that part a bit easier when I get there.
clang/lib/Sema/SemaSYCL.cpp
Outdated
}; | ||
|
||
// A type to Create and own the FunctionDecl for the kernel. | ||
class SYCLKernelHeader : public SyclKernelFieldHandler<SYCLKernelHeader> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this class is needed at all?
What is kernel obj header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure yet :) I was brainstorming based on the current things we do. The idea was to have this + the error-checking version replace BuildArgTys. This is more just to build/own the declaration and much of the stuff that happens in ConstructOpenCLKernel.
clang/lib/Sema/SemaSYCL.cpp
Outdated
// A base type that the SYCL OpenCL Kernel construction task uses to implement | ||
// individual tasks. | ||
template<typename Derived> | ||
class SyclKernelFieldHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need one of these for the integration header generation.
clang/lib/Sema/SemaSYCL.cpp
Outdated
void handleSyclAccessorType(const FieldDecl *, QualType){} | ||
void handleSyclSamplerType(const FieldDecl *, QualType){} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could swear I saw a spot where we handled them separately (but I don't have the terminal open, but I want to say the Integration Header?). I would think we can keep them split for the near future, and combine them if we find they aren't particularly useful to be separate.
…or. Still need IntHeader and kernel body generation Signed-off-by: Erich Keane <[email protected]>
Did a pretty sizable amount of work this evening, and go tmyself to the point where I think I have a pretty good framework. I still don't have the body of the SYCL Kernel being generated, but I think I've replaced all of buildArgTys. I also added the type for Integration Headers, but didn't do any implementation for it. |
Signed-off-by: Erich Keane <[email protected]>
Signed-off-by: Erich Keane <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, it evolves.
I can take a look into kernel body generation.
} | ||
|
||
void handlePointerType(const FieldDecl *FD, QualType ArgTy) final { | ||
// TODO: Can we document what the heck this is doing?! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that would be nice. But it's not so hard to understand what is happening here looking into the code.
This is just handling of USM pointers. I haven't seen any doc about implementation details of USM, only specs (https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions/USM).
Basically USM allows to use raw pointers instead of buffers/accessors, but these pointers point to the specially allocated memory. For any pointer we add some argument with the same type but global address space, because OpenCL requires it.
The problem here that we don't know how to differentiate USM pointers from any others, so we stupidly consider them all as USM pointers. This can lead to some problems, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've read both the code and your post and don't have a good hold of it other than it seems to be just creating a pointer to the same type, except global. Patches that do a good job explaining the details would be welcome :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi. Sorry.
At least when I did this, kernel args to OpenCL that are pointers had to be in the global address space.
I make no claims to having done the right thing, only that I did something that got pointers to work.
And yes - I don't see a way to disambiguate 'normal' C++ pointers from USM pointers since USM is very specifically trying to avoid modifying the type system.
Previously, the compiler was rejecting all programs with pointers, so things that should work wouldn't.
We had to make it so any pointer works, which means that legal things now work, but illegal things will crash. I still view that as an improvement.
clang/lib/Sema/SemaSYCL.cpp
Outdated
|
||
// Generates the OpenCL kernel using KernelCallerFunc (kernel caller | ||
// function) defined is SYCL headers. | ||
// function) defined is Sycl headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was being incredibly inconsistent with my naming at one point, so I ran a find/replace. I guess it got a little out of hand :)
clang/lib/Sema/SemaSYCL.cpp
Outdated
static FunctionDecl *initKernelObj(ASTContext &Ctx, StringRef Name, | ||
SourceLocation Loc, bool IsInline) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why it is called initKernelObj
. Shouldn't it be something like createKernelDecl
?
BTW, I don't think that OpenCL kernel somewhat can be inline. o_O
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implicit inline logic came from the existing implementation.
Naming things is hard... 'initKernelObj' (with the variable named KernelObj) made sense at the time shrug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think because the name KernelObj
is incorrect.
FunctionDecl *KernelObj;
I think this variable contains declaration (and definition after we generate the body) of OpenCL kernel. We have to rename it so it will be easy to understand that this is OpenCL kernel function declaration. Something like KernelDecl
.
clang/lib/Sema/SemaSYCL.cpp
Outdated
// A type that handles the accessor recursion and acts as a base for | ||
// SyclKernelDeclCreator. It doesn't 'own' anything other than the KernelObj | ||
// pointer and functionality required to add a param. | ||
class SyclKernelDeclBase : public SyclKernelFieldHandler<SyclKernelDeclBase> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really get why do we need a base and a child class for kernel decl creation.
Could you please elaborate a bit more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thought is there are 2 extremely similar cases: 1- When dealing with the 'root' of the call tree, and 2- when recursing down a 'struct'.
The struct only gets 2 of the cases (recursing down the struct plus accessors). However, because the two cases are so extremely similar (and the 'base' needs just about all the fields/abilities anyway), it seemed to make sense to make it a base class.
I'd love to see another suggestion here, I wanted to maximize code reuse without having a bunch of free-functions that seemed to live outside the visitor, but I didn't immediately come up with a better one.
I'll do a bunch of the renaming/changes that @Fznamznon suggested on this patch. |
Signed-off-by: Erich Keane <[email protected]>
Signed-off-by: Erich Keane <[email protected]>
…ill need to fix the integration header offset calculation/keeping track Signed-off-by: Erich Keane <[email protected]>
…. Quitting for the weekend, pushing so that others can have a look at the progress/help how they can
Ok, pushed a couple more commits. These changed the look of the visitor slightly to handle recursion and base types. Additionally, it SHOULD properly handle accessors as both fields and base types all the way up the tree. We need tests for that, there are quite a few TODOs, we still need the body implementation, it would be nice to get Elizabeth's implementation of memcpy working alright (perhaps after body + a squash?), and it would be nice to get arrays working as well. For now, I'm just quitting for the weekend, I'll pick this up on monday. |
Signed-off-by: Erich Keane <[email protected]>
Signed-off-by: Erich Keane <[email protected]>
Signed-off-by: Erich Keane <[email protected]>
…y creator to get the list of parameters for the currently handled field Signed-off-by: Erich Keane <[email protected]>
else if (Util::isSyclStreamType(ItemTy)) | ||
(void)std::initializer_list<int>{ | ||
(handlers.handleSyclStreamType(Item, ItemTy), 0)...}; | ||
else if (ItemTy->isStructureOrClassType()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you need the following, otherwise an array of accessors within a struct do not get processed.
else if (ItemTy->isArrayType())
(void)std::initializer_list<int>{
(handlers.handleArrayType(Item, ItemTy), 0)...};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I didn't see in the old code it recursing through arrays, are we intending to split them out as well?
In that case, we probably need to better handle the offsets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue captured here: #1567
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is quite natural to recurse through structs to any level to discover accessors and the new visitor model does that.
The existing implementation does not deal with arrays at all so you won't see any code there.
My support for arrays accounted for arrays of accessors, i.e., the array type is one of the special sycl types. When such an array type was encountered it would "unroll" each element of the array and generate a separate descriptor for it. It did not try to handle arrays of other struct types that contained accessors (or arrays of accessors) as members, or more deeply nested members.
I suggest we support arrays of accessors only, not arrays of structs that may in turn contain accessors.
If so, the proposed addition to the visitor function will do the job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd have to support both, arrays of structs of accessors, or even arrays of structs of arrays of accessors.
It is a presumably fairly easy addition however, but I'll hold it off for another patch, we have a fairly sizable change coming from @elizabethandrews on how parameters are passed, so I'd like to hold off on arrays until after that.
// class/field graph. Int Headers use this to calculate offset, most others | ||
// don't have a need for these. | ||
|
||
virtual void enterStruct(const CXXRecordDecl *, const FieldDecl *) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if leaveStruct will ever be needed. The offsets generated for the integration header are monotonically increasing. If the lambda object members are visited in ascending sequence, the offset value should be steadily increasing, thus it will never need bumping down.
I confess I don't fully understand the base class stuff either. The old code made no reference to base classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the 'leaveStruct' so that we can 'undo' the offset, that way we don't have to store the entire 'stack' of parents. Instead, we can store just a single offset, and add/subtract it as needed. For example, if we are deep in a set of structures and see:
struct S { accessor a; };
struct U { /*other fields;*/S s1, s2; };
We would have to 'know' about U and the entire set of parents in order to calculate the offset. By doing the 'leaveStruct' (see the Int Header implementation), each time you enter a field, you simply increase by your offset, then decrease by the same offset when we leave. So in the above:
We're in 'U', offset is something like 95 (random number).
We enter 's1', which we know is at offset '40' in 'U', so we increase our offset by that much.
We leave 's1', we can calculate the offset as '40' in 'U' and just decrease it, getting us back to 95.
Then, we enter 's2', we can do the same.
Otherwise, we'd have to know 'U''s offset, and our current field/bases' offset as well. I couldn't come up with a way of doing that which didn't store the stack of currently visited things.
Should we close this PR? |
This is my first run at how I think we can re-arch this to better share work. Suggestions/patches welcome :)
I've implemented ONLY the arg-type diagnostics so far, but I intend to start looking at other parts when I get a chance.
Signed-off-by: Erich Keane [email protected]