-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL]Enable Dead Function Elimination in sycl-post-link #2723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -124,6 +124,10 @@ static cl::opt<bool> EmitKernelParamInfo{ | |
"emit-param-info", cl::desc("emit kernel parameter optimization info"), | ||
cl::cat(PostLinkCat)}; | ||
|
||
static cl::opt<bool> DeadFunctionElimination{ | ||
"dead-function-elimination", | ||
cl::desc("Eliminate dead functions in device image"), cl::cat(PostLinkCat)}; | ||
|
||
struct ImagePropSaveInfo { | ||
bool NeedDeviceLibReqMask; | ||
bool DoSpecConst; | ||
|
@@ -566,6 +570,32 @@ static string_vector saveResultSymbolsLists(string_vector &ResSymbolsLists) { | |
return std::move(Res); | ||
} | ||
|
||
// Eliminate 'dead' functions which are not called in device LLVM IR module, | ||
// there is one execption: functions with 'reference-indirectly' attribute | ||
// can't be eliminated since they will be called indirectly via function ptr. | ||
static void eliminateDeadFunctions(Module &M) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This code duplicated functionality LLVM provides with GlobalDCEPass. I suggest re-using LLVM pass to avoid unnecessary maintenance. |
||
std::vector<Function *> DeadFunctions; | ||
bool NoDeadFunction = false; | ||
while (!NoDeadFunction) { | ||
DeadFunctions.clear(); | ||
for (Function &F : M) { | ||
if (F.user_empty() && (F.getCallingConv() == CallingConv::SPIR_FUNC) && | ||
!F.getAttributes().hasFnAttribute("referenced-indirectly")) { | ||
F.deleteBody(); | ||
DeadFunctions.push_back(&F); | ||
} | ||
} | ||
|
||
if (!DeadFunctions.empty()) { | ||
for (Function *F : DeadFunctions) { | ||
M.getFunctionList().remove(F); | ||
} | ||
NoDeadFunction = false; | ||
} else | ||
NoDeadFunction = true; | ||
} | ||
} | ||
|
||
#define CHECK_AND_EXIT(E) \ | ||
{ \ | ||
Error LocE = std::move(E); \ | ||
|
@@ -612,11 +642,18 @@ int main(int argc, char **argv) { | |
"will produce single output file example_p.bc suitable for SPIRV\n" | ||
"translation.\n"); | ||
|
||
bool DoSplit = SplitMode.getNumOccurrences() > 0; | ||
bool DoSpecConst = SpecConstLower.getNumOccurrences() > 0; | ||
bool DoParamInfo = EmitKernelParamInfo.getNumOccurrences() > 0; | ||
|
||
if (!DoSplit && !DoSpecConst && !DoSymGen && !DoParamInfo) { | ||
// DeadFunctionElimination is used for removing some unused function in | ||
// ORIGINAL IR, it must work with IROutputOnly and can't work with other | ||
// options such as DoSplit, DoSpecConst, DoParamInfo... | ||
bool DoSplit = | ||
(SplitMode.getNumOccurrences() > 0 && !DeadFunctionElimination); | ||
bool DoSpecConst = | ||
(SpecConstLower.getNumOccurrences() > 0 && !DeadFunctionElimination); | ||
bool DoParamInfo = | ||
(EmitKernelParamInfo.getNumOccurrences() > 0 && !DeadFunctionElimination); | ||
|
||
if (!DoSplit && !DoSpecConst && !DoSymGen && !DoParamInfo && | ||
!DeadFunctionElimination) { | ||
errs() << "no actions specified; try --help for usage info\n"; | ||
return 1; | ||
} | ||
|
@@ -648,6 +685,12 @@ int main(int argc, char **argv) { | |
if (OutputFilename.getNumOccurrences() == 0) | ||
OutputFilename = (Twine(sys::path::stem(InputFilename)) + ".files").str(); | ||
|
||
if (DeadFunctionElimination && IROutputOnly) { | ||
eliminateDeadFunctions(*MPtr); | ||
saveModule(*MPtr, OutputFilename); | ||
return 0; | ||
} | ||
|
||
std::map<StringRef, std::vector<Function *>> GlobalsSet; | ||
|
||
if (DoSplit || DoSymGen) { | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side comment: I'm sure I understand the first comment in this scope.
I think
opt
is the right tool to do LLVM IR transformations. Adding "DeadFunctionElimination" functionality tosycl-post-link
tool seems like a bad idea as it duplicates already existingopt
functionality.I think we should remove "IR transformations" functionality from the
sycl-post-link
tool and use dedicatedopt
tool instead. I don't have specific instructions how to re-write the code, but we should work in this direction.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @bader
I agree that moving the "DeadFunctionElimination" functionality from sycl-post-link to llvm opt is a better idea. This requires us to enable opt in compiler package, opt tool is not generated in SYCL compiler package currently.
However, existing llvm "GlobalDCE" pass can't meet our requirements although it is much more complicated. The reason is existing llvm "GlobalDCE" has different pre-assumption from SYCL device code.
I dumped device code (LLVM IR .bc format) with all device libraries being linked in and used following command to try to eliminate "dead" code:
opt -globaldce redudant.bc -o reduced.bc
However, the redundant functions from device libraries were not eliminated at all. The reason is all "external" linkage functions who has a function body in current module will be marked as "Alive" and won't be eliminated: https://github.com/intel/llvm/blob/sycl/llvm/lib/Transforms/IPO/GlobalDCE.cpp#L321
The functions from device libraries are "external" and have function body, so they will survive in "GlobalDCE".
The pre-assumption "GlobalDCE" is based on seems to make sense as the "GlobalDCE" pass doesn't know whether current module is completely linked or not, the exported functions may be used if current module linked with other modules later. In our scenario, "DeadFunctionElimination" will work after llvm-link finishes linking everything including device libraries and its input is a completely linked device image(In spv online link mode, some "_devicelib* references will be handled by sycl runtime") which means all SYCL functions who has a function body will not be used by other modules later. And the logic of "DeadFunctionsElimination" is much simpler than existing "GlobalDCE", we only need to go through all spirv functions and if the function doesn't have any "user" in current module, we can remove it. The only exception are functions with "referenced-indirectly" attribute, we need to keep them to support device function pointers.
Existing llvm "GlobalDCE" also includes something which are not necessary in SYCL device scenario such as handling virtual functions, scanning vtable load, handling ifuncs...
So, in order to enable "DeadFunctionElimination" in sycl toolchain, we may have following ways:
In 3 and 4, we need to enable opt after completely linked device image is generate by llvm-link.
Which one do you prefer or do you have any other good ideas?
Thanks very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to just have a small pass cleaning up the code first before using plain GlobalDCE?
Similar to for example https://github.com/triSYCL/llvm/blob/sycl/master/lib/SYCL/SYCLKernelFilter.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @keryell
It seems to be a different scenario. In compilation phase, we can run "GlobalDCE" to each module no matter it is "completely linked" or not.
"DeadFunctionElimination" introduced by this patch aims to eliminate SPIRV functions which are not called in device code(mainly for cleaning up unused functions introduced by device libraries), so its input must be a "completely linked" device image. We can only get such "completely linked" device image when llvm-link finishes linking everything.
But I think moving the functionality to a small SYCL pass should be the right direction. At least, we can make sycl-post-link.cpp more clean by doing so.
Thanks very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @bader here. Originally the post-link tool's purpose was to handle device code splitting. Then spec constants handling was added as it depended on the splitting. Now it turns more and more into an LTO driver. This is not the purpose of the tool.
I'm not sure if we can or it makes sense to reuse the LLVM LTO infra for our device code - maybe @andykaylor has a perspective. To me using
opt
-based Driver job to run necessary LTO passes like the GlobalDCE before handing off the optimized IR to SYCL post-link is the right way to go at this point. Later we can decide to reuse LTO infra and adjust accordingly. Linking with default device libraries should also be factored out of the post-link into a driver job, IMO.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @kbobrovs
Thank you for suggestion.
Currently, we have enabled functionality to analyze device code to get required device libraries(spv file) in sycl-post-link tool too which is not the purpose of sycl-post-link tool either.
I am not sure whether we can enable llvm "opt" tools in device compilation workflow, if "opt" can be enabled, we can put "dead function elimination" or "device code device library requirement analysis" into llvm passes and run them via opt offline. This will require some driver work, we need to add "opt" in driver workflow but will make sycl-post-link clean.