[CIR][WIP] Add ABI lowering pass #1471
…e BaseClassAddrOp
It was always the intention for `cir.cmp` operations to return a bool result. Due to missing constraints, a bug slipped into codegen that created `cir.cmp` operations whose result type matched the original AST expression type. In C, as opposed to C++, boolean expressions have type `int`. This resulted in extra operations being codegened around boolean expressions and their uses. This commit both enforces the bool result type in the `cir.cmp` op definition and fixes the mentioned bug.
This is the first patch to support TBAA, following the discussion at llvm#1076 (comment):
- add skeleton for CIRGen, utilizing `decorateOperationWithTBAA`
- add empty implementation in `CIRGenTBAA`
- introduce `CIR_TBAAAttr` with empty body
- attach `CIR_TBAAAttr` to `LoadOp` and `StoreOp`
- no handling of vtable pointer
- no LLVM lowering
The title describes the purpose of the PR. It adds initial support for structures with padding to the call convention lowering for AArch64. I have also added _initial support_ for the missing [FinishLayout](https://github.com/llvm/clangir/blob/5c5d58402bebdb1e851fb055f746662d4e7eb586/clang/lib/AST/RecordLayoutBuilder.cpp#L786) feature for records, with the logic adapted from the original codegen. Finally, I added a test for verification.
…m#1152) The function `populateCIRToLLVMConversionPatterns` contains a spaghetti of LLVM dialect conversion patterns, which leads to merge conflicts very easily. Besides, a few patterns are even registered more than once, possibly due to careless resolution of merge conflicts. This PR attempts to mitigate the problem: pattern names are now sorted alphabetically, and each source line lists exactly one pattern name to reduce potential merge conflicts.
…ltinExpr (llvm#1133) This PR is an NFC, as we just emit NYI for every builtin ID of Neon SISD. We will implement them in subsequent PRs.
…CFG (llvm#1147) This PR implements an NYI in CIRScopeOpFlattening. It seems to me the best way is to forward the results of ScopeOp as block arguments of the last block split from the cir.scope block.
…der (llvm#1157) With [llvm-project#116090](llvm/llvm-project#116090) merged, we can get rid of `#include "../../../../Basic/Targets.h"` now.
This PR moves the lowering code for data member pointers from the conversion patterns to the implementation of CXXABI because this part should be ABI-specific.
…ant arrays or pointers (llvm#1136) This PR adds support for AArch64 function arguments with structs that contain constant arrays or pointers. For example:

```
typedef struct {
  int a[42];
} CAT;
void pass_cat(CAT a) {}
```

As usual, the main ideas are taken from the original [CodeGen](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/lib/AST/ASTContext.cpp#L1823), and I have added a couple of tests.
…ge (64, 128) (llvm#1141) This PR adds support for lowering AArch64 calls with structs whose sizes are greater than 64 and less than 128 bits. The idea is from the original [CodeGen](https://github.com/llvm/clangir/blob/da601b374deea6665f710f7e432dfa82f457059e/clang/lib/CodeGen/CGCall.cpp#L1329), where we perform a coercion through memory for these types of calls. I have added a test for this.
Recommit llvm#1101 I am not sure what happened, but that merged PR doesn't show up in the git log. Maybe the stacked PR didn't land successfully? In any case, we need to land it again. The original commit messages follow: --- This is the follow-up of llvm#1100. After llvm#1100, when we want to use LongDouble for VAArg, we get in trouble due to details of X86_64's ABI, and this patch tries to address that. The practical impact of the patch is that, with llvm#1088 and a small follow-up fix, we can build and run all of C's benchmarks in SPEC CPU 2017. I think it is a milestone.
…vm#1151) These are rounding shifts of vectors, where the shift amount comes from the least significant byte of the corresponding element of the second input vector. Thus, each is implemented with [its own ASM](https://godbolt.org/z/v65sbeKaW). This makes them unsuitable for lowering to CIR ShiftOp, even though it now supports vector types.
Caught by ASAN. We were creating a reference to an object going out of scope. Incidentally, this feels like the sort of issue the lifetime checker will be great for detecting :)
…ut` (llvm#1156) Since an LLVM-specific data layout string is not appropriate in ClangIR, this PR replaces it with the existing MLIR DLTI equivalent and eliminates the redundancy. Although the constructor of `LowerModule` in the TargetLowering library requires an LLVM data layout string, it is currently unused. (I believe it would also not matter in the future.) Therefore, this PR has no functional change.
Based on https://mlir.llvm.org/docs/Tutorials/UnderstandingTheIRStructure/#traversing-the-def-use-chains, the users of an op are linked together in a doubly-linked list, so replacing a user while iterating the users causes a use-after-free. Store the users in a worklist and process them afterwards instead to avoid this.
We need to perform all erasures via the rewrite API instead of directly for the framework to work correctly. This was detected by a combination of `-DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON` [1] and ASAN. [1] https://mlir.llvm.org/getting_started/Debugging/#detecting-invalid-api-usage
…#1081) This now implements the same as [OG](https://github.com/llvm/clangir/blob/7619b20d7461b2d46c17a3154ec4b2f12ca35ea5/clang/lib/CodeGen/CGBuiltin.cpp#L7886): call the LLVM AArch64 intrinsic, which eventually becomes [an ARM64 instruction](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/ABS--Absolute-value--vector--?lang=en). However, there is clearly an alternative: extend CIR::AbsOp and CIR::FAbsOp to support vector types and only lower them at the LLVM lowering stage to either [LLVM::FAbsOP](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrfabs-llvmfabsop) or [LLVM::AbsOP](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrabs-llvmabsop), provided the LLVM dialect can do the right TargetLowering thing by eventually translating to the LLVM AArch64 intrinsic. The question is whether that is worth doing. Anyway, I'm putting up this diff for suggestions and ideas.
…f -fstrict-vtable-pointers (llvm#1138) Without the flag `-fstrict-vtable-pointers`, `__builtin_launder` is a noop. This PR implements that, and leaves the implementation for the `-fstrict-vtable-pointers` case to the future, when there is a need. This PR also adapts most of the test cases from the [OG test case](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/test/CodeGenCXX/builtin-launder.cpp#L1). I didn't use the test cases in the namespace [pessimizing_cases](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/test/CodeGenCXX/builtin-launder.cpp#L269), as they make no difference even when `-fstrict-vtable-pointers` is on.
This PR also changes the implementation of BI__builtin_neon_vshlq_v to use CIR ShiftOp.
Cleans up default linkage query implementations. Removes duplication from `extraClassDeclaration` that is now introduced through `CIRGlobalValueInterface`. This makes it more consistent with the `llvm::GlobalValue` methods.
CIR didn't work on structs with a destructor but without a constructor. Now that is fixed. Moreover, CUDA kernels must be emitted if they are referred to in the destructor of a non-device variable. That seems to work already, so I just unblocked the code path.
…e.cpp (llvm#1426) Makes it consistent with C++ conventions and [OG](https://github.com/advay168/clangir/blob/436c635af6c7ec3d184a1f7e92a624acdf856991/clang/lib/CodeGen/CodeGenModule.cpp#L108).
This PR makes CIR's AtomicFenceOp similar to MLIR LLVMIR's FenceOp. MLIR LLVMIR FenceOp: https://github.com/llvm/clangir/blob/0bedc285dc6fbd8486877887939c742c2ddaecfa/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td#L1925-L1947
Lower vcaged_f64
Currently, the following code snippet fails during CodeGen, using `clang++ tmp.cpp -fclangir -Xclang -emit-cir -S`:

```
#include <fstream>
void foo(const char *path) {
  std::ofstream fout1(path);
  fout1 << path;
  std::ofstream fout2(path);
  fout2 << path;
}
```

It fails with:

```
error: 'cir.yield' op expects parent op to be one of 'cir.if, cir.scope, cir.switch, cir.while, cir.for, cir.await, cir.ternary, cir.global, cir.do, cir.try, cir.array.ctor, cir.array.dtor, cir.call, cir.case'
```

The relevant part of the CIR dump before verification looks like:

```
"cir.br"()[^bb1] : () -> ()
^bb1:  // pred: ^bb0
  "cir.yield"() : () -> ()
  "cir.return"() : () -> ()
}) : () -> ()
```

Two things are wrong: the YieldOp has `cir.func` as a parent, and there is a `cir.return` too. These come right after the second destructor for `basic_ofstream`. This PR fixes this by checking whether there is already a terminator and removing it (if it exists) before adding an implicit return. I have also added a test that mimics the behavior of `std::basic_ofstream`.
Currently `__shared__` and `__constant__` variables are ignored by CodeGen. This patch fixes that. (It is also fixed in llvm#1436.) Device and constant variables should be marked as `externally_initialized`, as they might be initialized by the host rather than on the device. We can't identify which variables are device ones at the lowering stage, so this patch adds a new attribute for them in CodeGen. Similar to `__global__` functions, global variables on the device correspond to "shadow" variables on the host, and they must be registered with their counterparts. I added a `CUDAShadowNameAttr` in this patch for later use, but I didn't insert code to actually generate it.
The generation is quite complicated, so I plan to separate it into several parts. The registration function should look like:

```cpp
const char *__cuda_fatbin_str = /* Raw content of file in -fcuda-include-gpubinary */;

struct {
  int magicNum, version;
  void *binaryData, *unused;
} __cuda_fatbin_wrapper = {/*CUDA Magic Num*/, 1, __cuda_fatbin_str, nullptr};

void __cuda_module_ctor() {
  handle = __cudaRegisterFatBinary(&wrapper);
  __cuda_register_globals();
}
```

In this PR, we generate everything except the `__cuda_register_globals` function. OG doesn't give a name to `__cuda_fatbin_str`, which isn't allowed for cir::GlobalOp, so I invented a name for it. Other names are kept consistent with OG.
This PR adds the flag `-emit-mlir-llvm` to allow emitting MLIR in the LLVM dialect (cc @xlauko, who asked me to do this). I'm not sure the naming of the flag is the best; maybe someone has a better idea. Another solution would be to make the `-emit-mlir` flag take a value that specifies the target dialect (CIR / MLIR std dialects / LLVM dialect).
GCC, unlike Clang, issues a warning when one virtual function is overridden in a derived class but one or more other virtual functions with the same name and a different signature from a base class are not overridden. This leads to many warnings in the MLIR and ClangIR code when using the OpConversionPattern<>::matchAndRewrite() function in the ordinary way. The "hiding" behavior is what we want.
This resolves issue llvm#1442.
Lower vcagtd_f64
Lower neon vaddlvq_s32
Lower neon vaddlv_u32
This patch introduces support for pointer TBAA, which can be enabled using the `-fpointer-tbaa` flag. By default, this feature is now enabled. To ensure test compatibility, the tests (`tbaa-enum.cpp`, `tbaa-enum.c`, and `tbaa-struct.cpp`) have been updated to include the `-fno-pointer-tbaa` flag. Related Pull Requests of OG: - llvm/llvm-project#76612 - llvm/llvm-project#116991
This patch adds a new pass, cir-abi-lowering, to the CIR dialect. This pass runs before the CallConvLowering pass, and it expands all ABI-dependent types and operations inside a function into their ABI-independent equivalents according to the ABI specification. This patch also moves the lowering code for the following types and operations from the LLVM lowering conversion to the new pass:
- the pointer-to-data-member type `cir.data_member`;
- the pointer-to-member-function type `cir.method`;
- all operations working on operands of the above types.
The direct motivation for this new pass is the proper CallConvLowering of functions that take `!cir.method` arguments:

```
cir.func @test(%arg0: !cir.method) {
  // ...
}
```

Following the current lowering approach, which keeps the `!cir.method` type intact until LLVM lowering, we get:

```
define dso_local @test({ i64, i64 } %0) {
  ; ...
}
```

But we actually expect the following LLVM IR (note the differences in the function signatures):

```
define dso_local @test(i64 %0, i64 %1) {
  ; ...
}
```

To achieve this, I have 3 choices:

At the beginning I thought option 1 would be the easiest way, but as I dug through the rabbit hole I found some tricky stuff behind the scenes. The problem comes from the CodeGen of the function prologue and epilogue. In the prologue, each argument is assigned a stack slot and stored there. For an argument of type

The problem with option 3 is that the LoweringPrepare pass is not a conversion pass, which makes it really tricky to do type conversion in it. In my case I would have to convert every appearance of

Anyway, this PR is still very incomplete and under construction; I'd like to hear some early comments about it from the community.