[CIR][WIP] Add ABI lowering pass #1471


Draft · wants to merge 2,342 commits into main

Conversation

@Lancern (Member) commented Mar 12, 2025

This PR attempts to add a new pass, cir-abi-lowering, to the CIR dialect. The pass runs before the CallConvLowering pass and expands all ABI-dependent types and operations inside a function into their ABI-independent equivalents according to the ABI specification.

The patch also moves the lowering code for the following types and operations from the LLVM lowering conversion to the new pass:

  • The pointer-to-data-member type cir.data_member;
  • The pointer-to-member-function type cir.method;
  • All operations working on operands of the above types.
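For context, the Itanium C++ ABI prescribes concrete layouts for both member-pointer kinds, which is what an ABI lowering pass would expand them into. A minimal C++ sketch of those layouts (the struct names are illustrative, not actual CIR types):

```cpp
#include <cstddef>

// Pointer to data member: a byte offset into the object (ptrdiff_t);
// the Itanium ABI uses -1 as the null value.
struct DataMemberPtrRepr {
  std::ptrdiff_t offset;
};

// Pointer to member function: two pointer-sized words. `ptr` holds a
// function address or, for virtual functions, a vtable offset
// (discriminated by a low bit on most targets); `adj` is the
// adjustment to apply to the `this` pointer.
struct MethodPtrRepr {
  std::ptrdiff_t ptr;
  std::ptrdiff_t adj;
};
```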

bcardosolopes and others added 30 commits January 27, 2025 12:54
It was always the intention for `cir.cmp` operations to return a bool result. Due to missing constraints, a bug slipped into codegen that created `cir.cmp` operations whose result type matched the original AST expression type. In C, as opposed to C++, boolean expressions have type "int". This resulted in extra operations being codegened around boolean expressions and their usage.

This commit both enforces the bool result in the `cir.cmp` op definition and fixes the mentioned bug.
This is the first patch to support TBAA, following the discussion at
llvm#1076 (comment)

- add skeleton for CIRGen, utilizing `decorateOperationWithTBAA`
- add empty implementation in `CIRGenTBAA`
- introduce `CIR_TBAAAttr` with empty body
- attach `CIR_TBAAAttr` to `LoadOp` and `StoreOp`
- no handling of vtable pointer
- no LLVM lowering

The title describes the purpose of the PR. It adds initial support for
structures with padding to the call convention lowering for AArch64.

I have also added _initial support_ for the missing feature [FinishLayout](https://github.com/llvm/clangir/blob/5c5d58402bebdb1e851fb055f746662d4e7eb586/clang/lib/AST/RecordLayoutBuilder.cpp#L786) for records; the logic is taken from the original codegen.

Finally, I added a test for verification.
…m#1152)

The function `populateCIRToLLVMConversionPatterns` contains a spaghetti of LLVM dialect conversion patterns, which leads to merge conflicts very easily. Besides, a few patterns are even registered more than once, possibly due to careless resolution of merge conflicts.

This PR attempts to mitigate the problem. Pattern names are now sorted in alphabetical order, and each source line lists exactly one pattern name, reducing potential merge conflicts.
…ltinExpr (llvm#1133)

This PR is an NFC: we just mark every builtin ID of Neon SISD as NYI. We will implement them in subsequent PRs.
…CFG (llvm#1147)

This PR implements an NYI in CIRScopeOpFlattening. It seems to me the best way is to forward the results of ScopeOp as block arguments of the last block split off from the `cir.scope` block, as sketched below.
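A rough sketch of that shape in pattern code (hypothetical; assumes `scopeOp` is the `mlir::Operation *` for the `cir.scope` op and `rewriter` is the pattern rewriter):

```cpp
// Split the parent block right at the scope op; the continuation block
// receives one argument per scope result, so every cir.yield can be
// rewritten into a branch that carries its operands there.
mlir::Block *parentBlock = scopeOp->getBlock();
mlir::Block *continueBlock =
    rewriter.splitBlock(parentBlock, mlir::Block::iterator(scopeOp));
for (mlir::Type resultType : scopeOp->getResultTypes())
  continueBlock->addArgument(resultType, scopeOp->getLoc());
// ... inline the scope's region between the two blocks, rewrite each
// cir.yield into a branch to continueBlock, then:
rewriter.replaceOp(scopeOp, continueBlock->getArguments());
```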
…der (llvm#1157)

With [llvm-project#116090](llvm/llvm-project#116090)
merged, we can get rid of `#include "../../../../Basic/Targets.h"` now.
This PR moves the lowering code for data member pointers from the
conversion patterns to the implementation of CXXABI because this part
should be ABI-specific.
…ant arrays or pointers (llvm#1136)

This PR adds support for function arguments with structs that contain
constant arrays or pointers for AArch64.

For example, 
```
typedef struct {
  int a[42];
} CAT;

void pass_cat(CAT a) {}
```

As usual, the main ideas are taken from the original [CodeGen](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/lib/AST/ASTContext.cpp#L1823), and I have added a couple of tests.
…ge (64, 128) (llvm#1141)

This PR adds support for the lowering of AArch64 calls with structs whose size is greater than 64 bits and less than 128 bits.

The idea is from the original [CodeGen](https://github.com/llvm/clangir/blob/da601b374deea6665f710f7e432dfa82f457059e/clang/lib/CodeGen/CGCall.cpp#L1329), where we perform a coercion through memory for these types of calls; a concrete example is sketched below.

I have added a test for this.
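As a concrete illustration (my example, not taken from the patch), a 96-bit struct falls in this range:

```cpp
// 12 bytes = 96 bits: greater than 64, less than 128.
typedef struct {
  int a[3];
} S;

// Small enough for the AArch64 PCS to pass in registers, but not an
// exact multiple of 64 bits, so CodeGen spills it to a stack temporary
// and reloads it as a pair of 64-bit chunks (the "coercion through
// memory") before the call.
void pass_s(S s) {}
```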
Recommit llvm#1101

I am not sure what happened, but that merged PR doesn't show up in the git log. Maybe the stacked PR didn't land successfully? In any case, we need to land it again.

Following are the original commit messages:

---

This is the follow-up to llvm#1100.

After llvm#1100, when we want to use LongDouble for VAArg, we will be in trouble due to details of the X86_64 ABI, and this patch tries to address that.

The practical impact of the patch: after it lands, together with llvm#1088 and a small follow-up fix, we can build and run all of the C benchmarks in SPEC CPU 2017. I think that is a milestone.
…vm#1151)

They are rounding shifts of vectors, and the shift amount comes from the least significant byte of the corresponding element of the second input vector. Thus, they are implemented in [their own ASM](https://godbolt.org/z/v65sbeKaW). This makes them unsuitable for lowering to CIR ShiftOp, even though it supports vector types now.
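For reference, one intrinsic of this family as a plain usage example (not code from this PR):

```cpp
#include <arm_neon.h>

// vrshlq_s32 is a rounding shift: each lane of `v` is shifted by the
// signed amount in the least significant byte of the corresponding
// lane of `amt`; negative amounts give a rounding right shift.
int32x4_t rounding_shift(int32x4_t v, int32x4_t amt) {
  return vrshlq_s32(v, amt);
}
```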
Caught by ASAN. We were creating a reference to an object going out of
scope. Incidentally, this feels like the sort of issue the lifetime
checker will be great for detecting :)
…ut` (llvm#1156)

Since an LLVM-specific data layout string is not appropriate in ClangIR, this PR replaces it with the existing MLIR DLTI equivalent and eliminates the redundancy.

Although the constructor of `LowerModule` in the TargetLowering library requires an LLVM data layout string, it is not currently used. (I believe it also won't matter in the future.) Therefore, this PR has no functional change.
Based on https://mlir.llvm.org/docs/Tutorials/UnderstandingTheIRStructure/#traversing-the-def-use-chains, the users of an op are linked together in a doubly-linked list, so replacing a user while iterating over the users causes a use-after-free. Store the users in a worklist and process them afterwards instead to avoid this.
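A minimal sketch of the worklist fix (hypothetical helper; assumes the users of an `mlir::Value` are being rewritten):

```cpp
#include "mlir/IR/Operation.h"
#include "llvm/ADT/SmallVector.h"

// Snapshot the users before mutating: the use-list is doubly linked,
// so replacing a user while iterating it is a use-after-free.
void rewriteUsers(mlir::Value value) {
  llvm::SmallVector<mlir::Operation *> worklist(value.getUsers().begin(),
                                                value.getUsers().end());
  for (mlir::Operation *user : worklist) {
    // ... rewrite `user`; the snapshot is unaffected by the mutation ...
  }
}
```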
We need to perform all erasures via the rewrite API instead of directly
for the framework to work correctly. This was detected by a combination
of `-DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON` [1] and ASAN.

[1]
https://mlir.llvm.org/getting_started/Debugging/#detecting-invalid-api-usage
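Schematically, the rule inside a pattern looks like this (`SomeOp` stands in for the actual op):

```cpp
mlir::LogicalResult
matchAndRewrite(SomeOp op, mlir::PatternRewriter &rewriter) const override {
  // Erase through the rewriter so the driver can track the change;
  // op->erase() would bypass the framework, which is exactly what the
  // expensive API checks and ASAN caught here.
  rewriter.eraseOp(op);
  return mlir::success();
}
```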
…#1081)

Now implement the same as [OG](https://github.com/llvm/clangir/blob/7619b20d7461b2d46c17a3154ec4b2f12ca35ea5/clang/lib/CodeGen/CGBuiltin.cpp#L7886), which is to call the LLVM AArch64 intrinsic that would eventually become [an ARM64 instruction](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/ABS--Absolute-value--vector--?lang=en). However, there is clearly an alternative: extend CIR::AbsOp and CIR::FAbsOp to support vector types and only lower them at the LLVM lowering stage to either [LLVM::FAbsOP](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrfabs-llvmfabsop) or [LLVM::AbsOP](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrabs-llvmabsop), provided the LLVM dialect can do the right TargetLowering thing by eventually translating to the LLVM AArch64 intrinsic.

The question is whether it is worth doing.

Anyway, putting up this diff for suggestions and ideas.
…f -fstrict-vtable-pointers (llvm#1138)

Without the flag `-fstrict-vtable-pointers`, `__builtin_launder` is a noop. This PR implements that, and leaves the implementation for the `-fstrict-vtable-pointers` case to the future, when there is a need.

This PR also adapts most of the test cases from the [OG test case](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/test/CodeGenCXX/builtin-launder.cpp#L1). I didn't use the test cases in the namespace [pessimizing_cases](https://github.com/llvm/clangir/blob/3aed38cf52e72cb51a907fad9dd53802f6505b81/clang/test/CodeGenCXX/builtin-launder.cpp#L269), as they show no difference even when `-fstrict-vtable-pointers` is on.
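For reference, the no-op case handled here looks like this (example mine, not from the adapted tests):

```cpp
struct S {
  int x;
};

// Without -fstrict-vtable-pointers there is nothing to launder, so
// CIRGen can forward the pointer operand unchanged.
S *forward(S *p) {
  return __builtin_launder(p);
}
```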
…1165)

If a record type contains an array of non-record types, we can generate
a copy for it inside the copy constructor, as CodeGen does. CodeGen does
so for arrays of record types where applicable as well, but we'll want
to represent the construction of those explicitly, as outlined in
llvm#1055.
… semantics of C++ initializer list (llvm#1121)

I haven't finished all of the work on `cir.initlist`, but I want to get some feedback on the CIR design to make sure I am on the right track.

Fixed: llvm#777

This PR also changed the implementation of BI__builtin_neon_vshlq_v to use CIR ShiftOp.
xlauko and others added 27 commits February 28, 2025 17:04
Cleans up the default linkage query implementations. Removes duplication from `extraClassDeclaration` that is now provided through `CIRGlobalValueInterface`. This makes it more consistent with the `llvm::GlobalValue` methods.
… functions (llvm#1424)

This PR adds CIRGen and LLVM lowering support for base-to-derived and
derived-to-base cast operations on pointers to member functions.

This PR includes a new operation `cir.update_member` to help the LLVM lowering of such cast operations.

Resolves llvm#973.
CIR didn't work on structs with a destructor but without a constructor. Now that is fixed.

Moreover, CUDA kernels must be emitted if they are referred to in the destructor of a non-device variable. That seems to be working already, so I just unblocked the code path.
Currently, the following code snippet fails during CodeGen, using
`clang++ tmp.cpp -fclangir -Xclang -emit-cir -S`:
```
#include <fstream>

void foo(const char *path) {
  std::ofstream fout1(path);
  fout1 << path;
  std::ofstream fout2(path);
  fout2 << path;
}
```
It fails with: 
```
error: 'cir.yield' op expects parent op to be one of 'cir.if, cir.scope, cir.switch, cir.while, cir.for, cir.await, cir.ternary, cir.global, cir.do, cir.try, cir.array.ctor, cir.array.dtor, cir.call, cir.case'
```
The relevant part of the CIR dump before verification looks like: 
```
    "cir.br"()[^bb1] : () -> ()
  ^bb1:  // pred: ^bb0
    "cir.yield"() : () -> ()
    "cir.return"() : () -> ()
  }) : () -> ()
```
Two things are wrong: the YieldOp has `cir.func` as a parent, and there is a `cir.return` too. These come right after the second destructor call for `basic_ofstream`.

This PR fixes this by checking whether there is already a terminator and removing it (if it exists) before adding the implicit return, as sketched below. I have also added a test that mimics the behavior of `std::basic_ofstream`.
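A sketch of that check in CIRGen-style code (hypothetical names; assumes `fn` is the function being finalized and `builder`/`loc` are in scope):

```cpp
// Drop an existing terminator (the stray cir.yield above) before
// emitting the implicit return at the end of the function.
mlir::Block &lastBlock = fn.getBody().back();
if (!lastBlock.empty() &&
    lastBlock.back().hasTrait<mlir::OpTrait::IsTerminator>())
  lastBlock.back().erase();
builder.setInsertionPointToEnd(&lastBlock);
builder.create<cir::ReturnOp>(loc, mlir::ValueRange{});
```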
Currently `__shared__` and `__constant__` variables are ignored by
CodeGen. This patch fixes this.
(It is also fixed in llvm#1436 .)

Device and constant variables should be marked as `externally_initialized`, as they might be initialized by the host rather than on the device. We can't identify which variables are device ones at the lowering stage, so this patch adds a new attribute for them in CodeGen.

Similar to `__global__` functions, global variables on the device correspond to "shadow" variables on the host, and they must be registered with their counterparts. I added a `CUDAShadowNameAttr` in this patch for later use, but I didn't insert code to actually generate it. The generation is quite complicated, so I plan to split it into several parts.

The registration function should look like:
```cpp
const char *__cuda_fatbin_str = /* Raw content of file in -fcuda-include-gpubinary */;
struct {
  int magicNum, version;
  void *binaryData, *unused;
} __cuda_fatbin_wrapper = { /*CUDA Magic Num*/, 1, (void *)__cuda_fatbin_str, nullptr };

void __cuda_module_ctor() {
  void **handle = __cudaRegisterFatBinary(&__cuda_fatbin_wrapper);
  __cuda_register_globals();
}
```
In this PR, we generate everything except the `__cuda_register_globals`
function.

OG doesn't give a name to `__cuda_fatbin_str`, which isn't allowed for
cir::GlobalOp, so I invented a name for it. Other names are kept
consistent with OG.
This PR adds the flag `-emit-mlir-llvm` to allow emitting MLIR in the LLVM dialect (cc @xlauko, who asked me to do this). I'm not sure the flag name is the best choice; maybe someone will have a better idea. Another solution would be to make the `-emit-mlir` flag take a value that specifies the target dialect (CIR / MLIR std dialects / LLVM dialect).
GCC, unlike Clang, issues a warning when one virtual function is overridden in a derived class but one or more other virtual functions with the same name and a different signature from a base class are not overridden. This leads to many warnings in the MLIR and ClangIR code when using the OpConversionPattern<>::matchAndRewrite() function in the ordinary way. The "hiding" behavior is what we want.
This patch introduces support for pointer TBAA, controlled by the `-fpointer-tbaa` flag. The feature is enabled by default.

To ensure test compatibility, the tests (`tbaa-enum.cpp`, `tbaa-enum.c`,
and `tbaa-struct.cpp`) have been updated to include the
`-fno-pointer-tbaa` flag.

Related Pull Requests of OG:
- llvm/llvm-project#76612
- llvm/llvm-project#116991
This patch adds a new pass cir-abi-lowering to the CIR dialect. This pass runs before the CallConvLowering pass, and it expands all ABI-dependent types and operations inside a function into their ABI-independent equivalents according to the ABI specification.

This patch also moves the lowering code for the following types and operations from the LLVM lowering conversion to the new pass:
  - The pointer-to-data-member type `cir.data_member`;
  - The pointer-to-member-function type `cir.method`;
  - All operations working on operands of the above types.
@Lancern (Member, Author) commented Mar 12, 2025

The direct motivation for this new pass is the proper CallConvLowering of the !cir.method type. Currently, this type is lowered during LLVM lowering, which is too late to achieve a proper lowering. Consider the following CIR code:

cir.func @test(%arg0: !cir.method) {
  // ...
}
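In C++ source terms, this corresponds to a function taking a pointer-to-member-function parameter, roughly:

```cpp
struct S {};

// %arg0 : !cir.method above models a parameter like this:
void test(void (S::*m)()) {
  // ...
}
```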

Following the current lowering approach, which keeps !cir.method until we arrive at LLVM lowering, we would eventually get the following LLVM IR:

define dso_local void @test({ i64, i64 } %0) {
  ; ...
}

But we actually expect the following LLVM IR (note the difference in the function signatures):

define dso_local void @test(i64 %0, i64 %1) {
  ; ...
}

To achieve this, I have 3 choices:

  1. Teach the CallConvLowering pass about the !cir.method type.
  2. Move the lowering of !cir.method to the LoweringPrepare pass, which runs before CallConvLowering.
  3. Add a new pass before CallConvLowering that lowers !cir.method to !cir.struct.

At the beginning I thought option 1 would be the easiest way, but as I dug down the rabbit hole I found some tricky stuff behind the scenes. The problem comes from the CodeGen of the function prologue and epilogue. In the prologue, each argument is assigned a stack slot and stored there. An argument of type !cir.method, after CallConvLowering, expands into two arguments of type !s64i. Thus in the function prologue I would have to come up with a way to store two !s64i values into the stack slot allocated for a !cir.method value, which is tricky. Similar problems exist in the epilogue.

The problem with option 2 is that the LoweringPrepare pass is not a conversion pass, which makes it really tricky to do type conversion there. In my case I have to convert every appearance of !cir.method to !cir.struct, and that kind of job is better suited to a conversion pass.

Anyway, this PR is still very incomplete and under construction; I'd like to hear some early comments about it from the community.
