Skip to content

[Proposal] Merge mlir::cir namespace into cir #1025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
smeenai opened this issue Oct 30, 2024 · 7 comments
Closed

[Proposal] Merge mlir::cir namespace into cir #1025

smeenai opened this issue Oct 30, 2024 · 7 comments

Comments

@smeenai
Copy link
Collaborator

smeenai commented Oct 30, 2024

We currently define the CIR dialect under the mlir::cir namespace, which is a historical leftover from when CIR lived under MLIR instead of Clang (at the very start of the project). We also have a separate cir namespace to hold e.g. our lowering patterns. This is unfortunate, because in many source files you have

using namespace cir;
using namespace mlir;
using namespace mlir::cir;

And then cir:: becomes ambiguous between ::cir and mlir::cir, leading to various errors. You can leave off the using namespace, but typing out mlir::cir::FooOp everywhere is annoying and adds a bunch of noise IMO.

I'm proposing that we merge mlir::cir into cir instead. This matches Flang, which uses the fir namespace for its IR (and the hlfir namespace for its higher-level IR). It's also an easy change to perform mechanically, and cir:: is a small enough prefix that I'm happy to go either way on the using namespace afterwards. (Flang does a mix, but it seems to prefer spelling out fir:: explicitly.)

A small complication is that some names are common between the cir and mlir::cir namespaces, namely ABIInfo, RequiredArgs, and ReturnValueSlot. This is kinda surprising to me, and it seems like a good opportunity to disambiguate them instead of anything that should block the namespace merge.

As a follow-up, these symbols that live in cir have their corresponding symbols live under clang::CodeGen: https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0. We should also probably move them to a corresponding namespace (e.g. clang::CIRGen) to match the original CodeGen better. That'll be less mechanical though, so I want it to be a separate step from the mlir::cir merging.

Some references, based on a rudimentary analysis of the demangled symbols from a CIR-enabled clang and flang-new:

CCing some people who might have opinions on this: @bcardosolopes, @lanza, @dkolsen-pgi, @keryell, @Lancern

@bcardosolopes
Copy link
Member

Thanks for putting this together @smeenai, this is a real annoying problem we need to solve.

I'm fine with ::mlir::cir:: becoming ::cir, but I'm confused on how you plan to later tackle to move things from the current ::cir to a said future clang::cirgen if you are merging all of it in ::cir first. Not clear how you'd go about cherry-picking the CIRGen pieces.

I personally thing that the current ::cir is the worse problem here, existing ::mlir::cir doesn't really bother me. So even if they are tackled in different PRs, it might be useful to have a full plan already.

@smeenai
Copy link
Collaborator Author

smeenai commented Oct 30, 2024

@bcardosolopes, by my rough analysis, ::cir has 63 symbols that should move into a future ::clang::CIRGen: https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0. It has 158 symbols in total (https://gist.github.com/smeenai/e6751324254716b673de84cf2533038f), and the remaining ones seem appropriate for the namespace, e.g. the lowering classes. If you think we should split some of those into other namespaces too, I'm happy to do that. All of that splitting will need to be done by hand though (at least partially), so I don't think it makes a difference if we merge ::mlir::cir into ::cir before or after that, since we'd have to find all those symbols and manually move them into the right symbol in either case. Let me know if I misunderstood what you meant though.

One problem though is that we have three symbols duplicated between ::cir and ::mir:cir:

The cir ones are all in clang/lib/CIR/CodeGen, and the mlir::cir ones are all in clang/lib/CIR/Dialect/Transforms/TargetLowering. I assume the duplication is because of build dependency graph issues, e.g. maybe we didn't want TargetLowering to depend on clangCIR? There might also be some differences in how the classes operate, I haven't looked too closely. My initial plan was to just prefix the TargetLowering class names with TargetLowering to disambiguate, unless you feel strongly that we should deduplicate first.

@bcardosolopes
Copy link
Member

bcardosolopes commented Oct 30, 2024

If you think we should split some of those into other namespaces too

I'm good with only two, my suggestion is to keep all lowercase

Let me know if I misunderstood what you meant though.

I'm just worried that things that were never in mlir::cir may now land/stay in the new ::cir, because you are merging all over there before splitting. But if you are sure this won't happen when doing the manual work, fine by me.

... One problem though is that we have three symbols duplicated

filed another issue to cover that

@smeenai
Copy link
Collaborator Author

smeenai commented Oct 30, 2024

If you think we should split some of those into other namespaces too

I'm good with only two, my suggestion is to keep all lowercase

As in clang::cirgen instead of clang::CIRGen? Works for me, I'd only suggested the second one to try to match the existing clang::CodeGen namespace.

Let me know if I misunderstood what you meant though.

I'm just worried that things that were never in mlir::cir may now land/stay in the new ::cir, because you are merging all over there before splitting. But if you are sure this won't happen when doing the manual work, fine by me.

I don't think anything that's not in mlir::cir should end up in ::cir, unless I mess up the change :D The only thing I'm planning to do is take all existing things inside mlir::cir and put then into ::cir (and remove mlir::cir entirely); nothing else should end up in ::cir as a result.

Edit: I think I understand your concern now; it's about future additions to CIRGen ending up in the wrong namespace, not about the current state of the world. That's valid, and I hadn't considered it. I'll see how much work it is to fix the ::cir namespace first then.

@keryell
Copy link
Collaborator

keryell commented Oct 31, 2024

Thanks for this great analysis!
Having everything in a single cir:: namespace will avoid the blast effect we have now when using a

#include "clang/CIR/MissingFeatures.h"

I have noticed that we would still have some cir::CIR redundancy in some naming, like cir::CIRGenVTables which almost not present if fir:: and not at all in clang::.
If we remove it with cir::GenVTables it saves 3 letters and it is even more compelling to drop the using namespace cir. 😄
I find using namespace a cognitive burden preventing inclusion of new people or people context-switching between different projects.

@bcardosolopes
Copy link
Member

As in clang::cirgen instead of clang::CIRGen? Works for me, I'd only suggested the second one to try to match the existing clang::CodeGen namespace.

I kinda feel like in the end this is potential bikeshed for upstreaming, so probably my opinion doesn't matter as much hehe!

@smeenai
Copy link
Collaborator Author

smeenai commented Nov 4, 2024

Thanks for this great analysis! Having everything in a single cir:: namespace will avoid the blast effect we have now when using a

#include "clang/CIR/MissingFeatures.h"

I have noticed that we would still have some cir::CIR redundancy in some naming, like cir::CIRGenVTables which almost not present if fir:: and not at all in clang::. If we remove it with cir::GenVTables it saves 3 letters and it is even more compelling to drop the using namespace cir. 😄 I find using namespace a cognitive burden preventing inclusion of new people or people context-switching between different projects.

I agree with you about the redundancy and removing all temptation for using namespace :) There's also the standard tension of how much we want to match original CodeGen though, e.g. it has clang::CodeGen::CodeGenVTables, which the CIRGen prefix (and potential future clang::CIRGen sub-namespace) are trying to match.

smeenai added a commit to llvm/llvm-project that referenced this issue Nov 7, 2024
llvm/clangir#1025 discusses the motivation.
The mechanical parts of this change were done via:
    find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 | xargs -0 perl -pi -e 's/mlir::cir/cir/g'
    find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -pi -e 's/::cir/cir/g'

There were some manual fixups and a clang-format run afterwards.

ghstack-source-id: fe96b10e812e6316d15e2a7864131f1e7ac60c1c
ghstack-comment-id: 2463364322
Pull Request resolved: #115386
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this issue Nov 8, 2024
llvm/clangir#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.

ghstack-source-id: 62666dcd61983168bde3d3c5bcc0bf82a0ec1f64
ghstack-comment-id: 2463364244
Pull Request resolved: llvm/llvm-project#115385
smeenai added a commit to llvm/llvm-project that referenced this issue Nov 8, 2024
llvm/clangir#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.
smeenai added a commit to llvm/llvm-project that referenced this issue Nov 8, 2024
llvm/clangir#1025 discusses the motivation.
The mechanical parts of this change were done via:

find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 |
xargs -0 perl -pi -e 's/mlir::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl
-pi -e 's/::cir/cir/g'

There were some manual fixups and a clang-format run afterwards.
smeenai added a commit that referenced this issue Nov 8, 2024
#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.

There's some symbols that live under `clang/include/clang/CIR` whose
equivalents live in `clang/lib/CodeGen` and are in the `clang::CodeGen`
namespace. We have these symbols in a common location since they're also
used by lowering, so I've also left them in the `cir` namespace. Those
symbols are:
- AArch64ABIKind
- ABIArgInfo
- FnInfoOpts
- TypeEvaluationKind
- X86AVXABILevel

This is a pretty large PR out of necessity. To make it slightly more
reviewable, I've split it out into three commits (which will be squashed
together when the PR lands):
- The first commit manually switches places to the `clang::CIRGen`
  namespace. This has to be manual because we only want to move things
  selectively.
- The second commit adjusts namespace prefixes to make builds work. I
  ran https://gist.github.com/smeenai/f4dd441fb61c53e835c4e6057f8c322f
  to make this change. The script is idempotent, and I added
  substitutions one at a time and reviewed each one afterwards (using
  `git diff --color-words=.`) to ensure only intended changes were being
  made.
- The third commit runs `git clang-format`.

Because I went one-by-one with all my substitutions and checked each one
afterwards, I'm pretty confident in the validity of all the changes
(despite the size of the PR).
smeenai added a commit that referenced this issue Nov 8, 2024
#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace.

This is a large PR, and I've split it out into four commits (that'll be
squashed when landing). The first commit changes `mlir::cir` to `cir`
everywhere. This was originally done mechanically with:

```
find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 | xargs -0 perl -pi -e 's/mlir::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -pi -e 's/::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's/namespace mlir \{\nnamespace cir \{/namespace cir {/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's!\} // namespace cir\n\} // namespace mlir!} // namespace cir!g'
```

It then required some manual fixups to address edge cases.

Code that lived under `mlir::cir` could refer to the `mlir` namespace
without qualification, but after the namespace change, we need to
explicitly qualify all our usages. This is done in the second commit via
https://gist.github.com/smeenai/996200fd45ad123bbf22b412d59479b6, which
is an idempotent script to add all qualifications. I added cases to the
script one at a time and reviewed each change afterwards to ensure we
were only making the intended modifications, so I feel pretty confident
in the end result. I also removed `using namespace llvm` from some
headers to avoid conflicts, which in turn required adding some `llvm::`
qualifiers as well.

The third commit fixes a test, since an error message now contains the
mlir namespace. Similar tests in flang also have the namespace in their
error messages, so this is an expected change.

The fourth change runs `git clang-format`. Unfortunately, that doesn't
work for TableGen files, so we'll have a few instances of undesirable
formatting left there. I'll look into fixing that as a follow-up.

I validated the end result by examining the symbols in the built Clang
binary. There's nothing in the `mlir::cir` namespace anymore.
https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0 had the
symbols which lived in `cir` and should have moved to `clang::CIRGen`,
and I validated that all the symbols were moved, with the exceptions
noted in #1082 and the duplicated
symbols noted in #1025.
@smeenai smeenai closed this as completed Nov 14, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024
llvm/clangir#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024
llvm/clangir#1025 discusses the motivation.
The mechanical parts of this change were done via:

find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 |
xargs -0 perl -pi -e 's/mlir::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl
-pi -e 's/::cir/cir/g'

There were some manual fixups and a clang-format run afterwards.
lanza pushed a commit that referenced this issue Mar 18, 2025
#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.

There's some symbols that live under `clang/include/clang/CIR` whose
equivalents live in `clang/lib/CodeGen` and are in the `clang::CodeGen`
namespace. We have these symbols in a common location since they're also
used by lowering, so I've also left them in the `cir` namespace. Those
symbols are:
- AArch64ABIKind
- ABIArgInfo
- FnInfoOpts
- TypeEvaluationKind
- X86AVXABILevel

This is a pretty large PR out of necessity. To make it slightly more
reviewable, I've split it out into three commits (which will be squashed
together when the PR lands):
- The first commit manually switches places to the `clang::CIRGen`
  namespace. This has to be manual because we only want to move things
  selectively.
- The second commit adjusts namespace prefixes to make builds work. I
  ran https://gist.github.com/smeenai/f4dd441fb61c53e835c4e6057f8c322f
  to make this change. The script is idempotent, and I added
  substitutions one at a time and reviewed each one afterwards (using
  `git diff --color-words=.`) to ensure only intended changes were being
  made.
- The third commit runs `git clang-format`.

Because I went one-by-one with all my substitutions and checked each one
afterwards, I'm pretty confident in the validity of all the changes
(despite the size of the PR).
lanza pushed a commit that referenced this issue Mar 18, 2025
#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace.

This is a large PR, and I've split it out into four commits (that'll be
squashed when landing). The first commit changes `mlir::cir` to `cir`
everywhere. This was originally done mechanically with:

```
find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 | xargs -0 perl -pi -e 's/mlir::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -pi -e 's/::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's/namespace mlir \{\nnamespace cir \{/namespace cir {/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's!\} // namespace cir\n\} // namespace mlir!} // namespace cir!g'
```

It then required some manual fixups to address edge cases.

Code that lived under `mlir::cir` could refer to the `mlir` namespace
without qualification, but after the namespace change, we need to
explicitly qualify all our usages. This is done in the second commit via
https://gist.github.com/smeenai/996200fd45ad123bbf22b412d59479b6, which
is an idempotent script to add all qualifications. I added cases to the
script one at a time and reviewed each change afterwards to ensure we
were only making the intended modifications, so I feel pretty confident
in the end result. I also removed `using namespace llvm` from some
headers to avoid conflicts, which in turn required adding some `llvm::`
qualifiers as well.

The third commit fixes a test, since an error message now contains the
mlir namespace. Similar tests in flang also have the namespace in their
error messages, so this is an expected change.

The fourth change runs `git clang-format`. Unfortunately, that doesn't
work for TableGen files, so we'll have a few instances of undesirable
formatting left there. I'll look into fixing that as a follow-up.

I validated the end result by examining the symbols in the built Clang
binary. There's nothing in the `mlir::cir` namespace anymore.
https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0 had the
symbols which lived in `cir` and should have moved to `clang::CIRGen`,
and I validated that all the symbols were moved, with the exceptions
noted in #1082 and the duplicated
symbols noted in #1025.
xlauko pushed a commit to trailofbits/instafix-llvm that referenced this issue Mar 28, 2025
llvm/clangir#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid
overloading the `cir` namespace too much afterwards, move all symbols
whose equivalents live inside the `clang::CodeGen` namespace to a new
`clang::CIRGen` namespace, so that we match the original CodeGen's
structure more closely.

There's some symbols that live under `clang/include/clang/CIR` whose
equivalents live in `clang/lib/CodeGen` and are in the `clang::CodeGen`
namespace. We have these symbols in a common location since they're also
used by lowering, so I've also left them in the `cir` namespace. Those
symbols are:
- AArch64ABIKind
- ABIArgInfo
- FnInfoOpts
- TypeEvaluationKind
- X86AVXABILevel

This is a pretty large PR out of necessity. To make it slightly more
reviewable, I've split it out into three commits (which will be squashed
together when the PR lands):
- The first commit manually switches places to the `clang::CIRGen`
  namespace. This has to be manual because we only want to move things
  selectively.
- The second commit adjusts namespace prefixes to make builds work. I
  ran https://gist.github.com/smeenai/f4dd441fb61c53e835c4e6057f8c322f
  to make this change. The script is idempotent, and I added
  substitutions one at a time and reviewed each one afterwards (using
  `git diff --color-words=.`) to ensure only intended changes were being
  made.
- The third commit runs `git clang-format`.

Because I went one-by-one with all my substitutions and checked each one
afterwards, I'm pretty confident in the validity of all the changes
(despite the size of the PR).
xlauko pushed a commit to trailofbits/instafix-llvm that referenced this issue Mar 28, 2025
llvm/clangir#1025 explains why we want to move
the CIR dialect from the `mlir::cir` to the `cir` namespace.

This is a large PR, and I've split it out into four commits (that'll be
squashed when landing). The first commit changes `mlir::cir` to `cir`
everywhere. This was originally done mechanically with:

```
find clang \( -name '*.h' -o -name '*.cpp' -o -name '*.td' \) -print0 | xargs -0 perl -pi -e 's/mlir::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -pi -e 's/::cir/cir/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's/namespace mlir \{\nnamespace cir \{/namespace cir {/g'
find clang \( -name '*.h' -o -name '*.cpp' \) -print0 | xargs -0 perl -0777 -pi -e 's!\} // namespace cir\n\} // namespace mlir!} // namespace cir!g'
```

It then required some manual fixups to address edge cases.

Code that lived under `mlir::cir` could refer to the `mlir` namespace
without qualification, but after the namespace change, we need to
explicitly qualify all our usages. This is done in the second commit via
https://gist.github.com/smeenai/996200fd45ad123bbf22b412d59479b6, which
is an idempotent script to add all qualifications. I added cases to the
script one at a time and reviewed each change afterwards to ensure we
were only making the intended modifications, so I feel pretty confident
in the end result. I also removed `using namespace llvm` from some
headers to avoid conflicts, which in turn required adding some `llvm::`
qualifiers as well.

The third commit fixes a test, since an error message now contains the
mlir namespace. Similar tests in flang also have the namespace in their
error messages, so this is an expected change.

The fourth change runs `git clang-format`. Unfortunately, that doesn't
work for TableGen files, so we'll have a few instances of undesirable
formatting left there. I'll look into fixing that as a follow-up.

I validated the end result by examining the symbols in the built Clang
binary. There's nothing in the `mlir::cir` namespace anymore.
https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0 had the
symbols which lived in `cir` and should have moved to `clang::CIRGen`,
and I validated that all the symbols were moved, with the exceptions
noted in llvm/clangir#1082 and the duplicated
symbols noted in llvm/clangir#1025.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants