Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add clean up of applied CRDs after tests #2685

Merged
merged 1 commit into from
Feb 14, 2025

Conversation

xstefank
Copy link
Collaborator

Before:

$ mvn clean install -Pintegration-tests
...

$ kubectl get crds | wc -l
83

With this PR:

$ mvn clean install -Pintegration-tests
...

$ kubectl get crds | wc -l
No resources found
0

@csviri
Copy link
Collaborator

csviri commented Feb 13, 2025

@xstefank one of the E2E tests is failing (not flaky) could you pls take a look on that?

@xstefank xstefank force-pushed the delete-test-crds branch 2 times, most recently from e84c595 to f9c1e9b Compare February 13, 2025 16:08
try {
LOGGER.debug("Deleting CRD: {}", appliedCRD.crdString);
final var crd = client.load(new ByteArrayInputStream(appliedCRD.crdString.getBytes()));
crd.withTimeoutInMillis(50000000).delete();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a very long timeout… Do we need that much? Also maybe make it a constant, using underscores for readability?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG, thanks, forgot this from testing the issue we discussed.

Copy link
Collaborator

@csviri csviri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if this should be an opt-in feature, but no strong opinion.

@xstefank
Copy link
Collaborator Author

@csviri I also thought about it, so I can make that as a follow-up PR.

@metacosm
Copy link
Collaborator

metacosm commented Feb 14, 2025

Personally, if anything, I think it should be opt out. Tests should leave as little behind them as possible so this should be activated by default. I'm not even sure when you wouldn't want the CRDs removed after the tests…

@xstefank xstefank merged commit 1e3ac83 into operator-framework:main Feb 14, 2025
57 checks passed
@xstefank xstefank deleted the delete-test-crds branch February 14, 2025 14:00
@csviri
Copy link
Collaborator

csviri commented Feb 14, 2025

I'm not even sure when you wouldn't want the CRDs removed after the tests…

These tests are running on ephemeral environment so at the end does not matter that much.

On the other hand adding and removing CRDs in practice does not happen that dynamically actually. Usually when cluster is pinned up CRDs an controllers are added, and rarely removed at production. Maybe when an some alternative found for some controller. But think about operators as capabilities of a platform, you might deprecate a capability but especially on large platforms (with lots of teams) almost never removed.

So yeah, there are two ways to think about this, as a java developer or as a platform engineer :)

@metacosm
Copy link
Collaborator

I'm not even sure when you wouldn't want the CRDs removed after the tests…

These tests are running on ephemeral environment so at the end does not matter that much.

Says who? People might run these tests locally. I know I do when I see them failing in CI and having to clean up after them is tiring :)

On the other hand adding and removing CRDs in practice does not happen that dynamically actually. Usually when cluster is pinned up CRDs an controllers are added, and rarely removed at production. Maybe when an some alternative found for some controller. But think about operators as capabilities of a platform, you might deprecate a capability but especially on large platforms (with lots of teams) almost never removed.

So yeah, there are two ways to think about this, as a java developer or as a platform engineer :)

We're only talking about the test support here, though. So it's not a matter of how CRDs should be handled in production. That's a completely different story. Here, we simply want to not leave a mess after the tests are run, if they run on a local developer cluster, and not simply in CI.

@csviri
Copy link
Collaborator

csviri commented Feb 14, 2025

Says who? People might run these tests locally. I know I do when I see them failing in CI and having to clean up after them is tiring :)

Not sure if following, are you ever re-running on the same kind / minikube instance when tests failed? I mean when running the whole test suit? So basically the CRD should not matter event in that case.
But if something goes wrong you see the problem, a never try to cleanup things just run `kind delete cluster && kind create cluster' ant start with a fresh one.

But again I'm fine with this, if this helps anybody let's have it!
(Just had a feeling that I'm missing something, since never had a need for it :) )

@metacosm
Copy link
Collaborator

Says who? People might run these tests locally. I know I do when I see them failing in CI and having to clean up after them is tiring :)

Not sure if following, are you ever re-running on the same kind / minikube instance when tests failed? I mean when running the whole test suit? So basically the CRD should not matter event in that case. But if something goes wrong you see the problem, a never try to cleanup things just run `kind delete cluster && kind create cluster' ant start with a fresh one.

That's your use case, not mine. I don't use kind. So yes, I don't destroy my cluster after each test so I expect the tests to clean after themselves. That's just good practice.

But again I'm fine with this, if this helps anybody let's have it! (Just had a feeling that I'm missing something, since never had a need for it :) )

@xstefank
Copy link
Collaborator Author

Exactly what @metacosm said, tests should clean after themselves. I noticed this with minikube, which is not so fast to delete & restart. Of course, it could be cleaned manually, but that's precisely the point: tests should not leave any resources hanging in the void.

@csviri
Copy link
Collaborator

csviri commented Feb 17, 2025

Sure, as I said agree, until it helps anybody and does not have any side effects, we should have it. (Of course I agree with the principle that if possible tests should cleanup after themself).

First of just was bit curious, since for me did not cause any issues, that I assume because the nature how CRDs are defined in K8S. Also playing a bit a devils advocate, so we have here all the arguments document at least as a form of discussion. Sorry if it look like I challenge what I should not / looks obvious :)

@csviri
Copy link
Collaborator

csviri commented Feb 17, 2025

Also we could have this implemented also for ClusterDeployedOperatorExtension, so if we apply the CRD there, we might want to delete it also there.

@csviri
Copy link
Collaborator

csviri commented Feb 17, 2025

Why I also though it should be opt in, since it is a change in behavior, while API is backwards compatible the behavior changes, but on the other hand I assume it won't cause issue for the users, so fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants