Don't allow descriptionless assertion statements in production code #68616
Comments
Pinging @elastic/es-delivery (Team:Delivery) |
I've added the |
Please don't do this. Assertions are extremely powerful pieces of documentation that make reading and reasoning about code so much easier. They protect against future changes that end up calling code in ways that the original author did not anticipate. If anything, IMO we should be finding ways to use assertions more, not discouraging their use. We use assertions to verify invariants of the system that would be too expensive to compute in production code, so simply replacing them with exceptions would not be feasible.
I'd rather not do this either.
diff --git a/server/src/main/java/org/elasticsearch/cluster/coordination/CoordinationState.java b/server/src/main/java/org/elasticsearch/cluster/coordination/CoordinationState.java
index 51e634893be..dc0766f80eb 100644
--- a/server/src/main/java/org/elasticsearch/cluster/coordination/CoordinationState.java
+++ b/server/src/main/java/org/elasticsearch/cluster/coordination/CoordinationState.java
@@ -137,14 +137,14 @@ public class CoordinationState {
         assert getLastCommittedConfiguration().isEmpty() : getLastCommittedConfiguration();
         assert lastPublishedVersion == 0 : lastPublishedVersion;
         assert lastPublishedConfiguration.isEmpty() : lastPublishedConfiguration;
-        assert electionWon == false;
+        assert electionWon == false : "election already won";
         assert joinVotes.isEmpty() : joinVotes;
         assert publishVotes.isEmpty() : publishVotes;
         assert initialState.term() == 0 : initialState + " should have term 0";
         assert initialState.version() == getLastAcceptedVersion() : initialState + " should have version " + getLastAcceptedVersion();
-        assert initialState.getLastAcceptedConfiguration().isEmpty() == false;
-        assert initialState.getLastCommittedConfiguration().isEmpty() == false;
+        assert initialState.getLastAcceptedConfiguration().isEmpty() == false : "empty last-accepted config";
+        assert initialState.getLastCommittedConfiguration().isEmpty() == false : "empty last-committed config";
         persistedState.setLastAcceptedState(initialState);
     }
I don't think the proposed extra ceremony adds any value over and above the line number on which the assertion tripped. |
I don't see how assertions are a better solution to this than throwing exceptions. Additionally, exceptions can be better documented in the Javadoc for particular APIs. We don't need to make them checked exceptions, so there's no additional load for callers.
In what way is this more expensive? Either way we are evaluating a boolean expression, then throwing an error which involves filling stacktraces, etc.
This assumes folks readily have the ES code available to them. When/if these assertions are tripped in production we should at least give folks looking at the logs an indication of what exploded, no? Having an error with a |
Perhaps it would help if you could share some more context around why you've raised this for discussion. There are no linked tickets or anything, but I feel you might be talking about a specific encounter with an uninformative
The two solutions express very different intents. Assertions say "this is always true, there are other mechanisms in the surrounding code that guarantee it, if it's false then that's definitely a bug, fail the test" whereas conditional exceptions say "this might legitimately be false; rejecting it is not necessarily bad, it's up to the caller to handle our rejection". Essentially nothing handles
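A minimal Java sketch of the two intents contrasted above. The class TermHolder and its members are hypothetical, invented for illustration only; they are not Elasticsearch code.

```java
// Hypothetical example class; not taken from the Elasticsearch code base.
class TermHolder {
    private long term = 0;

    /** The caller may legitimately pass a stale term, so reject it with an exception it can handle. */
    public void advanceTo(long newTerm) {
        if (newTerm < term) {
            throw new IllegalArgumentException("stale term " + newTerm + ", current term is " + term);
        }
        term = newTerm;
        // Invariant: term starts at 0 and never decreases, so it can never be negative.
        // If this trips, it is a bug in this class rather than bad input.
        // It is only evaluated when the JVM runs with assertions enabled (-ea).
        assert term >= 0 : "negative term " + term;
    }

    public long term() {
        return term;
    }
}
```

The conditional exception is part of the method's contract and behaves the same in every environment, whereas the assert documents something the surrounding code is supposed to guarantee and is skipped entirely when assertions are disabled.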
We have much more complicated/expensive things that are only checked if assertions are enabled. For instance the machinery for catching concurrency bugs on the indexing path, based around things like this (which has caught bugs): elasticsearch/server/src/main/java/org/elasticsearch/index/engine/LiveVersionMap.java Lines 184 to 186 in 2d1e8b3
This is on the critical path for indexing. I don't think it's feasible to make these checks in production, but it would also be terrible to lose these checks entirely, or even to try and get similarly strong coverage by writing focussed tests instead. It's so much more powerful to have every test verify these things. This is a similar example: elasticsearch/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java Lines 763 to 848 in 1b6ad96
We don't want to check all this jazz every time we take or drop that mutex in production, it should not be necessary since we're confident that our test coverage would catch a problem here before release, but we absolutely want these checks to happen in every test that does any sort of replication to give us that confidence. Also the boolean expression itself might be super-expensive to compute. For instance (this one has also caught bugs): elasticsearch/server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java Lines 1115 to 1124 in 2d1e8b3
We could technically solve these things without assertions if we had some kind of global flag that is only enabled in tests that skips over the expensive test-only checks in production, and used these checks to throw an exception that nobody ever caught, and maybe even had some warnings in the IDE to make sure we didn't inadvertently change behaviour based on whether the checks were skipped or not, and also added compiler support to reduce the syntactic overhead of the checks, but we already have this functionality today (it's assertions) and if we reinvented it I don't see that we'd have made any progress on the problem that a failure might not contain all the information needed to diagnose what went wrong.
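The kind of test-only invariant check described above is often written as a boolean method invoked from an assert statement, so that the whole check disappears when assertions are disabled. A rough sketch, assuming a hypothetical SortedBuffer class (not Elasticsearch code) whose invariant is linear in its size:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; not taken from the Elasticsearch code base.
class SortedBuffer {
    private final List<Integer> values = new ArrayList<>();

    /** Inserts the value at its sorted position, found by binary search. */
    public void add(int value) {
        int pos = Collections.binarySearch(values, value);
        if (pos < 0) {
            pos = -pos - 1; // binarySearch returns -(insertion point) - 1 when the value is absent
        }
        values.add(pos, value);
        // The O(n) scan below runs only when assertions are enabled.
        assert invariant();
    }

    /** Side-effect-free check; always returns true so it can be used as an assert expression. */
    private boolean invariant() {
        for (int i = 1; i < values.size(); i++) {
            assert values.get(i - 1) <= values.get(i) : "out of order at index " + i + ": " + values;
        }
        return true;
    }

    public List<Integer> snapshot() {
        return new ArrayList<>(values);
    }
}
```

With assertions disabled, add costs only the binary search and the insertion; with assertions enabled, every call from every test also verifies the ordering.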
Wait, what? Surely we're only talking about tests? We absolutely shouldn't have folks running Elasticsearch in production with assertions enabled, not least because of the terrible performance consequences. We discussed adding a bootstrap check for this (#66271) which I argued against on the grounds that it would be trivial to bypass, but I could be persuaded if we're encountering production clusters with assertions enabled too often, and I would definitely change my position if the alternative is banning assertions altogether.
Debugging an |
I think I was working under the incorrect assumption that we enable assertions in prod, which we don't, so this makes sense. If the primary point of assertions is to catch when ES developers break certain assumptions in tests, that makes a lot more sense. Sorry for the misdirection here.
Yes, you are absolutely correct. Back to your original statement:
I think the trouble I have is that I'm used to a strict "no assertions in production code" due to the fact that enabling/disabling them obviously has behavioral consequences at runtime. Depending on how we are doing exception handling, it may be that even our tests are swallowing some |
Like @DaveCTurner, I think we should definitely keep using assertions. I'm However, we have a lot of branches, and line numbers change with some frequency, so if there are real cases where people are finding it hard to track down the cause of an assertion purely from a file + line number, then I'd be OK with requiring messages. |
Aha ok yes this makes a lot more sense now.
Indeed, this is a real concern, we have had bugs due to this (e.g. #29585). They're pretty rare tho, given how many assertions there are in the codebase and how few of them have meaningful side-effects that result in bugs, vs how many bugs we catch with assertions, so on balance I still think assertions are a good thing. My IntelliJ warns on assertions that obviously have side effects. It's not perfect of course but it does catch a lot of cases. Maybe we should fail the build on that warning?
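To make the side-effect concern concrete, here is a small sketch (hypothetical class and method names, not the code from the linked bug) showing how an assertion with a side effect changes behaviour depending on whether -ea is set:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class; not taken from the Elasticsearch code base.
class SideEffectDemo {
    // Problematic: the removal happens inside the assert expression,
    // so it is skipped entirely when assertions are disabled.
    static int removeInsideAssert(Set<String> pending, String key) {
        assert pending.remove(key) : key + " was not pending";
        return pending.size();
    }

    // Safer: do the work unconditionally, then assert on the side-effect-free result.
    static int removeThenAssert(Set<String> pending, String key) {
        boolean removed = pending.remove(key);
        assert removed : key + " was not pending";
        return pending.size();
    }

    public static void main(String[] args) {
        // The first result depends on whether the JVM was started with -ea; the second does not.
        System.out.println(removeInsideAssert(new HashSet<>(Arrays.asList("a", "b")), "a"));
        System.out.println(removeThenAssert(new HashSet<>(Arrays.asList("a", "b")), "a"));
    }
}
```

A test suite that always runs with assertions enabled never sees the first method misbehave, which is why this class of bug tends to surface only in production.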
Yes, catching |
That seems like something we should do, at least for non-test code. Any assertion statements should be side-effect free for obvious reasons. |
IIRC we discussed this and preferred to run our tests with and without assertions enabled. |
Hmm, I wasn't aware of this. I presume we would only be able to do this for external cluster tests, unless we are certain we don't ever use |
@mark-vieira I think @jasontedor suggested this. |
Indeed, we have no chance to catch the assertion with side effects issues (of which we've had a few over the years) in CI if we only run there with assertions enabled. The old suggestion of mine is indeed that we run the external cluster tests with assertions disabled. I don't think elasticsearch/server/src/main/java/org/elasticsearch/Assertions.java Lines 24 to 34 in 8fff763
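The embedded lines from Assertions.java did not survive in this thread. For readers unfamiliar with the technique, a widely used Java idiom for detecting at runtime whether assertions are enabled looks roughly like the sketch below. The class name AssertionStatus is hypothetical, and this is not claimed to be the content of the referenced file.

```java
// Hypothetical example class illustrating a common idiom.
final class AssertionStatus {
    private AssertionStatus() {}

    /** True only when this class was initialized with assertions enabled. */
    static final boolean ENABLED;

    static {
        boolean enabled = false;
        // Deliberate side effect: the assignment is evaluated only under -ea.
        assert enabled = true;
        ENABLED = enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + ENABLED);
    }
}
```

This is one of the rare places where a side effect inside an assert is intentional, so any build rule that rejects side-effecting assertions would need a way to exempt it.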
|
@DaveCTurner I'm curious how you reconcile this suggestion with the example from |
+1 for keeping it the way it is today, for the reasons @DaveCTurner already pointed out. I rarely use assertions, but there are a few use cases where I want to fail in CI but not in prod. Examples are around optimistic locking and certain optimization execution paths. Those are not catastrophic failures but good to know for debugging. I could live with enforcing a message; however, I trust the developer/code reviewer to add them if they add value. |
I envisioned less a separate list of exceptions and more that one could suppress individual cases at the call site, like we do with other generally-good-but-not-universal ideas like forbidden-APIs and |
Alright, I think at best we don't have strong agreement on this, but we do generally acknowledge that issues with assertion side-effects can lead to production issues. As Jason has described, I've opened #69067 to specifically address that, and I'll close this issue for now. |
We use assertions instead of exceptions in various places throughout the Elasticsearch code base. We should probably enforce a convention here: either a) explicitly disallow the use of assertions altogether (even though we force -ea) and use exceptions instead, or b) require that a description message be provided with the assertion to avoid things like java.lang.AssertionError: null in our logs.
I haven't really looked into it but this seems like something we could enforce with checkstyle.
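As an illustration of option b), the sketch below shows where the "null" comes from: an assert without a detail expression produces an AssertionError whose message is null. The class and helper names are hypothetical, and the asserts only fire when the JVM is started with -ea.

```java
// Hypothetical example class; not taken from the Elasticsearch code base.
class AssertMessageDemo {
    /** Runs the check and returns the error's message as text, or "not thrown" if nothing was thrown. */
    static String messageOf(Runnable check) {
        try {
            check.run();
            return "not thrown";
        } catch (AssertionError e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        boolean electionWon = true;
        // Without a detail expression the AssertionError carries no message (null) under -ea.
        System.out.println(messageOf(() -> { assert electionWon == false; }));
        // With a detail expression the message says what went wrong.
        System.out.println(messageOf(() -> { assert electionWon == false : "election already won"; }));
    }
}
```

Run without -ea, neither assert is evaluated and both calls report "not thrown".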