-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Write next cluster state fully on all failures #73631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write next cluster state fully on all failures #73631
Conversation
Today we do not set the `LucenePersistedState#writeNextStateFully` flag on all failures, notably on an `OutOfMemoryError`. Since we don't exit immediately on an OOME we may have failed part-way through writing a full state but still proceed with another apparently-incremental write. With this commit we ensure `LucenePersistedState#writeNextStateFully` is only set if the previous write was successful.
Pinging @elastic/es-distributed (Team:Distributed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Ideally we would add a test that also exercises an Error
in writeIncrementalStateAndCommit
and writeIncrementalTermUpdateAndCommit
. The former is probably "easy" since it also uses big arrays, whereas I can see the latter needing to mock more. I think this can go in without it though if you prefer.
👍 I generalised the test to behave a bit more randomly to cover a few more branches, see f04ea2f. There's no easy way to test the term update. I pondered the failure modes of |
Today we do not set the `LucenePersistedState#writeNextStateFully` flag on all failures, notably on an `OutOfMemoryError`. Since we don't exit immediately on an OOME we may have failed part-way through writing a full state but still proceed with another apparently-incremental write. With this commit we ensure `LucenePersistedState#writeNextStateFully` is only set if the previous write was successful.
Today we do not set the `LucenePersistedState#writeNextStateFully` flag on all failures, notably on an `OutOfMemoryError`. Since we don't exit immediately on an OOME we may have failed part-way through writing a full state but still proceed with another apparently-incremental write. With this commit we ensure `LucenePersistedState#writeNextStateFully` is only set if the previous write was successful.
Today we do not set the
LucenePersistedState#writeNextStateFully
flagon all failures, notably on an
OutOfMemoryError
. Since we don't exitimmediately on an OOME we may have failed part-way through writing a
full state but still proceed with another apparently-incremental write.
With this commit we ensure
LucenePersistedState#writeNextStateFully
isonly set if the previous write was successful.