Simplify BucketedSort/Teach BitArray a useful trick (backport #53199) #53240
Our lovely `BitArray` compactly stores "flags", lazily growing its
underlying storage. It is super useful when you need to store one bit of
data for a zillion buckets or documents or something. Usefully, it
defaults to `false`. But there is a wrinkle! If you ask it whether or
not a bit is set but it hasn't grown its underlying storage array
"around" that index then it'll throw an `ArrayIndexOutOfBoundsException`.

The per-document use cases tend to show up in order and don't tend to
mind this too much. But the use case in aggregations, the per-bucket use
case, does, because buckets are collected out of order all the time.

This changes `BitArray` so it'll return `false` if the index is too big
for the underlying storage. After all, that index can't have been set
or else we would have grown the underlying array. Logically, I believe
this makes sense. And it makes my life easy. At the cost of three lines.

But this adds an extra test to every call to `get`. I think this is
likely ok because it is "very close" to an array index lookup that
already runs the same test. So I think it'll end up merged with the
array bounds check.
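
For illustration, here is a minimal sketch of the behavior described above. It is not the real `BitArray`: `SketchBitArray` is a hypothetical stand-in backed by a plain `long[]` rather than the actual underlying storage, and it assumes non-negative indices. It only shows the shape of the change, where `get` returns `false` for an index past the storage instead of throwing.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for BitArray backed by a plain long[]. */
final class SketchBitArray {
    private long[] words;

    SketchBitArray(int initialBits) {
        // Always allocate at least one 64-bit word.
        words = new long[Math.max(1, wordIndex(initialBits) + 1)];
    }

    /** Which 64-bit word holds the bit at the given (non-negative) index. */
    private static int wordIndex(long index) {
        return (int) (index >>> 6);
    }

    void set(long index) {
        int word = wordIndex(index);
        if (word >= words.length) {
            // Grow lazily, only when a bit past the current storage is set.
            words = Arrays.copyOf(words, Math.max(word + 1, words.length * 2));
        }
        // Java masks a long shift distance to its low 6 bits,
        // so this selects bit (index % 64) within the word.
        words[word] |= 1L << index;
    }

    boolean get(long index) {
        int word = wordIndex(index);
        if (word >= words.length) {
            // Storage never grew this far, so the bit cannot have been set.
            return false;
        }
        return (words[word] & (1L << index)) != 0;
    }
}
```

With this shape, a caller that collects buckets out of order can ask about any bucket ordinal without first checking how far the storage has grown. The extra comparison in `get` sits right next to the array access, which already performs its own bounds check.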