Fix binary docvalue_fields with padding #70826
Pinging @elastic/es-search (Team:Search)
2a4599a to ffbbc0c
Previously, docvalue_fields for binary values with padding did not output the padding. We consider this a bug because: 1) ES would not be able to parse these values; 2) the output from source filtering and the fields API is different and does include padding. This patch fixes this by outputting padding for binary docvalue_fields where it is present. Closes elastic#70244
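To make the difference concrete, here is a minimal sketch of what "with padding" and "without padding" look like for the same bytes. It uses only the JDK's `java.util.Base64` and is not the Elasticsearch code path; the class and method names (`Base64PaddingDemo`, `withPadding`, `withoutPadding`) are made up for illustration.

```java
import java.util.Base64;

public class Base64PaddingDemo {

    // Encodes with trailing '=' padding, so the output length is always a multiple of 4.
    static String withPadding(byte[] value) {
        return Base64.getEncoder().encodeToString(value);
    }

    // Encodes the same bytes but drops the trailing '=' characters.
    static String withoutPadding(byte[] value) {
        return Base64.getEncoder().withoutPadding().encodeToString(value);
    }

    public static void main(String[] args) {
        // Four bytes do not divide evenly into 3-byte groups, so padding is needed.
        byte[] value = {1, 2, 3, 4};
        System.out.println(withPadding(value));    // AQIDBA==
        System.out.println(withoutPadding(value)); // AQIDBA
    }
}
```

The two strings encode the same bytes, but a client comparing the docvalue_fields output with the source or fields API output would see them as different values.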
ffbbc0c to 0f5de93
Would you like to enable BinaryFieldMapperTests#generateRandomInputValue in this PR too? It was disabled because of this. I think it'd be as simple as making that method return some random binary instead of assumeFalse.
rest-api-spec/src/main/resources/rest-api-spec/test/search/350_binary_field.yml
server/src/test/java/org/elasticsearch/index/mapper/BinaryFieldMapperTests.java
if (rarely()) {
    return null;
} else {
    byte[] value = randomAlphaOfLengthBetween(1, 30).getBytes(StandardCharsets.UTF_8);
I think it'd be nice to also try the case where these aren't utf-8 bytes. Just some byte array of pure random stuff.
Great comment! I guess it makes sense to just always use some random bytes. That is what the binary field is intended for: storing random binary values.
Addressed in 3aa1286
Thanks for iterating! I left a request to add one more case. We have a surprising amount of code that thinks all byte arrays are UTF-8 characters. I'm sort of hoping to create a trap for code like that.
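A rough sketch of why random bytes act as such a trap: bytes that are not valid UTF-8 do not survive a round trip through a `String`, so any code that silently decodes them gets caught. This uses plain `java.util.Random` rather than the test framework's randomness helpers, and the names here (`RandomBytesDemo`, `randomBytes`, `survivesUtf8RoundTrip`) are hypothetical, not taken from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

public class RandomBytesDemo {

    // Returns an array of arbitrary bytes with a length in [minLength, maxLength].
    static byte[] randomBytes(Random random, int minLength, int maxLength) {
        byte[] value = new byte[minLength + random.nextInt(maxLength - minLength + 1)];
        random.nextBytes(value);
        return value;
    }

    // True only if decoding the bytes as UTF-8 and re-encoding yields the same bytes.
    static boolean survivesUtf8RoundTrip(byte[] value) {
        byte[] roundTripped = new String(value, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(value, roundTripped);
    }

    public static void main(String[] args) {
        // Valid UTF-8 text survives the round trip.
        System.out.println(survivesUtf8RoundTrip("abc".getBytes(StandardCharsets.UTF_8))); // true
        // 0xFF and 0xFE can never appear in valid UTF-8, so they are replaced on decode.
        System.out.println(survivesUtf8RoundTrip(new byte[] {(byte) 0xFF, (byte) 0xFE})); // false
        // Arbitrary bytes, like the values a binary field is meant to hold.
        System.out.println(survivesUtf8RoundTrip(randomBytes(new Random(), 1, 30)));
    }
}
```

Values generated from `randomAlphaOfLengthBetween` are always valid UTF-8, so they would never expose this kind of bug.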
LGTM. Thanks!
Previously, docvalue_fields for binary values with padding did not output the padding. We consider this a bug because: 1) ES would not be able to parse these values; 2) the output from source filtering and the fields API is different and does include padding. This patch fixes this by outputting padding for binary docvalue_fields where it is present. Closes elastic#70244 Backport for elastic#70826