Skip to content

Fix binary docvalue_fields with padding #70826

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mayya-sharipova
Copy link
Contributor

Previously docvalue_fields for binary values with paddings did not
output padding. We consider it to be a bug because: 1) es would
not be able parse these values 2) output from source filtering
and fields API is different and does output padding.

This patches fixes this by outputing padding for binary
docvalue_fields where it is present.

Closes #70244

@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Mar 24, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 24, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@mayya-sharipova mayya-sharipova added v8.0.0 v7.13.0 >bug and removed Team:Search Meta label for search team labels Mar 24, 2021
@mayya-sharipova mayya-sharipova force-pushed the binary-fields_doc_values branch from 2a4599a to ffbbc0c Compare March 24, 2021 18:43
Previously docvalue_fields for binary values with paddings did not
output padding. We consider it to be a bug because: 1) es would
not be able parse these values 2) output from source filtering
and fields API is different and does output padding.

This patches fixes this by outputing padding for binary
docvalue_fields where it is present.

Closes elastic#70244
@mayya-sharipova mayya-sharipova force-pushed the binary-fields_doc_values branch from ffbbc0c to 0f5de93 Compare March 24, 2021 20:45
@mayya-sharipova mayya-sharipova requested a review from nik9000 March 26, 2021 13:40
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to enable BinaryFieldMapperTests#generateRandomInputValue in this PR too? It was disabled because of this. I think it'd be as simple as making that method return some random binary instead of assumeFalse.

@mayya-sharipova
Copy link
Contributor Author

@nik9000 Thank you for your review. I've addressed your comments in 73be1d9

if (rarely()) {
return null;
} else {
byte[] value = randomAlphaOfLengthBetween(1, 30).getBytes(StandardCharsets.UTF_8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be nice to also try the case where these aren't utf-8 bytes. Just some byte array of pure random stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comment! I guess it makes sense just always use some random bytes, I guess this is what Binary field is intended for – to store random binary values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 3aa1286

@nik9000
Copy link
Member

nik9000 commented Mar 26, 2021

Thanks for iterating! I left a request to add one more case. We have a surprising amount of code that things all byte arrays are utf-8 characters. I'm sort of hoping to create a trap for code like that.

@nik9000 nik9000 mentioned this pull request Mar 26, 2021
24 tasks
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@mayya-sharipova mayya-sharipova merged commit ccfdbb4 into elastic:master Mar 26, 2021
@mayya-sharipova mayya-sharipova deleted the binary-fields_doc_values branch March 26, 2021 20:18
@mayya-sharipova mayya-sharipova added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Mar 26, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 26, 2021
@mayya-sharipova mayya-sharipova removed the :Search/Search Search-related issues that do not fall into other categories label Mar 26, 2021
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Mar 29, 2021
Previously docvalue_fields for binary values with paddings did not
output padding. We consider it to be a bug because: 1) es would
not be able parse these values 2) output from source filtering
and fields API is different and does output padding.

This patches fixes this by outputing padding for binary
docvalue_fields where it is present.

Closes elastic#70244
Backport for elastic#70826
mayya-sharipova added a commit that referenced this pull request Mar 29, 2021
Previously docvalue_fields for binary values with paddings did not
output padding. We consider it to be a bug because: 1) es would
not be able parse these values 2) output from source filtering
and fields API is different and does output padding.

This patches fixes this by outputing padding for binary
docvalue_fields where it is present.

Closes #70244
Backport for #70826
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fetch fields API inconsistent with doc_values for binary fields
4 participants