[FEATURE][ML] Add checksum checks on dataframe result joining #37259

dimitris-athanasiou
Contributor

To sanity-check that analytics results are joined with their
corresponding dataframe rows, we write a 32-bit hash of each
document id to the C++ process, which includes it in the
results. When joining, we check that the id hashes match.
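
For illustration, the id-hash check described above could look roughly like the sketch below. This is not the code in this PR: the class and method names are hypothetical, and CRC32 is only an assumed stand-in for whichever 32-bit hash is actually used.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper; names are illustrative and not taken from this PR.
final class IdHashSketch {

    // 32-bit hash of a document id. CRC32 is an assumption here;
    // any stable 32-bit hash would serve the same purpose.
    static int hashId(String docId) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue();
    }

    // True when the hash echoed back with a result matches the id of the
    // document we are about to join that result to.
    static boolean matches(String docId, int echoedHash) {
        return hashId(docId) == echoedHash;
    }

    public static void main(String[] args) {
        int sent = hashId("doc-1");                  // written alongside the row
        System.out.println(matches("doc-1", sent));  // result lines up with its row
        System.out.println(matches("doc-2", sent));  // result would land on the wrong row
    }
}
```

The point of the check is only to detect misaligned joins early; it is a sanity check, not a guarantee, since a 32-bit hash can collide.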

@dimitris-athanasiou dimitris-athanasiou added the :ml Machine learning label Jan 9, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@dimitris-athanasiou
Contributor Author

This is the Java side of elastic/ml-cpp#358

@dimitris-athanasiou dimitris-athanasiou changed the title from "[FEATURE][ML] Add id hash checks on dataframe result joining" to "[FEATURE][ML] Add checksum checks on dataframe result joining" Jan 11, 2019
@dimitris-athanasiou
Contributor Author

I've pushed a commit to actually use a checksum of all relevant values instead of just the document id.
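
As a rough sketch of what a checksum over all relevant values might look like (again hypothetical: the names, the CRC32 choice, and the representation of a row as an id plus string values are all assumptions, not what the commit does):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper; names are illustrative and not taken from this PR.
final class RowChecksumSketch {

    // 32-bit checksum over the document id and every value sent for the row.
    // CRC32 and the string representation of values are assumptions.
    static int checksum(String docId, String[] values) {
        CRC32 crc = new CRC32();
        add(crc, docId);
        for (String value : values) {
            add(crc, value);
        }
        return (int) crc.getValue();
    }

    // Feeds one field into the checksum, followed by a zero byte so that
    // ("ab", "c") and ("a", "bc") do not produce the same byte stream.
    private static void add(CRC32 crc, String field) {
        crc.update(field.getBytes(StandardCharsets.UTF_8));
        crc.update(0);
    }

    public static void main(String[] args) {
        int sent = checksum("doc-1", new String[] {"1.0", "3.5"});
        // On joining, recompute from the row we think the result belongs to.
        int recomputed = checksum("doc-1", new String[] {"1.0", "3.5"});
        System.out.println(sent == recomputed);
    }
}
```

Compared with hashing only the id, this also flags the case where the id lines up but the values read back for the row differ from the ones that were analysed.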

Contributor

@tveasey tveasey left a comment

LGTM

Contributor

@droberts195 droberts195 left a comment

LGTM

@dimitris-athanasiou dimitris-athanasiou force-pushed the add-id-hash-checks-on-data-frame-result-joining branch from 13bd906 to c04fa60 Compare January 11, 2019 17:01
@dimitris-athanasiou dimitris-athanasiou merged commit 9fd1d10 into elastic:feature-ml-data-frame-analytics Jan 11, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the add-id-hash-checks-on-data-frame-result-joining branch January 11, 2019 17:15
Labels
:ml Machine learning
5 participants