Skip to content

[7.x][ML] Prevent data frame analytics freeze after loading data (#72412) #72449

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

dimitris-athanasiou
Copy link
Contributor

When we flush the input stream the java side writes
enough spaces to fill the input stream buffer.
However, in the case of data frame analytics, this may
cause the job to freeze. The reason is that java
writes the data and flushes in the same thread that
goes on to then restore the state. However, when
c++ reads in the end-of-data control message, it stops
reading from the stream and goes on to perform the analysis.
If the 8KB of spaces do not fit in the OS buffer for the
names pipe, the java side blocks. It never proceeds with
restoring the state and this causes a job that is
being restarted and has state to freeze.

In elastic/ml-cpp#1881 the buffer has been reduced to 2KB.
This means the buffer is smaller than the buffer of all supported OS.
Note that it is 4KB on Windows.

Thus in this commit we also reduce the number of spaces we write
in order to flush the buffer to match that of the buffer size.

Closes #70698

Backport of #72412

When we flush the input stream the java side writes
enough spaces to fill the input stream buffer.
However, in the case of data frame analytics, this may
cause the job to freeze. The reason is that java
writes the data and flushes in the same thread that
goes on to then restore the state. However, when
c++ reads in the end-of-data control message, it stops
reading from the stream and goes on to perform the analysis.
If the 8KB of spaces do not fit in the OS buffer for the
names pipe, the java side blocks. It never proceeds with
restoring the state and this causes a job that is
being restarted and has state to freeze.

In elastic/ml-cpp#1881 the buffer has been reduced to 2KB.
This means the buffer is smaller than the buffer of all supported OS.
Note that it is 4KB on Windows.

Thus in this commit we also reduce the number of spaces we write
in order to flush the buffer to match that of the buffer size.

Closes elastic#70698

Backport of elastic#72412
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Apr 29, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@dimitris-athanasiou dimitris-athanasiou changed the title [7.x][ML] Reduce input stream buffer from 8KB to 2KB (#72412) [7.x][ML] Prevent data frame analytics freeze after loading data (#72412) Apr 29, 2021
@dimitris-athanasiou dimitris-athanasiou merged commit 78f18c0 into elastic:7.x Apr 29, 2021
@dimitris-athanasiou dimitris-athanasiou deleted the prevent-dfa-freezing-due-to-blocking-on-flush-7x branch April 29, 2021 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport >bug :ml Machine learning Team:ML Meta label for the ML team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants