Skip to content

[ML] No error when datafeed stops during updating to started #46495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

dimitris-athanasiou
Copy link
Contributor

Investigating the test failure reported in #45518 it appears that
the datafeed task was not found during a tast state update. There
are only two places where such an update is performed: when we set
the state to started and when we set it to stopping. We handle
ResourceNotFoundException in the latter but not in the former.

Thus the test reveals a rare race condition where the datafeed gets
requested to stop before we managed to update its state to started.
I could not reproduce this scenario but it would be my best guess.

This commit catches ResourceNotFoundException while updating the
state to started and lets the task terminate smoothly.

Closes #45518

Investigating the test failure reported in elastic#45518 it appears that
the datafeed task was not found during a tast state update. There
are only two places where such an update is performed: when we set
the state to `started` and when we set it to `stopping`. We handle
`ResourceNotFoundException` in the latter but not in the former.

Thus the test reveals a rare race condition where the datafeed gets
requested to stop before we managed to update its state to `started`.
I could not reproduce this scenario but it would be my best guess.

This commit catches `ResourceNotFoundException` while updating the
state to `started` and lets the task terminate smoothly.

Closes elastic#45518
@dimitris-athanasiou dimitris-athanasiou added >test Issues or PRs that are addressing/adding tests :ml Machine learning v8.0.0 v7.5.0 labels Sep 9, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dimitris-athanasiou dimitris-athanasiou merged commit 5bb796f into elastic:master Sep 10, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the no-error-when-datafeed-stops-before-updating-to-started branch September 10, 2019 13:30
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Sep 10, 2019
…astic#46495)

Investigating the test failure reported in elastic#45518 it appears that
the datafeed task was not found during a tast state update. There
are only two places where such an update is performed: when we set
the state to `started` and when we set it to `stopping`. We handle
`ResourceNotFoundException` in the latter but not in the former.

Thus the test reveals a rare race condition where the datafeed gets
requested to stop before we managed to update its state to `started`.
I could not reproduce this scenario but it would be my best guess.

This commit catches `ResourceNotFoundException` while updating the
state to `started` and lets the task terminate smoothly.

Closes elastic#45518

Backport of elastic#46495
dimitris-athanasiou added a commit that referenced this pull request Sep 11, 2019
…6495) (#46542)

Investigating the test failure reported in #45518 it appears that
the datafeed task was not found during a tast state update. There
are only two places where such an update is performed: when we set
the state to `started` and when we set it to `stopping`. We handle
`ResourceNotFoundException` in the latter but not in the former.

Thus the test reveals a rare race condition where the datafeed gets
requested to stop before we managed to update its state to `started`.
I could not reproduce this scenario but it would be my best guess.

This commit catches `ResourceNotFoundException` while updating the
state to `started` and lets the task terminate smoothly.

Closes #45518

Backport of #46495
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:ml Machine learning >test Issues or PRs that are addressing/adding tests v7.5.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DatafeedJobsIT.testRealtime_multipleStopCalls failure on CI because task doesn't exist
4 participants