[ML] Skip execution of timed out inference requests waiting in queue #80087


Conversation

dimitris-athanasiou (Contributor)


If by the time we get to execute an inference request the action
has already been notified, it means the request timed out while
it was waiting in the queue. We should return early from the `doRun`
method to avoid unnecessary work.
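The early-return described above can be sketched as follows. This is a minimal illustration, not the actual Elasticsearch code: the `notified` flag, the `onTimeout` handler, and the `doRun` structure here are assumptions standing in for the real request plumbing.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InferenceRequestSketch {
    // Set to true once the request's listener has been notified,
    // e.g. by a timeout firing while the request was still queued.
    private final AtomicBoolean notified = new AtomicBoolean(false);

    // Called by the timeout handler while the request waits in the queue.
    void onTimeout() {
        if (notified.compareAndSet(false, true)) {
            // notify the listener with a timeout failure...
        }
    }

    // Called when the request is finally dequeued for execution.
    // Returns true if the inference work actually ran.
    boolean doRun() {
        if (notified.get()) {
            // The request timed out while queued; skip the expensive
            // inference work entirely.
            return false;
        }
        // ... run tokenization and inference ...
        return true;
    }

    public static void main(String[] args) {
        InferenceRequestSketch r = new InferenceRequestSketch();
        r.onTimeout();                  // request times out in the queue
        System.out.println(r.doRun());  // prints false: work is skipped
    }
}
```

The `compareAndSet` ensures the listener is notified exactly once even if the timeout and the dequeue race.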
@elasticmachine added the Team:ML (Meta label for the ML team) label on Oct 29, 2021
@elasticmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@benwtrent benwtrent self-requested a review October 29, 2021 13:34
@benwtrent (Member) left a comment

I think this is really great. What if we add another check before we process the result? It saves just a touch of unnecessary compute time.

inferenceResultsProcessor.processResult(tokenization, pyTorchResult);

Right before that line
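The reviewer's suggestion amounts to re-checking the timed-out flag immediately before result processing, since a request can also time out while inference is running, not only while queued. A minimal sketch of that idea (the class and field names here are illustrative, not the actual Elasticsearch code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ProcessResultSketch {
    // True once the request's listener has been notified (e.g. by a timeout).
    final AtomicBoolean notified = new AtomicBoolean(false);
    int processedCount = 0; // stands in for real result processing

    // Re-check right before processing, as suggested in the review:
    // if the listener was already notified, drop the late result.
    void onResult(Object tokenization, Object pyTorchResult) {
        if (notified.get()) {
            return; // timed out mid-inference; skip processing entirely
        }
        // would be inferenceResultsProcessor.processResult(tokenization, pyTorchResult)
        processedCount++;
    }

    public static void main(String[] args) {
        ProcessResultSketch s = new ProcessResultSketch();
        s.onResult(null, null);               // processed normally
        s.notified.set(true);                 // request times out mid-flight
        s.onResult(null, null);               // late result is dropped
        System.out.println(s.processedCount); // prints 1
    }
}
```

The saving is small per request, as the reviewer notes, but it avoids pushing a result nobody is waiting for through the processing path.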

@dimitris-athanasiou dimitris-athanasiou merged commit e5251bc into elastic:master Oct 29, 2021
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Oct 29, 2021
…lastic#80087)

@elasticsearchmachine (Collaborator)

💚 Backport successful

Branch: 8.0

elasticsearchmachine pushed a commit that referenced this pull request Oct 29, 2021
…80087) (#80092)

Labels: :ml (Machine learning), >non-issue, Team:ML (Meta label for the ML team), v8.0.0-beta1, v8.1.0
5 participants