Skip to content

Include reason in TaskCancelledException #74825

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dimitris-athanasiou opened this issue Jul 1, 2021 · 2 comments · Fixed by #75332
Closed

Include reason in TaskCancelledException #74825

dimitris-athanasiou opened this issue Jul 1, 2021 · 2 comments · Fixed by #75332
Labels
:Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@dimitris-athanasiou
Copy link
Contributor

The task framework allows cancelling a task. The cancel task request also accepts a reason. However, that reason is not a part of the TaskCancelledException. Incorporating the reason into the TaskCancelledException would improve the information a user receives when their request fails because the corresponding task was cancelled.

There is a number of places where TaskCancelledException is thrown. For example, here it is thrown because a child task is tried to be started but the parent has been cancelled. The user will see "The parent task was cancelled, shouldn't start any child tasks" but not the reason the task has been cancelled, which would be greatly more informative as to what is going on.

I will also mention an ML use case to highlight the benefit of including the reason.

For ML anomaly detection jobs, we have a reset and a delete API. They both result in the deletion of the results and state of the job, which could be a long running task. If a user calls the delete API while a reset task is ongoing, it makes no sense to fail the call. Delete should take precedence. Thus, the delete API checks if there's a reset task running and cancels it with a reason deleting job. Then the caller of the reset API receives a response that contains the TaskCancelledException. However, that does not include the reason so the user does not get the critical piece of information that their reset was cancelled because some other user deleted the job entirely.

@dimitris-athanasiou dimitris-athanasiou added >enhancement :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. needs:triage Requires assignment of a team area label labels Jul 1, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jul 1, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner DaveCTurner removed the needs:triage Requires assignment of a team area label label Jul 1, 2021
@DaveCTurner
Copy link
Contributor

As things stand today, the task framework only creates a TaskCancelledException in two places and the only one that doesn't include the reason is the one you linked:

throw new TaskCancelledException("The parent task was cancelled, shouldn't start any child tasks");

I think including the reason makes sense here, although in fact deleting job looks to be pretty much the only place that we fill in a meaningful reason for cancellation which I guess is why this hasn't come up much before.

There's a broader question about whether we should leave it up to callers to check for cancellation and construct their own TaskCancelledException as much as we do today, since almost none of these exceptions check the reason for cancellation either.

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Jul 14, 2021
Today when a task is cancelled we record the reason for the cancellation
but this information is very rarely exposed to users. This commit
centralises the construction of the `TaskCancellationException` and
includes the reason in the exception message.

Closes elastic#74825
DaveCTurner added a commit that referenced this issue Jul 27, 2021
Today when a task is cancelled we record the reason for the cancellation
but this information is very rarely exposed to users. This commit
centralises the construction of the `TaskCancellationException` and
includes the reason in the exception message.

Closes #74825
DaveCTurner added a commit that referenced this issue Jul 27, 2021
Today when a task is cancelled we record the reason for the cancellation
but this information is very rarely exposed to users. This commit
centralises the construction of the `TaskCancellationException` and
includes the reason in the exception message.

Closes #74825
ywangd pushed a commit to ywangd/elasticsearch that referenced this issue Jul 30, 2021
Today when a task is cancelled we record the reason for the cancellation
but this information is very rarely exposed to users. This commit
centralises the construction of the `TaskCancellationException` and
includes the reason in the exception message.

Closes elastic#74825
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants