-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Preserve Task Id for ML Datafeed #54943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserve Task Id for ML Datafeed #54943
Conversation
Pinging @elastic/ml-core (:ml) |
Hi @albertzaharovits we have other places we are using a client - specifically indexing the results and model state- should we be using |
Maybe we should in theory, but I think this is a far reaching and high risk change. I suspect we will be spending a lot of time clearing up side effects. Our tasks are very complex, fire off numerous sub-tasks, and we have dedicated open/close/stop/start APIs for orchestrating them. I strongly suspect we will get into a mess with task cancellation if we try to link our controlling job/datafeed tasks to every sub-operation they perform. @dimitris-athanasiou has already run into this with the reindex task that data frame analytics jobs kick off. |
I have not yet looked into the data frame analytics tasks and the transform tasks, but I will do that in a follow-up. My end goal is to track all client requests for all persistent tasks in the same way, but I don't see an overarching way to achieve it, without looking into every case. Do you reckon this is a valid strategy? This PR is only for the DataFeed persistent task, do you reckon this achieves the scope? I wonder how should we test it, I don't have a good plan for it. |
In terms of tracking tasks I think this is a very useful change and valuable for debugging Datafeed issues as we should now know which queries belong to which datafeed. We already handle task cancellation for the Datafeed and AD jobs which gracefully stops the datafeed waiting any executing query to finish. How does task cancellation work? If you cancel the parent task are all child tasks then cancelled or is it up to the cancellation handler of the parent task to manage the children? I agree managing the AD tasks and processes is a very complicated can of worms and echo @droberts195 caution about side effects but if this change is just about tracking and doesn't affect how task cancellation works then it is good |
I don't see anything in the docs but looking at TransportCancelTasksAction I think we have a problem in that the child tasks are also cancelled if My understanding of the current situation for datafeeds is that because the search tasks are not child tasks of the datafeed they will not be cancelled automatically and the graceful shutdown works. We could change @albertzaharovits @droberts195 what do you think? I think it is worth trying for the benefit of tracking which tasks are started by the datafeed |
@elasticmachine run elasticsearch-ci/2 |
Good point @davidkyle ! I also agree that because Datafeeds already implement graceful task cancellation we should make |
Going back to the reason why this change is being done, #52314, it looks like it's most important that the sub-tasks that get parented to the persistent task are the ones that run with the credentials of the user that created the configuration. The sub-tasks that run as an internal user don't need to be changed to assist with accurate auditing, but maybe could be changed for consistency. We have had so many problems over the years with obscure edge cases causing ML jobs/datafeeds to not stop. Ironically the failure in
Yes, I agree. This is the least risky change from where we are today. Please can you try adding that to this PR and see if any tests related to datafeeds fail as a result. If not then I think this PR is good. To clean the audit logs similar PRs will probably be needed for data frame analytics and transforms. |
Thank you for the feedback @droberts195 ! Very much appreciated.
Yes, this is what I had in mind as well. Trace the requests using the client's credentials, but don't try too hard to avoid the other internal requests doing the same.
LOL |
@droberts195 I've made the changes as suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change preserves the task id for internal requests for the `StartDatafeedPersistentTask`. Task ids are a way to express a relationship between related internal requests. In this particular case, the task ids are used for debugging and (soon) security auditing, but not for task cancellation, because there is already a graceful-shutdown of child internal requests (given a task id) in place.
This change makes sure that all internal client requests spawned by the data frame analytics persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because data frame analytics taks already implements graceful shutdown of child tasks, this change does not interfere with it by opting out of the persistent task cancellation of child tasks. Relates #54943 #52314
This change makes sure that all internal client requests spawned by the data frame analytics persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because data frame analytics taks already implements graceful shutdown of child tasks, this change does not interfere with it by opting out of the persistent task cancellation of child tasks. Relates #54943 #52314
This change ensures that internal client requests spawned by the transform persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because transform tasks already implements graceful shutdown of the child tasks, this change does not interfere with that by opting out of the persistent task cancellation of child tasks. Relates #55046 #54943 #52314 Closes #54957
This change ensures that internal client requests spawned by the transform persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because transform tasks already implements graceful shutdown of the child tasks, this change does not interfere with that by opting out of the persistent task cancellation of child tasks. Relates #55046 #54943 #52314 Closes #54957
Task ids are a way to express a relationship between related internal requests.
This change preserves the task id for internal requests of the
StartDatafeedPersistentTask
, similarly to howRollupJobPersistentTask
does it.Relates #52314