Skip to content

Remove node and id from list task response #31253

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nik9000 opened this issue Jun 11, 2018 · 9 comments
Open

Remove node and id from list task response #31253

nik9000 opened this issue Jun 11, 2018 · 9 comments
Labels
>breaking :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. stalled Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@nik9000
Copy link
Member

nik9000 commented Jun 11, 2018

Right now the list tasks response returns both a node (string) and an id (number). The trouble is that everywhere else we refer to the task id as a string. The fact that the list tasks response has both of these elements implies some contract for the shape of a task id when we do not intend for there to be one. Currently the task id consists of the node, a :, and then the id but we might change it in the future without any prior warning.

I think we should remove node and id from the lists tasks response entirely and users should use the object's name in the response as the id.

This is what the response looks like now:

{
    "nodes": {
        "NCvmGYS-RsW2X8JxEYumgA": {
            "name": "NCvmGYS",
            "transport_address": "127.0.0.1:9300",
            "host": "127.0.0.1",
            "ip": "127.0.0.1:9300",
            "roles": [
                "master",
                "data",
                "ingest"
            ],
            "tasks": {
                "NCvmGYS-RsW2X8JxEYumgA:1204320": {
                    "node": "NCvmGYS-RsW2X8JxEYumgA",
                    "id": 1204320,
                    "type": "transport",
                    "action": "indices:data/write/delete/byquery",
                    "start_time_in_millis": 1528310404882,
                    "running_time_in_nanos": 24218556132,
                    "cancellable": true
                }
            }
        }
    }
}

And I want it to be:

{
    "nodes": {
        "NCvmGYS-RsW2X8JxEYumgA": {
            "name": "NCvmGYS",
            "transport_address": "127.0.0.1:9300",
            "host": "127.0.0.1",
            "ip": "127.0.0.1:9300",
            "roles": [
                "master",
                "data",
                "ingest"
            ],
            "tasks": {
                "NCvmGYS-RsW2X8JxEYumgA:1204320": {
                    "type": "transport",
                    "action": "indices:data/write/delete/byquery",
                    "start_time_in_millis": 1528310404882,
                    "running_time_in_nanos": 24218556132,
                    "cancellable": true
                }
            }
        }
    }
}
@nik9000 nik9000 added >breaking :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. v7.0.0 labels Jun 11, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@diaojinggang
Copy link

diaojinggang commented Jun 22, 2018

When we designed the response format in the very beginning. Why don't we use a format like this? Give node_id and task_id proper keys.

{
    "nodes": [
         {  "node_id" : "NCvmGYS-RsW2X8JxEYumgA",
            "name": "NCvmGYS",
            "transport_address": "127.0.0.1:9300",
            "host": "127.0.0.1",
            "ip": "127.0.0.1:9300",
            "roles": [
                "master",
                "data",
                "ingest"
            ],
            "tasks": [
                {   "task_id": "NCvmGYS-RsW2X8JxEYumgA:1204320",
                    "type": "transport",
                    "action": "indices:data/write/delete/byquery",
                    "start_time_in_millis": 1528310404882,
                    "running_time_in_nanos": 24218556132,
                    "cancellable": true
                }
            ]
        }
    ]
}

My use case was I need to extract the task_id "NCvmGYS-RsW2X8JxEYumgA:1204320". But there is no key for `"NCvmGYS-RsW2X8JxEYumgA:1204320". It took me extra five minutes to figure out how to extract that info without a key-value pair format. It's not a big deal but I am curious why we did not use the key-value format when we designed it? Just to save space?

@nik9000
Copy link
Member Author

nik9000 commented Jun 22, 2018

The task id is meant to be an implementation detail. We can change how they are formed at any time so long as the old versions of Elasticsearch continue to route the task fetch requests to the right node. We have no plans to change it but we wanted to keep our options open.

@DaveCTurner
Copy link
Contributor

I have marked this as team-discuss because I think it is a good idea and want to discuss with the team how we should approach this while preserving BWC.

@DaveCTurner
Copy link
Contributor

We continue to think this is a good idea, but the BWC issues make it tricky right now. However, we are forming a more general plan around breaking changes to response formats like this, and think that this work will let us address exactly this issue. Marking this as stalled until that work lands.

@rjernst rjernst added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 4, 2020
@arteam arteam added v8.1.0 and removed v8.0.0 labels Jan 12, 2022
@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx
Copy link
Member

tlrx commented Jul 29, 2022

@DaveCTurner do you remember why we think it is a good idea? We discussed this issue in smaller group today and our feeling is that this issue gained not much attraction in the last year and introduces some breaking/compatibility work that is maybe not worth to effort given our future prioritized work. I'm tempted to close it for now.

@DaveCTurner
Copy link
Contributor

do you remember why we think it is a good idea?

Users should treat task IDs as an opaque string such as NCvmGYS-RsW2X8JxEYumgA:1204320, and should avoid thinking about them as a node ID prepended onto some numeric value because we might want to change that format in future.

I'd think it would be ok to keep the node field (it makes sense to think about the node that's running a task) but we should drop the id field. Now that we have a versioned REST API I think we can do this fairly gracefully - only include the id field in API versions ≤8, but drop it in v9+.

@nik9000
Copy link
Member Author

nik9000 commented Jul 29, 2022

I'd think it would be ok to keep the node field (it makes sense to think about the node that's running a task) but we should drop the id field.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. stalled Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

No branches or pull requests