Skip to content

update fails silently if script and doc update are both in one line #34069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
robin13 opened this issue Sep 26, 2018 · 5 comments
Closed

update fails silently if script and doc update are both in one line #34069

robin13 opened this issue Sep 26, 2018 · 5 comments
Assignees
Labels
>bug :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. help wanted adoptme

Comments

@robin13
Copy link
Contributor

robin13 commented Sep 26, 2018

Elasticsearch version (bin/elasticsearch --version): 6.4.1

Plugins installed: []

JVM version (java -version): OpenJDK 64-bit Server VM 10.0.2

OS version (uname -a if on a Unix-like system): Linux 4.15.0-34-generic #37-Ubuntu SMP

Description of the problem including expected versus actual behavior:

If you (accidentally) include both a script and an update:doc action in a bulk update, it does not always return the appropriate error and can fail to execute a script silently.

Steps to reproduce:

Setup:

PUT /test
PUT /test/_doc/test
{  "message":"hello world" }

The expected behaviour - when both script and update:doc actions are used in this order:

POST _bulk
{"update":{"_index":"test","_type":"_doc","_id":"test"}}
{"script":{"source":"ctx._source.message = 'set by script'"},"update":{"doc":{"message":"set by update:doc"}}}

Elasticsearch correctly responds with the error:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: can't provide both script and doc;"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: can't provide both script and doc;"
  },
  "status": 400
}

But if you switch the order putting update:doc first, then script:

POST _bulk
{"update":{"_index":"test","_type":"_doc","_id":"test"}}
{"update":{"doc":{"message":"set by update:doc"}},"script":{"source":"ctx._source.message = 'set by script'"}}

It returns successfully:

{
  "took": 9,
  "errors": false,
  "items": [
    {
      "update": {
        "_index": "test",
        "_type": "_doc",
        "_id": "test",
        "_version": 5,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 4,
        "_primary_term": 1,
        "status": 200
      }
    }
  ]
}

But only the update has been done, and the script section has been silently ignored:

GET /test/_doc/test
{
  "_index": "test",
  "_type": "_doc",
  "_id": "test",
  "_version": 5,
  "found": true,
  "_source": {
    "message": "set by update:doc"
  }
}

It would be good to

  1. Fail consistently when both update:doc and script are provided regardless of order
  2. Add explicit details to the documentation noting that only one action can be used per update
@rjernst
Copy link
Member

rjernst commented Sep 26, 2018

The bulk api uses a hand rolled parser that is quite complex. Only the main actions (delete, create, index, update) are looked for. My hunch is the success you got is not silently ignore script failure, but never even parsing the script portion, likely due to the scope of the slice parser used to parse the update portion (see BulkRequest line ~419).

@rjernst rjernst added the :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. label Sep 26, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@original-brownbear original-brownbear self-assigned this Sep 27, 2018
@original-brownbear original-brownbear removed their assignment Oct 15, 2018
@vladimirdolzhenko
Copy link
Contributor

#29293 changed behaviour in 7.0 - it is not possible to specify bulk request as

{"update":{"_index":"test","_type":"_doc","_id":"test"}}
{"update":{"doc":{"message":"set by update:doc"}},"script":{"source":"ctx._source.message = 'set by script'"}}

unnecessary update leads to a parser exception like [UpdateRequest] unknown field [update], parser not found

for 7.0 both following cases throw validation exception like you've got can't provide both script and doc:

{"update":{"_index":"test","_type":"_doc","_id":"test"}}
{"doc":{"message":"set by update:doc"},"script":{"source":"ctx._source.message = 'set by script'"}}

and

{"update":{"_index":"test","_type":"_doc","_id":"test"}}
{"script":{"source":"ctx._source.message = 'set by script'"}, "doc":{"message":"set by update:doc"}}

@vladimirdolzhenko
Copy link
Contributor

@rjernst problem in UpdateRequest#fromXContent and how it parses {"update":{"doc":{"message":"set by update:doc"}}, - everything that goes after comma is ignored.

vladimirdolzhenko added a commit to vladimirdolzhenko/elasticsearch that referenced this issue Nov 5, 2018
@vladimirdolzhenko vladimirdolzhenko self-assigned this Nov 6, 2018
vladimirdolzhenko added a commit that referenced this issue Nov 6, 2018
@vladimirdolzhenko
Copy link
Contributor

#29293 is a proper fix for 7.0 already applied.

#35257 is a fix for 6.x branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. help wanted adoptme
Projects
None yet
Development

No branches or pull requests

5 participants