[Transform] search failures due to script or mapping errors are not handled #48467

Closed
hendrikmuhs opened this issue Oct 24, 2019 · 1 comment · Fixed by #48887
@hendrikmuhs

If a transform task fails in the search phase due to a mapping conflict or a scripting error, the error is handled as a temporary search problem: the search is retried (10 times) and the task is eventually put into the FAILED state with the reason "task encountered more than 10 failures; latest failure: Partial shards failure". The audit log only contains "Partial shards failure".

The real issue can only be found in the logs, e.g.

Caused by: org.elasticsearch.ElasticsearchException$1: Fielddata is disabled on text fields by default. Set fielddata=true on [...] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.

or

org.elasticsearch.script.ScriptException: runtime error
...
Caused by: java.lang.IllegalArgumentException: No field found for [field_b] in mapping

Solution

We need to unwrap search failures and check for inner problems:

  • do not retry if it turns out to be an irrecoverable error (as we already do for other errors of this kind)
  • report the real error as the reason in _stats and as an audit message
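
The unwrapping described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the Elasticsearch implementation: the exception classes `SearchPhaseError`, `ScriptError` and `MappingError` and the helpers `unwrap` and `classify` are hypothetical stand-ins for the wrapper and inner failures quoted from the logs.

```python
class SearchPhaseError(Exception):
    """Hypothetical stand-in for the outer 'Partial shards failure' wrapper."""


class ScriptError(Exception):
    """Hypothetical stand-in for a script runtime error."""


class MappingError(Exception):
    """Hypothetical stand-in for a mapping problem such as disabled fielddata."""


# Assumption: these inner failures cannot be fixed by retrying the search.
IRRECOVERABLE = (ScriptError, MappingError)


def unwrap(exc):
    """Yield exc and each nested cause, outermost first."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        yield exc
        exc = exc.__cause__ or exc.__context__


def classify(exc):
    """Return (retry, reason) for a failed search."""
    chain = list(unwrap(exc))
    for inner in chain:
        if isinstance(inner, IRRECOVERABLE):
            root = chain[-1]
            reason = f"{type(inner).__name__}: {inner}"
            if root is not inner:
                reason += f"; root cause: {root}"
            # Fail immediately and surface the inner error as the reason.
            return False, reason
    # Nothing irrecoverable found: treat as a temporary search problem.
    return True, str(exc)


if __name__ == "__main__":
    try:
        try:
            raise ScriptError("runtime error") from ValueError(
                "No field found for [field_b] in mapping"
            )
        except ScriptError as inner:
            raise SearchPhaseError("Partial shards failure") from inner
    except SearchPhaseError as outer:
        print(classify(outer))
```

With this shape, the string returned as `reason` is what would be reported in _stats and in the audit message instead of the generic wrapper text.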

Repro

Case 1

  • create 2 indexes with 2 fields, use keyword fields:
    • field_a, field_b
    • field_a, field_c
  • create a transform that groups by field_a and uses a scripted metric aggregation that accesses field_b without a guard:
      "scripted_metric": {
        "init_script": "state.b = new String()",
        "map_script": "state.b = doc['field_b']",
        "combine_script": "return state.b",
        "reduce_script": "return states"
      }

The transform should fail with a ScriptException.
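
For reference, the Case 1 setup might look like the request bodies below. This is a sketch only: it assumes the typeless mapping format and the pivot layout of the transform API, and the index names (`repro_1`, `repro_2`, `repro_dest`) and the aggregation name `b_values` are hypothetical, since the issue does not name them. The code only builds and prints the bodies; it does not contact a cluster.

```python
import json


def keyword_mapping(*fields):
    """Build an index-creation body that maps every given field as keyword."""
    return {"mappings": {"properties": {name: {"type": "keyword"} for name in fields}}}


# Hypothetical index names; the second index deliberately lacks field_b.
indices = {
    "repro_1": keyword_mapping("field_a", "field_b"),
    "repro_2": keyword_mapping("field_a", "field_c"),
}

# Transform body: group by field_a, scripted metric reads field_b unguarded.
transform = {
    "source": {"index": list(indices)},
    "dest": {"index": "repro_dest"},
    "pivot": {
        "group_by": {"field_a": {"terms": {"field": "field_a"}}},
        "aggregations": {
            "b_values": {
                "scripted_metric": {
                    "init_script": "state.b = new String()",
                    "map_script": "state.b = doc['field_b']",
                    "combine_script": "return state.b",
                    "reduce_script": "return states",
                }
            }
        },
    },
}

if __name__ == "__main__":
    for name, body in indices.items():
        print(name, json.dumps(body))
    print(json.dumps(transform, indent=2))
```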

Case 2

  • create 2 indexes with 2 fields each, mapping field_a in the 2nd index as text:
    • field_a, field_b
    • field_a (mapped as text), field_b
  • create a transform, group by field_a

The transform should fail with an ElasticsearchException: Fielddata is disabled on text fields by default.
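
The conflicting mappings for Case 2 might be sketched like this. The index names are hypothetical, and mapping the remaining fields as keyword is an assumption (the issue only states that field_a is text in the second index). The helper `types_of` is likewise hypothetical and only makes the conflict visible; the pivot's aggregation is left out because the issue does not specify one.

```python
def mapping(**field_types):
    """Build an index-creation body from field name to field type."""
    return {"mappings": {"properties": {n: {"type": t} for n, t in field_types.items()}}}


# Assumption: every field is keyword except field_a in the second index.
case2_indices = {
    "repro_1": mapping(field_a="keyword", field_b="keyword"),
    "repro_2": mapping(field_a="text", field_b="keyword"),
}

# The group_by part of the transform; terms on a text field needs fielddata.
case2_group_by = {"field_a": {"terms": {"field": "field_a"}}}


def types_of(field, indices):
    """Collect the distinct mapped types of a field across all indices."""
    return {
        body["mappings"]["properties"][field]["type"]
        for body in indices.values()
        if field in body["mappings"]["properties"]
    }


if __name__ == "__main__":
    # More than one type for the group_by field indicates the conflict.
    print(sorted(types_of("field_a", case2_indices)))
```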

/CC @tsg

@elasticmachine

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs hendrikmuhs self-assigned this Nov 11, 2019
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this issue Nov 15, 2019
hendrikmuhs pushed a commit that referenced this issue Nov 18, 2019
Improve error handling for script errors by treating them as irrecoverable errors, which puts the task
immediately into the failed state. Also improve the error extraction to properly report the script
error.

fixes #48467