Alias field type is not handled well in DF analytics. #50787

Closed
przemekwitek opened this issue Jan 9, 2020 · 2 comments
Labels
>bug :ml Machine learning

Comments

@przemekwitek
Contributor

A QA regression has been found following the recent change (#50219) that copies the mapping type from the dependent variable to the prediction field.

Wei posted the bug description:
The failure is related to how DFA handles field aliases. The failing test runs an analytics job against an alias field. It passed on master before Dec 11 and recently started failing with this error:

"failure_reason" : """[dfa_wine_quality_red_alias_1578498260_000_0] Failed to join results: failures while writing results [failure in bulk execution:
[0]: index [dest_wine_quality_red_alias_1578516260980], id [-BZHR2wB9mzBfTtIWLPc], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]

This is the job configuration:

{
  "id": "dfa_breast-cancer-alias_1578499743_000_0",
  "source": {
    "index": [
      "breast-cancer-alias"
    ],
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "dest_breast_cancer_alias_1578517743585",
    "results_field": "ml"
  },
  "analysis": {
    "classification": {
      "dependent_variable": "class_alias",
      "num_top_classes": 2,
      "prediction_field_name": "class_alias_prediction",
      "training_percent": 100,
      "randomize_seed": 4381108523829301000
    }
  },
  "analyzed_fields": {
    "includes": [],
    "excludes": [
      "class",
      "breast-quad"
    ]
  },
  "model_memory_limit": "1gb",
  "create_time": 1578517744361,
  "version": "8.0.0",
  "allow_lazy_start": false
}

This is the mapping of breast-cancer-alias index:

    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "keyword"
        },
        "breast" : {
          "type" : "keyword"
        },
        "breast-quad" : {
          "type" : "keyword"
        },
        "breast-quad_alias" : {
          "type" : "alias",
          "path" : "breast-quad"
        },
        "class" : {
          "type" : "keyword"
        },
        "class_alias" : {
          "type" : "alias",
          "path" : "class"
        },
        "deg-malig" : {
          "type" : "long"
        },
        "inv-nodes" : {
          "type" : "keyword"
        },
        "irradiat" : {
          "type" : "keyword"
        },
        "menopause" : {
          "type" : "keyword"
        },
        "node-caps" : {
          "type" : "keyword"
        },
        "tumor-size" : {
          "type" : "keyword"
        }
      }
    },
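
As an illustration of the mapping above: `class_alias` has type `alias` and points at `class`, so copying its mapping type verbatim would give the prediction field the type `alias` rather than `keyword`. Elasticsearch does not support writes to alias fields, which is presumably why the bulk write of results fails to parse. The sketch below shows one way the concrete type could be resolved before copying. It is a minimal Python illustration, not the Elasticsearch implementation and not necessarily the fix that was adopted. The helper name `resolve_field_mapping` is hypothetical, and it assumes a flat `properties` dict (no nested object paths).

```python
from typing import Any, Dict


def resolve_field_mapping(
    properties: Dict[str, Dict[str, Any]], field: str
) -> Dict[str, Any]:
    """Return the mapping of `field`, following an alias to its target.

    Hypothetical helper for illustration only. Assumes top-level fields;
    dotted paths into object fields are not handled here.
    """
    mapping = properties[field]
    if mapping.get("type") == "alias":
        # An alias mapping names its concrete target field in "path".
        mapping = properties[mapping["path"]]
    return mapping


# Trimmed copy of the breast-cancer-alias mapping shown above.
properties = {
    "class": {"type": "keyword"},
    "class_alias": {"type": "alias", "path": "class"},
}

# Mapping to copy to the prediction field: the target's type, not "alias".
prediction_mapping = resolve_field_mapping(properties, "class_alias")
print(prediction_mapping)  # {'type': 'keyword'}
```

Because an alias must point at a concrete field rather than another alias, a single lookup is enough in this sketch.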
@przemekwitek added >bug :ml Machine learning labels Jan 9, 2020
@przemekwitek self-assigned this Jan 9, 2020
@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@przemekwitek
Contributor Author

I was able to reproduce the issue locally using an integration test.
The failure is:

  2> java.lang.AssertionError: 
    Expected: is null
         but: was "[dependent_variable_of_type_alias] Failed to join results: failures while writing results [failure in bulk execution:\n[0]: index [dependent_variable_of_type_alias_source_index_results], id [h5oMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[1]: index [dependent_variable_of_type_alias_source_index_results], id [hpoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[2]: index [dependent_variable_of_type_alias_source_index_results], id [i5oMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[3]: index [dependent_variable_of_type_alias_source_index_results], id [iJoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[4]: index [dependent_variable_of_type_alias_source_index_results], id [iZoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[5]: index [dependent_variable_of_type_alias_source_index_results], id [ipoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[6]: index [dependent_variable_of_type_alias_source_index_results], id [j5oMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[7]: index [dependent_variable_of_type_alias_source_index_results], id [jJoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[8]: index [dependent_variable_of_type_alias_source_index_results], id [jZoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[9]: index [dependent_variable_of_type_alias_source_index_results], id [jpoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]\n[10]: index [dependent_variable_of_type_alias_source_index_results], id 
[kJoMim8BRLpRVR6aNCHK], message [org.elasticsearch.index.mapper.MapperParsingException: failed to parse]]"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:956)
        at org.junit.Assert.assertThat(Assert.java:923)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.assertIsStopped(MlNativeDataFrameAnalyticsIntegTestCase.java:188)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.lambda$waitUntilAnalyticsIsStopped$0(MlNativeDataFrameAnalyticsIntegTestCase.java:139)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:879)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.waitUntilAnalyticsIsStopped(MlNativeDataFrameAnalyticsIntegTestCase.java:139)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.waitUntilAnalyticsIsStopped(MlNativeDataFrameAnalyticsIntegTestCase.java:135)
        at org.elasticsearch.xpack.ml.integration.ClassificationIT.testDependentVariableOfTypeAlias(ClassificationIT.java:357)

    java.lang.RuntimeException: Had to resort to force-stopping jobs, something went wrong?
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.stopAnalyticsAndForceStopOnError(MlNativeDataFrameAnalyticsIntegTestCase.java:98)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.cleanUpAnalytics(MlNativeDataFrameAnalyticsIntegTestCase.java:76)
        at org.elasticsearch.xpack.ml.integration.MlNativeDataFrameAnalyticsIntegTestCase.cleanUpResources(MlNativeDataFrameAnalyticsIntegTestCase.java:72)
        at org.elasticsearch.xpack.ml.integration.MlNativeIntegTestCase.cleanUp(MlNativeIntegTestCase.java:128)
        at org.elasticsearch.xpack.ml.integration.ClassificationIT.cleanup(ClassificationIT.java:79)

        Caused by:
        org.elasticsearch.ElasticsearchStatusException: cannot close data frame analytics [dependent_variable_of_type_alias] because it failed, use force stop instead
