Skip to content

[ML] Remove the undocumented "delimited" format for post_data #74188

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

droberts195
Copy link
Contributor

The data_description of anomaly detection jobs used to accept
delimited data, although this was never documented.

This change removes the delimited option from the data_description,
and the associated functionality in post_data that handled it.

This is not a breaking change because it's removing functionality
that officially never existed. However, just in case somebody
was using it it is only removed from 8.0 and higher, so that at
least they won't find out during a patch install.

The data_description of anomaly detection jobs used to accept
delimited data, although this was never documented.

This change removes the delimited option from the data_description,
and the associated functionality in post_data that handled it.

This is not a breaking change because it's removing functionality
that officially never existed.  However, just in case somebody
was using it it is only removed from 8.0 and higher, so that at
least they won't find out during a patch install.
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Jun 16, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love seeing all the code go.

I just want to triple check: Users who use POST data have been getting deprecation warnings about the CSV format in 7.x. So, they are prepared for this change. Yeah, its undocumented, but folks could be doing all sorts of things.

@droberts195
Copy link
Contributor Author

I just want to triple check: Users who use POST data have been getting deprecation warnings about the CSV format in 7.x. So, they are prepared for this change. Yeah, its undocumented, but folks could be doing all sorts of things.

The deprecation warning has been here since 5.4:

if (job.getDataDescription() != null && job.getDataDescription().getFormat() == DataDescription.DataFormat.DELIMITED) {
deprecationLogger.deprecate(DeprecationCategory.API, "ml_create_job_delimited_data",
"Creating jobs with delimited data format is deprecated. Please use xcontent instead.");
}

So they were warned when they first created the job. The UI has never provided a way to create jobs that take delimited data, and it's never been documented.

If this was an experimental feature we'd be allowed to remove it. I think an undocumented feature carries even fewer guarantees than an experimental feature.

The other thing is, if somebody had been using a job by posting delimited data to it, we don't delete the job or its results on upgrade. They could continue to use it by switching to posting JSON data instead.

@droberts195 droberts195 added the Team:Clients Meta label for clients team label Jun 17, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/clients-team (Team:Clients)

@droberts195
Copy link
Contributor Author

@elastic/clients-team the change needed here is to remove field_delimiter from the DataDescription object, i.e. this line: https://github.com/elastic/elasticsearch-specification/blob/8b0e9ed594e3f10c1e36836222a3d3f0d87ed9f0/specification/ml/_types/Job.ts#L127

field_delimiter has never officially existed, but since the current clients include it it should probably remain available in the clients until 7.last then be removed in 8.0 to match what this PR does. quote_character (also undocumented from the start) never made it into the clients as it was never mentioned in a YAML test. I think it's best not to add that to 7.x.

@droberts195 droberts195 merged commit c9a6136 into elastic:master Jun 17, 2021
@droberts195 droberts195 deleted the remove_delimited_option_for_post_data branch June 17, 2021 14:30
droberts195 added a commit to elastic/kibana that referenced this pull request Jun 18, 2021
… Elasticsearch (#102506)

This a companion to elastic/elasticsearch#74188.

This PR is functionally a no-op, as the removed method
was not called anywhere. But it is sensible to remove
it to prevent it being called in the future now that it
references fields that don't exist in Elasticsearch.
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jun 18, 2021
… Elasticsearch (elastic#102506)

This a companion to elastic/elasticsearch#74188.

This PR is functionally a no-op, as the removed method
was not called anywhere. But it is sensible to remove
it to prevent it being called in the future now that it
references fields that don't exist in Elasticsearch.
kibanamachine added a commit to elastic/kibana that referenced this pull request Jun 18, 2021
… Elasticsearch (#102506) (#102620)

This a companion to elastic/elasticsearch#74188.

This PR is functionally a no-op, as the removed method
was not called anywhere. But it is sensible to remove
it to prevent it being called in the future now that it
references fields that don't exist in Elasticsearch.

Co-authored-by: David Roberts <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:ml Machine learning >non-issue Team:Clients Meta label for clients team Team:ML Meta label for the ML team v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants