[DOCS] Adds sync to data frame transform API #44254

Merged
merged 3 commits into from
Jul 17, 2019

Conversation

lcawl
Contributor

@lcawl lcawl commented Jul 12, 2019

This PR adds the sync property to the create data frame transform API reference (https://www.elastic.co/guide/en/elasticsearch/reference/master/put-data-frame-transform.html)

@elasticmachine
Collaborator

Pinging @elastic/es-docs

@elasticmachine
Collaborator

Pinging @elastic/ml-core


`sync`::
(Optional, object) Defines the properties required to run continuously.
`field`:::

There is another level between sync and field/delay: it's "sync" -> "time" -> "field"/"delay" (see your example below).

sync defines continuous mode; time defines that the sync method uses a timestamp field in the data. In the future we will have other ways to sync source and dest, e.g. for non-time-series data.
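
For reference, the nesting described in this comment could be sketched as follows. This is only an illustration of the "sync" -> "time" -> "field"/"delay" shape; the field name `timestamp` and the `120s` delay are placeholder values chosen for the example, and `build_sync` is a hypothetical helper, not part of any client library.

```python
import json


def build_sync(field, delay):
    """Return a sync fragment nested as sync -> time -> field/delay.

    Hypothetical helper for illustration only.
    """
    return {"sync": {"time": {"field": field, "delay": delay}}}


# "timestamp" and "120s" are assumed placeholder values.
body = build_sync("timestamp", "120s")
print(json.dumps(body, indent=2))
```

Note that `field` and `delay` are siblings under `time`, not direct children of `sync`.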

Contributor Author

Thanks, I've drafted a description of that time property.

(Required, string) The date field that is used to identify new documents in
the source.
`delay`:::
(Optional, time units) The time delay between the current time and the

We identified yesterday that we need a default value for delay. I will update this comment once we have agreed on a value.

@sophiec20
Contributor

Adding previous discussion notes for visibility.

Please highlight that we recommend selecting a field that represents the time of ingest as your time field. We use this field to check which entities have changed since the last time we checked, so with an ingest time field we can easily find what has changed. If you use a different time field, e.g. time of event, you may need to set delay to a value large enough to cover the time it takes for data to reach the system. Here is an example of using an ingest pipeline to set an ingest timestamp: https://www.elastic.co/guide/en/elasticsearch/reference/7.2/accessing-data-in-pipelines.html#accessing-ingest-metadata

The delay default is 60s.
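
To make the recommendation above concrete, here is a rough sketch of the two pieces side by side: an ingest pipeline body that stamps each document with its ingest time, and a sync fragment that points at that field. The field name `ingest_time` and the pipeline description are assumptions made for the example; the `set` processor with `{{_ingest.timestamp}}` follows the linked ingest metadata docs, and `60s` is the delay default stated above.

```python
import json

# Hypothetical field name; any date field populated at ingest would do.
INGEST_FIELD = "ingest_time"

# Ingest pipeline body: a set processor copies the ingest timestamp
# metadata into the document.
pipeline = {
    "description": "Stamp each document with its ingest time",
    "processors": [
        {"set": {"field": INGEST_FIELD, "value": "{{_ingest.timestamp}}"}}
    ],
}

# Sync fragment for the transform, using the same field.
# "60s" is written out explicitly here and matches the stated default.
sync = {"time": {"field": INGEST_FIELD, "delay": "60s"}}

print(json.dumps({"pipeline": pipeline, "sync": sync}, indent=2))
```

With an event-time field instead of an ingest-time field, the delay would likely need to be larger than the default to cover the time data takes to arrive.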

@lcawl lcawl marked this pull request as ready for review July 16, 2019 23:47
@lcawl lcawl removed the WIP label Jul 16, 2019
@lcawl
Contributor Author

lcawl commented Jul 16, 2019

Thanks for the feedback @hendrikmuhs and @sophiec20

Contributor

@sophiec20 sophiec20 left a comment

LGTM

@lcawl lcawl merged commit 4e75fa3 into elastic:master Jul 17, 2019
@lcawl lcawl deleted the df-sync branch July 17, 2019 15:55
lcawl added a commit to lcawl/elasticsearch that referenced this pull request Jul 17, 2019
@jpountz jpountz added v7.3.0 and removed v7.3.1 labels Jul 26, 2019