[ML] adding delayed_data_check_config to datafeed update docs #42095

Merged: 2 commits, May 28, 2019
30 changes: 16 additions & 14 deletions docs/reference/ml/apis/datafeedresource.asciidoc
@@ -61,12 +61,12 @@ A {dfeed} resource has the following properties:

 `delayed_data_check_config`::
 (object) Specifies whether the data feed checks for missing data and
-and the size of the window. For example:
+the size of the window. For example:
 `{"enabled": true, "check_window": "1h"}` See
 <<ml-datafeed-delayed-data-check-config>>.

 [[ml-datafeed-chunking-config]]
-==== Chunking Configuration Objects
+==== Chunking configuration objects

 {dfeeds-cap} might be required to search over long time periods, for several months
 or years. This search is split into time chunks in order to ensure the load
@@ -88,31 +88,33 @@ A chunking configuration object has the following properties:
 For example: `3h`.

 [[ml-datafeed-delayed-data-check-config]]
-==== Delayed Data Check Configuration Objects
+==== Delayed data check configuration objects

 The {dfeed} can optionally search over indices that have already been read in
-an effort to find if any data has since been added to the index. If missing data
-is found, it is a good indication that the `query_delay` option is set too low and
-the data is being indexed after the {dfeed} has passed that moment in time. See
+an effort to determine whether any data has subsequently been added to the index.
+If missing data is found, it is a good indication that the `query_delay` option
+is set too low and the data is being indexed after the {dfeed} has passed that
+moment in time. See
 {stack-ov}/ml-delayed-data-detection.html[Working with delayed data].

-This check only runs on real-time {dfeeds}
+This check runs only on real-time {dfeeds}.

 The configuration object has the following properties:

 `enabled`::
-(boolean) Should the {dfeed} periodically check for data being indexed after reading.
-Defaults to `true`
+(boolean) Specifies whether the {dfeed} periodically checks for delayed data.
+Defaults to `true`.

 `check_window`::
-(time units) The window of time before the latest finalized bucket that should be searched
-for late data. Defaults to `null` which causes an appropriate `check_window` to be calculated
-when the real-time {dfeed} runs.
-The default `check_window` span calculation is the max between `2h` or `8 * bucket_span`.
+(time units) The window of time that is searched for late data. This window of
+time ends with the latest finalized bucket. It defaults to `null`, which
+causes an appropriate `check_window` to be calculated when the real-time
+{dfeed} runs. In particular, the default `check_window` span calculation is
+based on the maximum of `2h` or `8 * bucket_span`.

 [float]
 [[ml-datafeed-counts]]
-==== {dfeed-cap} Counts
+==== {dfeed-cap} counts

 The get {dfeed} statistics API provides information about the operational
 progress of a {dfeed}. All of these properties are informational; you cannot
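
Note on the `check_window` default in the hunk above: the docs say the default span is "based on the maximum of `2h` or `8 * bucket_span`". A minimal Python sketch of that arithmetic follows. The function name `default_check_window` is hypothetical and not part of {es}, and the real calculation may apply further adjustments that this diff does not describe.

```python
from datetime import timedelta

# Hypothetical helper: mirrors only the rule quoted in the docs,
# max(2h, 8 * bucket_span). It is not the actual Elasticsearch code.
MIN_DEFAULT_WINDOW = timedelta(hours=2)
BUCKET_SPAN_MULTIPLIER = 8


def default_check_window(bucket_span: timedelta) -> timedelta:
    """Return the larger of 2 hours and 8 bucket spans."""
    return max(MIN_DEFAULT_WINDOW, BUCKET_SPAN_MULTIPLIER * bucket_span)


if __name__ == "__main__":
    for span in (timedelta(minutes=5), timedelta(minutes=15), timedelta(hours=1)):
        print(span, "->", default_check_window(span))
```

Under this reading, the `2h` floor applies to any bucket span of 15 minutes or less, and longer bucket spans widen the window proportionally.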
9 changes: 5 additions & 4 deletions docs/reference/ml/apis/put-datafeed.asciidoc
@@ -45,6 +45,11 @@ IMPORTANT: You must use {kib} or this API to create a {dfeed}. Do not put a {df
 (object) Specifies how data searches are split into time chunks.
 See <<ml-datafeed-chunking-config>>.

+`delayed_data_check_config`::
+(object) Specifies whether the data feed checks for missing data and
+the size of the window. See
+<<ml-datafeed-delayed-data-check-config>>.
+
 `frequency`::
 (time units) The interval at which scheduled queries are made while the {dfeed}
 runs in real time. The default value is either the bucket span for short
@@ -82,10 +87,6 @@ IMPORTANT: You must use {kib} or this API to create a {dfeed}. Do not put a {df
 (unsigned integer) The `size` parameter that is used in {es} searches.
 The default value is `1000`.

-`delayed_data_check_config`::
-(object) Specifies if and with how large a window should the data feed check
-for missing data. See <<ml-datafeed-delayed-data-check-config>>.
-
 For more information about these properties,
 see <<ml-datafeed-resource>>.
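
Note on the `delayed_data_check_config` object documented in this PR: it has two properties, `enabled` (boolean, defaults to `true`) and `check_window` (time units, defaults to `null`). The Python sketch below builds such an object on the client side. The helper name and the regular expression are assumptions made for illustration; the pattern is a simplified check of {es}-style time values such as `1h` and is not the server's own validation.

```python
import re
from typing import Optional

# Assumption: a simplified pattern for time values like "1h" or "30m".
# Elasticsearch performs its own parsing; this is only a client-side guard.
_TIME_VALUE = re.compile(r"^\d+(nanos|micros|ms|s|m|h|d)$")


def delayed_data_check_config(enabled: bool = True,
                              check_window: Optional[str] = None) -> dict:
    """Build a delayed_data_check_config object (hypothetical helper)."""
    config = {"enabled": enabled}
    if check_window is not None:
        if not _TIME_VALUE.match(check_window):
            raise ValueError(f"not a time value: {check_window!r}")
        config["check_window"] = check_window
    # Leaving check_window out relies on the documented null default,
    # which lets the datafeed calculate an appropriate window itself.
    return config


if __name__ == "__main__":
    print(delayed_data_check_config(check_window="1h"))
```

Omitting `check_window` rather than sending an explicit value is a design choice in this sketch: it keeps the documented default behavior unless the caller asks for a specific window.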

9 changes: 8 additions & 1 deletion docs/reference/ml/apis/update-datafeed.asciidoc
@@ -14,7 +14,10 @@ Updates certain properties of a {dfeed}.

 `POST _ml/datafeeds/<feed_id>/_update`

-//===== Description
+===== Description
+
+NOTE: If you update the `delayed_data_check_config` property, you must stop and
+start the {dfeed} for the change to be applied.

 ==== Path Parameters

@@ -32,6 +35,10 @@ The following properties can be updated after the {dfeed} is created:
 `chunking_config`::
 (object) Specifies how data searches are split into time chunks.
 See <<ml-datafeed-chunking-config>>.
+
+`delayed_data_check_config`::
+(object) Specifies whether the data feed checks for missing data and
+the size of the window. See <<ml-datafeed-delayed-data-check-config>>.

 `frequency`::
 (time units) The interval at which scheduled queries are made while the
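
Note on the update flow: the NOTE added in this file says a change to `delayed_data_check_config` is applied only after the {dfeed} is stopped and started. The Python sketch below lists one request sequence that would satisfy that. Only the `_update` path appears in this diff; the `_stop` and `_start` paths are assumed from the {es} {ml} datafeed APIs, and the function name is hypothetical. The sketch builds the requests without sending them.

```python
import json
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[str]]  # (method, path, JSON body or None)


def update_delayed_data_check_requests(feed_id: str,
                                       enabled: bool,
                                       check_window: Optional[str] = None
                                       ) -> List[Request]:
    """Describe a stop, update, start sequence for one datafeed."""
    config = {"enabled": enabled}
    if check_window is not None:
        config["check_window"] = check_window
    body = json.dumps({"delayed_data_check_config": config})

    base = f"_ml/datafeeds/{feed_id}"
    return [
        ("POST", f"{base}/_stop", None),    # assumed endpoint, not in this diff
        ("POST", f"{base}/_update", body),  # endpoint documented in this file
        ("POST", f"{base}/_start", None),   # assumed endpoint, not in this diff
    ]


if __name__ == "__main__":
    for method, path, payload in update_delayed_data_check_requests(
            "my-feed", enabled=True, check_window="2h"):
        print(method, path, payload or "")
```

The NOTE only requires that a stop and a start happen for the change to be applied; stopping before the update is one ordering among those that would work, chosen here so the datafeed never runs with a pending configuration.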