[DOCS] Add how-to guide for time series data #71195


Merged: 23 commits, Apr 5, 2021
13 changes: 10 additions & 3 deletions docs/reference/data-streams/set-up-a-data-stream.asciidoc
@@ -29,6 +29,7 @@ To create an index lifecycle policy in {kib}, open the main menu and go to

You can also use the <<ilm-put-lifecycle,create lifecycle policy API>>.

// tag::ilm-policy-api-ex[]
[source,console]
----
PUT _ilm/policy/my-lifecycle-policy
@@ -38,7 +39,6 @@ PUT _ilm/policy/my-lifecycle-policy
"hot": {
"actions": {
"rollover": {
"max_age": "30d",
"max_primary_shard_size": "50gb"
}
}
@@ -58,15 +58,15 @@ PUT _ilm/policy/my-lifecycle-policy
"min_age": "60d",
"actions": {
"searchable_snapshot": {
"snapshot_repository": "my-snapshot-repo"
"snapshot_repository": "found-snapshots"
}
}
},
"frozen": {
"min_age": "90d",
"actions": {
"searchable_snapshot": {
"snapshot_repository": "my-snapshot-repo"
"snapshot_repository": "found-snapshots"
}
}
},
@@ -80,11 +80,13 @@ PUT _ilm/policy/my-lifecycle-policy
}
}
----
// end::ilm-policy-api-ex[]

[discrete]
[[create-component-templates]]
=== Step 2. Create component templates

// tag::ds-create-component-templates[]
A data stream requires a matching index template. In most cases, you compose
this index template using one or more component templates. You typically use
separate component templates for mappings and index settings. This lets you
@@ -156,11 +158,13 @@ PUT _component_template/my-settings
}
----
// TEST[continued]
// end::ds-create-component-templates[]

[discrete]
[[create-index-template]]
=== Step 3. Create an index template

// tag::ds-create-index-template[]
Use your component templates to create an index template. Specify:

* One or more index patterns that match the data stream's name. We recommend
@@ -196,11 +200,13 @@ PUT _index_template/my-index-template
}
----
// TEST[continued]
// end::ds-create-index-template[]

[discrete]
[[create-data-stream]]
=== Step 4. Create the data stream

// tag::ds-create-data-stream[]
<<add-documents-to-a-data-stream,Indexing requests>> add documents to a data
stream. These requests must use an `op_type` of `create`. Documents must include
a `@timestamp` field.
@@ -224,6 +230,7 @@ POST my-data-stream/_doc
}
----
// TEST[continued]
// end::ds-create-data-stream[]

You can also manually create the stream using the
<<indices-create-data-stream,create data stream API>>. The stream's name must
4 changes: 3 additions & 1 deletion docs/reference/how-to.asciidoc
@@ -25,4 +25,6 @@ include::how-to/search-speed.asciidoc[]

include::how-to/disk-usage.asciidoc[]

include::how-to/size-your-shards.asciidoc[]
include::how-to/size-your-shards.asciidoc[]

include::how-to/use-elasticsearch-for-time-series-data.asciidoc[]
213 changes: 213 additions & 0 deletions docs/reference/how-to/use-elasticsearch-for-time-series-data.asciidoc
@@ -0,0 +1,213 @@
[[use-elasticsearch-for-time-series-data]]
== Use {es} for time series data

{es} offers features to help you store, manage, and search time series data,
such as logs and metrics. Once in {es}, you can analyze and visualize your data
using {kib} and other {stack} features.

To get the most out of your time series data in {es}, follow these steps:

* <<set-up-data-tiers>>
* <<register-snapshot-repository>>
* <<create-edit-index-lifecycle-policy>>
* <<create-ts-component-templates>>
* <<create-ts-index-template>>
* <<add-data-to-data-stream>>
* <<search-visualize-your-data>>


[discrete]
[[set-up-data-tiers]]
=== Step 1. Set up data tiers

{es}'s <<index-lifecycle-management,{ilm-init}>> feature uses <<data-tiers,data
tiers>> to automatically move older data to nodes with less expensive hardware
as it ages. This helps improve performance and reduce storage costs.

The hot tier is required. The warm, cold, and frozen tiers are optional. Use
high-performance nodes in the hot and warm tiers for faster indexing and faster
searches on your most recent data. Use slower, less expensive nodes in the cold
and frozen tiers to reduce costs.
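
On a self-managed cluster, for example, a node joins a tier through its
`node.roles` setting. A minimal `elasticsearch.yml` sketch (the node name and
role mix are illustrative, not a recommendation):

[source,yaml]
----
# A dedicated hot-tier node for indexing and searches on recent data
node.name: hot-node-1
node.roles: [ data_hot, data_content ]
----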

The steps for setting up data tiers vary based on your deployment type:

include::{es-repo-dir}/tab-widgets/code.asciidoc[]
include::{es-repo-dir}/tab-widgets/data-tiers-widget.asciidoc[]

[discrete]
[[register-snapshot-repository]]
=== Step 2. Register a snapshot repository

The cold and frozen tiers can use <<searchable-snapshots,{search-snaps}>> to
reduce local storage costs.
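
As one sketch, registering an AWS S3 repository with the create snapshot
repository API might look like the following (the repository and bucket names
are illustrative, and your deployment must have S3 repository support
available):

[source,console]
----
PUT _snapshot/my-snapshot-repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-ts-snapshots"
  }
}
----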

To use {search-snaps}, you must register a supported snapshot repository. The
steps for registering this repository vary based on your deployment type and
storage provider:

include::{es-repo-dir}/tab-widgets/snapshot-repo-widget.asciidoc[]

[discrete]
[[create-edit-index-lifecycle-policy]]
=== Step 3. Create or edit an index lifecycle policy

A <<data-streams,data stream>> stores your data across multiple backing
indices. {ilm-init} uses an <<ilm-index-lifecycle,index lifecycle policy>> to
automatically move these indices through your data tiers.

If you use {fleet} or {agent}, edit one of {es}'s built-in lifecycle policies.
If you use a custom application, create your own policy. In either case,
ensure your policy:

* Includes a phase for each data tier you've configured.
* Calculates the threshold, or `min_age`, for phase transitions from rollover.
* Uses {search-snaps} in the cold and frozen phases, if desired.
* Includes a delete phase, if needed.
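
As a minimal sketch, a custom policy meeting these requirements might pair a
hot phase that rolls over with a delete phase (the policy name, shard size,
and ages here are illustrative):

[source,console]
----
PUT _ilm/policy/my-custom-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----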

include::{es-repo-dir}/tab-widgets/ilm-widget.asciidoc[]

[discrete]
[[create-ts-component-templates]]
=== Step 4. Create component templates

TIP: If you use {fleet} or {agent}, skip to <<search-visualize-your-data>>.
{fleet} and {agent} use built-in templates to create data streams for you.

If you use a custom application, you need to set up your own data stream.
include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-component-templates]

[discrete]
[[create-ts-index-template]]
=== Step 5. Create an index template

include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-index-template]

[discrete]
[[add-data-to-data-stream]]
=== Step 6. Add data to a data stream

include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-data-stream]

[discrete]
[[search-visualize-your-data]]
=== Step 7. Search and visualize your data

To explore and search your data in {kib}, open the main menu and select
**Discover**. See {kib}'s {kibana-ref}/discover.html[Discover documentation].

Use {kib}'s **Dashboard** feature to visualize your data in a chart, table, map,
and more. See {kib}'s {kibana-ref}/dashboard.html[Dashboard documentation].

You can also search and aggregate your data using the <<search-search,search
API>>. Use <<runtime-search-request,runtime fields>> and <<grok-basics,grok
patterns>> to dynamically extract data from log messages and other unstructured
content at search time.

[source,console]
----
GET my-data-stream/_search
{
"runtime_mappings": {
"source.ip": {
"type": "ip",
"script": """
String sourceip=grok('%{IPORHOST:sourceip} .*').extract(doc[ "message" ].value)?.sourceip;
if (sourceip != null) emit(sourceip);
"""
}
},
"query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gte": "now-1d/d",
"lt": "now/d"
}
}
},
{
"range": {
"source.ip": {
"gte": "192.0.2.0",
"lte": "192.0.2.255"
}
}
}
]
}
},
"fields": [
"*"
],
"_source": false,
"sort": [
{
"@timestamp": "desc"
},
{
"source.ip": "desc"
}
]
}
----
// TEST[setup:my_data_stream]
// TEST[teardown:data_stream_cleanup]

{es} searches are synchronous by default. Searches across frozen data, long time
ranges, or large datasets may take longer. Use the <<submit-async-search,async
search API>> to run searches in the background. For more search options, see
<<search-your-data>>.

[source,console]
----
POST my-data-stream/_async_search
{
"runtime_mappings": {
"source.ip": {
"type": "ip",
"script": """
String sourceip=grok('%{IPORHOST:sourceip} .*').extract(doc[ "message" ].value)?.sourceip;
if (sourceip != null) emit(sourceip);
"""
}
},
"query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gte": "now-2y/d",
"lt": "now/d"
}
}
},
{
"range": {
"source.ip": {
"gte": "192.0.2.0",
"lte": "192.0.2.255"
}
}
}
]
}
},
"fields": [
"*"
],
"_source": false,
"sort": [
{
"@timestamp": "desc"
},
{
"source.ip": "desc"
}
]
}
----
// TEST[setup:my_data_stream]
// TEST[teardown:data_stream_cleanup]
38 changes: 8 additions & 30 deletions docs/reference/searchable-snapshots/index.asciidoc
@@ -72,7 +72,8 @@ For more complex or time-consuming searches, you can use <<async-search>> with
====

[[searchable-snapshots-repository-types]]
You can use any of the following repository types with searchable snapshots:
// tag::searchable-snapshot-repo-types[]
Use any of the following repository types with searchable snapshots:

* {plugins}/repository-s3.html[AWS S3]
* {plugins}/repository-gcs.html[Google Cloud Storage]
@@ -83,8 +84,9 @@ You can use any of the following repository types with searchable snapshots:
You can also use alternative implementations of these repository types, for
instance
{plugins}/repository-s3-client.html#repository-s3-compatible-services[Minio],
as long as they are fully compatible. You can use the <<repo-analysis-api>> API
as long as they are fully compatible. Use the <<repo-analysis-api>> API
to analyze your repository's suitability for use with searchable snapshots.
// end::searchable-snapshot-repo-types[]
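
For instance, a small analysis run against a registered repository might look
like this (the repository name and parameter values are illustrative):

[source,console]
----
POST /_snapshot/my-snapshot-repo/_analyze?blob_count=10&max_blob_size=1mb
----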

[discrete]
[[how-searchable-snapshots-work]]
@@ -219,31 +221,7 @@ repository storage then you are responsible for its reliability.
[[searchable-snapshots-frozen-tier-on-cloud]]
=== Configure a frozen tier on {ess}

The frozen data tier is not yet available on {ess-trial}[{ess}]. However,
you can configure another tier to use <<shared-cache,shared snapshot caches>>.
This effectively recreates a frozen tier in your deployment. Follow these
steps:

. Choose an existing tier to use. Typically, you'll use the cold tier, but the
hot and warm tiers are also supported. You can use this tier as a shared tier,
or you can dedicate the tier exclusively to shared snapshot caches.

. Log in to the {ess-trial}[{ess} Console].

. Select your deployment from the {ess} home page or the deployments page.

. From your deployment menu, select **Edit deployment**.

. On the **Edit** page, click **Edit elasticsearch.yml** under your selected
{es} tier.

. In the `elasticsearch.yml` file, add the
<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
setting. For example:
+
[source,yaml]
----
xpack.searchable.snapshot.shared_cache.size: 50GB
----

. Click **Save** and **Confirm** to apply your configuration changes.
The frozen data tier is not yet available on {ess-trial}[{ess}]. However, you
can configure another tier to use <<shared-cache,shared snapshot caches>>. This
effectively recreates a frozen tier in your deployment. See
<<set-up-data-tiers,Set up data tiers>>.