@@ -1,55 +1,47 @@
[[searchable-snapshots]]
== {search-snaps-cap}

- {search-snaps-cap} let you reduce your operating costs by using
- <<snapshot-restore, snapshots>> for resiliency rather than maintaining
- <<scalability,replica shards>> within a cluster. When you mount an index from a
- snapshot as a {search-snap}, {es} copies the index shards to local storage
- within the cluster. This ensures that search performance is comparable to
- searching any other index, and minimizes the need to access the snapshot
- repository. Should a node fail, shards of a {search-snap} index are
- automatically recovered from the snapshot repository.
-
- This can result in significant cost savings for less frequently searched data.
- With {search-snaps}, you no longer need an extra index shard copy to avoid data
- loss, potentially halving the node local storage capacity necessary for
- searching that data. Because {search-snaps} rely on the same snapshot mechanism
- you use for backups, they have a minimal impact on your snapshot repository
- storage costs.
+ {search-snaps-cap} let you use <<snapshot-restore,snapshots>> to search
+ infrequently accessed and read-only data in a very cost-effective fashion. The
+ <<cold-tier,cold>> and <<frozen-tier,frozen>> data tiers use {search-snaps} to
+ reduce your storage and operating costs.
+
+ {search-snaps-cap} eliminate the need for <<scalability,replica shards>>,
+ potentially halving the local storage needed to search your data.
+ {search-snaps-cap} rely on the same snapshot mechanism you already use for
+ backups and have minimal impact on your snapshot repository storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.
- Search performance is comparable to regular indices because the shard data is
- copied onto nodes in the cluster when the {search-snap} is mounted.
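
As an illustration, with a hypothetical index name, a mounted index is queried
through the ordinary search API:

[source,console]
----
GET /my-mounted-index/_search
{
  "query": {
    "match_all": {}
  }
}
----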

By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience and the query volume is expected to be low enough that a
single shard copy will be sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.
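
For example, the following sketch adds one replica to a mounted index with the
update index settings API; the index name is hypothetical:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----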

- If a node fails and {search-snap} shards need to be restored from the snapshot,
- there is a brief window of time while {es} allocates the shards to other nodes
- where the cluster health will not be `green`. Searches that hit these shards
- will fail or return partial results until the shards are reallocated to healthy
- nodes.
+ If a node fails and {search-snap} shards need to be recovered elsewhere, there
+ is a brief window of time while {es} allocates the shards to other nodes where
+ the cluster health will not be `green`. Searches that hit these shards may fail
+ or return partial results until the shards are reallocated to healthy nodes.

You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
- a regular index into a {search-snap} index when it reaches the `cold` phase.
- You can also make indices in existing snapshots searchable by manually mounting
- them as {search-snap} indices with the
- <<searchable-snapshots-api-mount-snapshot, mount snapshot>> API.
+ a regular index into a {search-snap} index when it reaches the `cold` or
+ `frozen` phase. You can also make indices in existing snapshots searchable by
+ manually mounting them using the <<searchable-snapshots-api-mount-snapshot,
+ mount snapshot>> API.
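
As a sketch of a manual mount, assuming a repository named `my_repository`
that holds a snapshot `my_snapshot` of the index `my-index` (all hypothetical
names), the request might look like:

[source,console]
----
POST /_snapshot/my_repository/my_snapshot/_mount?wait_for_completion=true
{
  "index": "my-index"
}
----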

To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You should not delete a
snapshot if it has any mounted indices, so creating a clone enables you to
manage the lifecycle of the backup snapshot independently of any
- {search-snaps}.
+ {search-snaps}. If you use {ilm-init} to manage your {search-snaps} then it
+ will automatically look after cloning the snapshot as needed.
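
A manual clone of a single index out of a larger snapshot might look like the
following sketch, again with hypothetical repository, snapshot, and index
names:

[source,console]
----
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my-index"
}
----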

You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
@@ -60,7 +52,7 @@ We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, and the fewer segments there are the fewer reads are needed to restore
- the snapshot.
+ the snapshot or to respond to a search.
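
Before taking the snapshot, such a force merge might be requested as follows,
with a hypothetical index name:

[source,console]
----
POST /my-index/_forcemerge?max_num_segments=1
----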

[TIP]
====
@@ -84,35 +76,104 @@ You can use any of the following repository types with searchable snapshots:
You can also use alternative implementations of these repository types, for
instance
{plugins}/repository-s3-client.html#repository-s3-compatible-services[Minio],
- as long as they are fully compatible.
+ as long as they are fully compatible. You can use the <<repo-analysis-api>> API
+ to analyze your repository's suitability for use with searchable snapshots.
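
As a sketch, you could run a quick analysis of a hypothetical repository named
`my_repository` like this; `blob_count` and `max_blob_size` are optional
tuning parameters:

[source,console]
----
POST /_snapshot/my_repository/_analyze?blob_count=100&max_blob_size=10mb
----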

[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
- nodes within the cluster. The data nodes then automatically restore the shard
- data from the repository onto local storage. Once the restore process
- completes, these shards respond to searches using the data held in local
- storage and do not need to access the repository. This avoids incurring the
- cost or performance penalty associated with reading data from the repository.
-
- If a node holding one of these shards fails, {es} automatically allocates it to
- another node, and that node restores the shard data from the repository. No
- replicas are needed, and no complicated monitoring or orchestration is
- necessary to restore lost shards.
-
- {es} restores {search-snap} shards in the background and you can search them
- even if they have not been fully restored. If a search hits a {search-snap}
- shard before it has been fully restored, {es} eagerly retrieves the data needed
- for the search. If a shard is freshly allocated to a node and still warming up,
- some searches will be slower. However, searches typically access a very small
- fraction of the total shard data so the performance penalty is typically small.
-
- Replicas of {search-snaps} shards are restored by copying data from the
- snapshot repository. In contrast, replicas of regular indices are restored by
+ nodes within the cluster. The data nodes then automatically retrieve the
+ relevant shard data from the repository onto local storage, based on the
+ <<searchable-snapshot-mount-storage-options,mount options>> specified. If
+ possible, searches use data from local storage. If the data is not available
+ locally, {es} downloads the data that it needs from the snapshot repository.
+
+ If a node holding one of these shards fails, {es} automatically allocates the
+ affected shards on another node, and that node restores the relevant shard data
+ from the repository. No replicas are needed, and no complicated monitoring or
+ orchestration is necessary to restore lost shards. Although searchable snapshot
+ indices have no replicas by default, you may add replicas to these indices by
+ adjusting `index.number_of_replicas`. Replicas of {search-snap} shards are
+ recovered by copying data from the snapshot repository, just like primaries of
+ {search-snap} shards. In contrast, replicas of regular indices are restored by
copying data from the primary.

+ [discrete]
+ [[searchable-snapshot-mount-storage-options]]
+ ==== Mount options
+
+ To search a snapshot, you must first mount it locally as an index. Usually
+ {ilm-init} will do this automatically, but you can also call the
+ <<searchable-snapshots-api-mount-snapshot,mount snapshot>> API yourself. There
+ are two options for mounting a snapshot, each with different performance
+ characteristics and local storage footprints:
+
+ [[full-copy]]
+ Full copy::
+ Loads a full copy of the snapshotted index's shards onto node-local storage
+ within the cluster. This is the default mount option. {ilm-init} uses this
+ option by default in the `hot` and `cold` phases.
+ +
+ Search performance for a full-copy searchable snapshot index is normally
+ comparable to a regular index, since there is minimal need to access the
+ snapshot repository. While recovery is ongoing, search performance may be
+ slower than with a regular index because a search may need some data that has
+ not yet been retrieved into the local copy. If that happens, {es} will eagerly
+ retrieve the data needed to complete the search in parallel with the ongoing
+ recovery.
+
+ [[shared-cache]]
+ Shared cache::
+ +
+ experimental::[]
+ +
+ Uses a local cache containing only recently searched parts of the snapshotted
+ index's data. {ilm-init} uses this option by default in the `frozen` phase and
+ corresponding frozen tier.
+ +
+ If a search requires data that is not in the cache, {es} fetches the missing
+ data from the snapshot repository. Searches that require these fetches are
+ slower, but the fetched data is stored in the cache so that similar searches
+ can be served more quickly in future. {es} will evict infrequently used data
+ from the cache to free up space.
+ +
+ Although slower than a full local copy or a regular index, a shared-cache
+ searchable snapshot index still returns search results quickly, even for large
+ data sets, because the layout of data in the repository is heavily optimized
+ for search. Many searches will need to retrieve only a small subset of the
+ total shard data before returning results.
+
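
As a sketch, mounting with the shared cache option uses the `storage` query
parameter of the mount snapshot API, shown here with hypothetical repository,
snapshot, and index names:

[source,console]
----
POST /_snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my-index"
}
----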
+ To mount a searchable snapshot index with the shared cache mount option, you
+ must configure the `xpack.searchable.snapshot.shared_cache.size` setting to
+ reserve space for the cache on one or more nodes. Indices mounted with the
+ shared cache mount option are only allocated to nodes that have this setting
+ configured.
+
+ [[searchable-snapshots-shared-cache]]
+ `xpack.searchable.snapshot.shared_cache.size`::
+ (<<static-cluster-setting,Static>>, <<byte-units,byte value>>)
+ The size of the space reserved for the shared cache. Defaults to `0b`, meaning
+ that the node has no shared cache.
+
+ You can configure the setting in `elasticsearch.yml`:
+
+ [source,yaml]
+ ----
+ xpack.searchable.snapshot.shared_cache.size: 4TB
+ ----
+
+ IMPORTANT: Currently, you can configure
+ `xpack.searchable.snapshot.shared_cache.size` on any node. In a future release,
+ you will only be able to configure this setting on nodes with the
+ <<data-frozen-node,`data_frozen`>> role.
+
+ You can set `xpack.searchable.snapshot.shared_cache.size` to any size between a
+ couple of gigabytes and 90% of available disk space. We only recommend higher
+ sizes if you use the node exclusively on a frozen tier or for searchable
+ snapshots.
+
[discrete]
[[back-up-restore-searchable-snapshots]]
=== Back up and restore {search-snaps}
@@ -150,18 +211,34 @@ very good protection against data loss or corruption. If you manage your own
repository storage then you are responsible for its reliability.

[discrete]
- [[searchable-snapshots-shared-cache]]
- === Shared snapshot cache
+ [[searchable-snapshots-frozen-tier-on-cloud]]
+ === Configure a frozen tier on the {ess}

- experimental::[]
+ The frozen data tier is not yet available on the {ess-trial}[{ess}]. However,
+ you can configure another tier to use <<shared-cache,shared snapshot caches>>.
+ This effectively recreates a frozen tier in your {ess} deployment. Follow these
+ steps:

- By default a {search-snap} copies the whole snapshot into the local cluster as
- described above. You can also configure a shared snapshot cache which is used
- to hold a copy of just the frequently-accessed parts of shards of indices which
- are mounted with `?storage=shared_cache`. If you configure a node to have a
- shared cache then that node will reserve space for the cache when it starts up.
+ . Choose an existing tier to use. Typically, you'll use the cold tier, but the
+ hot and warm tiers are also supported. You can use this tier as a shared tier,
+ or you can dedicate the tier exclusively to shared snapshot caches.

- `xpack.searchable.snapshot.shared_cache.size`::
- (<<static-cluster-setting,Static>>, <<byte-units,byte value>>)
- The size of the space reserved for the shared cache. Defaults to `0b`, meaning
- that the node has no shared cache.
+ . Log in to the {ess-trial}[{ess} Console].
+
+ . Select your deployment from the {ess} home page or the deployments page.
+
+ . From your deployment menu, select **Edit deployment**.
+
+ . On the **Edit** page, click **Edit elasticsearch.yml** under your selected
+ {es} tier.
+
+ . In the `elasticsearch.yml` file, add the
+ <<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
+ setting. For example:
+
+ [source,yaml]
+ ----
+ xpack.searchable.snapshot.shared_cache.size: 50GB
+ ----
+
+ . Click **Save** and **Confirm** to apply your configuration changes.