Commit 0f5d98e

debadair, DaveCTurner, and jrodewig authored
[DOCS] Add searchable snapshots topic. (#63040) (#64088)
* [DOCS] Add searchable snapshots topic.
* [DOCS] Add definitions & remove fully-remote storage.
* [DOCS] Fixed duplicate anchor.
* Expand conceptual docs for searchable snapshots
* Rewordings
* Glossary tidy-up
* Beta
* Reword
* More performance idea to a TIP
* use -> manage
* red -> not green
* Missing space?
* Update docs/reference/glossary.asciidoc
* Fix beta label
* Use more attributes, fix link titles
* Apply suggestions from code review
  Co-authored-by: debadair <[email protected]>
* Reformat
* Minor rewordings
* More minor rewordings
* Address Henning's comments

Co-authored-by: David Turner <[email protected]>
Co-authored-by: James Rodewig <[email protected]>
Co-authored-by: David Turner <[email protected]>
Co-authored-by: James Rodewig <[email protected]>
1 parent 6e61292 commit 0f5d98e

3 files changed, 122 additions and 3 deletions
docs/reference/glossary.asciidoc

Lines changed: 21 additions & 3 deletions
@@ -417,6 +417,22 @@ This value can be overridden by specifying a `routing` value at index
 time, or a <<mapping-routing-field,routing
 field>> in the <<glossary-mapping,mapping>>.
 
+[[glossary-searchable-snapshot]] searchable snapshot ::
+// tag::searchable-snapshot-def[]
+A <<glossary-snapshot, snapshot>> of an index that has been mounted as a
+<<glossary-searchable-snapshot-index, searchable snapshot index>> and can be
+searched as if it were a regular index.
+// end::searchable-snapshot-def[]
+
+[[glossary-searchable-snapshot-index]] searchable snapshot index ::
+// tag::searchable-snapshot-index-def[]
+An <<glossary-index, index>> whose data is stored in a <<glossary-snapshot,
+snapshot>> that resides in a separate <<glossary-snapshot-repository,snapshot
+repository>> such as AWS S3. Searchable snapshot indices do not need
+<<glossary-replica-shard,replica>> shards for resilience, since their data is
+reliably stored outside the cluster.
+// end::searchable-snapshot-index-def[]
+
 [[glossary-shard]] shard ::
 +
 --
@@ -449,9 +465,11 @@ See the {ref}/indices-shrink-index.html[shrink index API].
 
 [[glossary-snapshot]] snapshot ::
 // tag::snapshot-def[]
-A backup taken from a running {es} cluster.
-A snapshot can include backups of an entire cluster or only data streams and
-indices you specify.
+Captures the state of the whole cluster or of particular indices or data
+streams at a particular point in time. Snapshots provide a backup of a running
+cluster, ensuring you can restore your data in the event of a failure. You can
+also mount indices or data streams from snapshots as read-only
+{ref}/glossary.html#glossary-searchable-snapshot-index[searchable snapshots].
 // end::snapshot-def[]
 
 [[glossary-snapshot-lifecycle-policy]] snapshot lifecycle policy ::
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+[[searchable-snapshots]]
+== {search-snaps-cap}
+
+beta::[]
+
+{search-snaps-cap} let you reduce your operating costs by using
+<<snapshot-restore, snapshots>> for resiliency rather than maintaining
+<<scalability,replica shards>> within a cluster. When you mount an index from a
+snapshot as a {search-snap}, {es} copies the index shards to local storage
+within the cluster. This ensures that search performance is comparable to
+searching any other index, and minimizes the need to access the snapshot
+repository. Should a node fail, shards of a {search-snap} index are
+automatically recovered from the snapshot repository.
+
+This can result in significant cost savings. With {search-snaps}, you may be
+able to halve your cluster size without increasing the risk of data loss or
+reducing the amount of data you can search. Because {search-snaps} rely on the
+same snapshot mechanism you use for backups, they have a minimal impact on your
+snapshot repository storage costs.
+
+[discrete]
+[[using-searchable-snapshots]]
+=== Using {search-snaps}
+
+Searching a {search-snap} index is the same as searching any other index.
+Search performance is comparable to regular indices because the shard data is
+copied onto nodes in the cluster when the {search-snap} is mounted.
+
+By default, {search-snap} indices have no replicas. The underlying snapshot
+provides resilience and the query volume is expected to be low enough that a
+single shard copy will be sufficient. However, if you need to support a higher
+query volume, you can add replicas by adjusting the `index.number_of_replicas`
+index setting.
+
+If a node fails and {search-snap} shards need to be restored from the snapshot,
+there is a brief window of time while {es} allocates the shards to other nodes
+where the cluster health will not be `green`. Searches that hit these shards
+will fail or return partial results until they are reallocated.
+
+You typically manage {search-snaps} through {ilm-init}. The
+<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
+an index to a {search-snap} when it reaches the `cold` phase. You can also make
+indices in existing snapshots searchable by manually mounting them as
+{search-snaps} with the <<searchable-snapshots-api-mount-snapshot, mount
+snapshot>> API.
+
+To mount an index from a snapshot that contains multiple indices, we recommend
+creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
+index you want to search, and mounting the clone. You cannot delete a snapshot
+if it has any mounted indices, so creating a clone enables you to manage the
+lifecycle of the backup snapshot independently of any {search-snaps}.
+
+You can control the allocation of the shards of {search-snap} indices using the
+same mechanisms as for regular indices. For example, you could use
+<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
+your nodes.
+
+We recommend that you <<indices-forcemerge, force-merge>> indices to a single
+segment per shard before taking a snapshot that will be mounted as a
+{search-snap} index. Each read from a snapshot repository takes time and costs
+money, and the fewer segments there are the fewer reads are needed to restore
+the snapshot.
+
+[TIP]
+====
+{search-snaps-cap} are ideal for managing a large archive of historical data.
+Historical information is typically searched less frequently than recent data
+and therefore may not need replicas for their performance benefits.
+
+For more complex or time-consuming searches, you can use <<async-search>> with
+{search-snaps}.
+====
+
+[discrete]
+[[how-searchable-snapshots-work]]
+=== How {search-snaps} work
+
+When an index is mounted from a snapshot, {es} allocates its shards to data
+nodes within the cluster. The data nodes then automatically restore the shard
+data from the repository onto local storage. Once the restore process
+completes, these shards respond to searches using the data held in local
+storage and do not need to access the repository. This avoids incurring the
+cost or performance penalty associated with reading data from the repository.
+
+If a node holding one of these shards fails, {es} automatically allocates it to
+another node, and that node restores the shard data from the repository. No
+replicas are needed, and no complicated monitoring or orchestration is
+necessary to restore lost shards.
+
+{es} restores {search-snap} shards in the background and you can search them
+even if they have not been fully restored. If a search hits a {search-snap}
+shard before it has been fully restored, {es} eagerly retrieves the data needed
+for the search. If a shard is freshly allocated to a node and still warming up,
+some searches will be slower. However, searches typically access a very small
+fraction of the total shard data, so the performance penalty is usually small.
+
+Replicas of {search-snap} shards are restored by copying data from the
+snapshot repository. In contrast, replicas of regular indices are restored by
+copying data from the primary.
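
The manual workflow described in the new topic (force-merge, clone the backup snapshot, mount the clone, optionally add replicas) could look roughly like the requests below. This is a minimal sketch and not part of the commit: the names `my-index`, `my-index-mounted`, `my_repository`, `my_snapshot`, and `my_snapshot_my_index` are hypothetical, and it assumes `my_snapshot` is a multi-index backup taken after the force-merge.

[source,console]
----
# Merge each shard down to a single segment before the backup snapshot is taken
POST /my-index/_forcemerge?max_num_segments=1

# (The backup snapshot "my_snapshot" is assumed to be taken at this point.)

# Clone the backup snapshot, keeping only the index that will be mounted
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_my_index
{
  "indices": "my-index"
}

# Mount the index from the clone as a searchable snapshot index
POST /_snapshot/my_repository/my_snapshot_my_index/_mount?wait_for_completion=true
{
  "index": "my-index",
  "renamed_index": "my-index-mounted"
}

# Optional: add a replica if a higher query volume is expected
PUT /my-index-mounted/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----

Mounting from the clone rather than from `my_snapshot` itself keeps the original backup free to be deleted on its own schedule, which is the reason the topic gives for cloning.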

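For the {ilm-init}-managed path that the topic mentions, a policy along the following lines would convert an index when it enters the `cold` phase. Again this is only a sketch under assumptions: the policy name `my-policy`, the repository name `my_repository`, and the `30d` threshold are hypothetical and do not come from the commit.

[source,console]
----
# Hypothetical ILM policy: mount the index as a searchable snapshot in the cold phase
PUT /_ilm/policy/my-policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my_repository"
          }
        }
      }
    }
  }
}
----
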
docs/reference/snapshot-restore/index.asciidoc

Lines changed: 2 additions & 0 deletions
@@ -112,3 +112,5 @@ include::restore-snapshot.asciidoc[]
 include::monitor-snapshot-restore.asciidoc[]
 include::delete-snapshot.asciidoc[]
 include::../slm/index.asciidoc[]
+include::../searchable-snapshots/index.asciidoc[]
+