[[searchable-snapshots]]
== {search-snaps-cap}

beta::[]

{search-snaps-cap} let you reduce your operating costs by using
<<snapshot-restore, snapshots>> for resiliency rather than maintaining
<<scalability,replica shards>> within a cluster. When you mount an index from a
snapshot as a {search-snap}, {es} copies the index shards to local storage
within the cluster. This ensures that search performance is comparable to
searching any other index, and minimizes the need to access the snapshot
repository. Should a node fail, shards of a {search-snap} index are
automatically recovered from the snapshot repository.

This can result in significant cost savings. With {search-snaps}, you may be
able to halve your cluster size without increasing the risk of data loss or
reducing the amount of data you can search. Because {search-snaps} rely on the
same snapshot mechanism you use for backups, they have a minimal impact on your
snapshot repository storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.
Search performance is comparable to regular indices because the shard data is
copied onto nodes in the cluster when the {search-snap} is mounted.
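
For example, a mounted index accepts ordinary search requests. The index name
below is a placeholder for whatever name you gave the mounted index:

[source,console]
----
GET /my-mounted-index/_search
{
  "query": {
    "match_all": {}
  }
}
----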

By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience and the query volume is expected to be low enough that a
single shard copy will be sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.
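
For example, the following request is one way to add a single replica to a
mounted index. The index name is a placeholder:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----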

If a node fails and {search-snap} shards need to be restored from the snapshot,
there is a brief window of time, while {es} allocates the shards to other
nodes, during which the cluster health will not be `green`. Searches that hit
these shards will fail or return partial results until the shards are
reallocated.

You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
an index to a {search-snap} when it reaches the `cold` phase. You can also make
indices in existing snapshots searchable by manually mounting them as
{search-snaps} with the <<searchable-snapshots-api-mount-snapshot, mount
snapshot>> API.
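
For example, the following {ilm-init} policy is a minimal sketch that moves an
index to the `cold` phase after 30 days and converts it to a {search-snap}. The
policy name, repository name, and timing are placeholders for your own values:

[source,console]
----
PUT /_ilm/policy/my-lifecycle-policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my_repository"
          }
        }
      }
    }
  }
}
----

To mount an index from an existing snapshot manually, you can call the mount
snapshot API directly. Again, the repository, snapshot, and index names are
placeholders:

[source,console]
----
POST /_snapshot/my_repository/my_snapshot/_mount?wait_for_completion=true
{
  "index": "my_index"
}
----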

To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You cannot delete a snapshot
if it has any mounted indices, so creating a clone enables you to manage the
lifecycle of the backup snapshot independently of any {search-snaps}.
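
For example, the following request clones a single index out of an existing
snapshot. The repository, snapshot, and index names are placeholders:

[source,console]
----
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my_index"
}
----

You can then mount `my_index` from `my_snapshot_clone`, leaving `my_snapshot`
free to be managed or deleted on its own schedule.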

You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
your nodes.
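
For example, the following request restricts the shards of a mounted index to
two named nodes. The index and node names are placeholders:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index.routing.allocation.include._name": "node-1,node-2"
}
----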

We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, and the fewer segments there are, the fewer reads are needed to restore
the snapshot.
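
For example, the following request merges each shard of an index down to a
single segment before you take the snapshot. The index name is a placeholder:

[source,console]
----
POST /my-index/_forcemerge?max_num_segments=1
----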

[TIP]
====
{search-snaps-cap} are ideal for managing a large archive of historical data.
Historical information is typically searched less frequently than recent data
and therefore may not need the performance benefits of replicas.

For more complex or time-consuming searches, you can use <<async-search>> with
{search-snaps}, as shown in the example below.
====
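
For example, the following request is one way to submit an asynchronous search
against a mounted index. The index name and query are placeholders:

[source,console]
----
POST /my-mounted-index/_async_search?wait_for_completion_timeout=2s
{
  "query": {
    "match_all": {}
  }
}
----

If the search does not finish within the timeout, the response includes an ID
that you can pass to the get async search API to retrieve the results later.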

[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
nodes within the cluster. The data nodes then automatically restore the shard
data from the repository onto local storage. Once the restore process
completes, these shards respond to searches using the data held in local
storage and do not need to access the repository. This avoids incurring the
cost or performance penalty associated with reading data from the repository.

If a node holding one of these shards fails, {es} automatically allocates it to
another node, and that node restores the shard data from the repository. No
replicas are needed, and no complicated monitoring or orchestration is
necessary to restore lost shards.

{es} restores {search-snap} shards in the background and you can search them
even if they have not been fully restored. If a search hits a {search-snap}
shard before it has been fully restored, {es} eagerly retrieves the data needed
for the search. If a shard is freshly allocated to a node and still warming up,
some searches will be slower. However, searches typically access only a very
small fraction of the total shard data, so the performance penalty is usually
small.

Replicas of {search-snap} shards are restored by copying data from the snapshot
repository. In contrast, replicas of regular indices are restored by copying
data from the primary.