[[modules-snapshots]]
== Snapshot And Restore

- You can store snapshots of individual indices or an entire cluster in
- a remote repository like a shared file system, S3, or HDFS. These snapshots
- are great for backups because they can be restored relatively quickly. However,
- snapshots can only be restored to versions of Elasticsearch that can read the
- indices:
+ A snapshot is a backup taken from a running Elasticsearch cluster. You can take
+ a snapshot of individual indices or of the entire cluster and store it in a
+ repository on a shared filesystem, and there are plugins that support remote
+ repositories on S3, HDFS, Azure, Google Cloud Storage and more.
+
+ Snapshots are taken incrementally. This means that when creating a snapshot of
+ an index Elasticsearch will avoid copying any data that is already stored in
+ the repository as part of an earlier snapshot of the same index. Therefore it
+ can be efficient to take snapshots of your cluster quite frequently.
+
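
As a rough illustration of what "incremental" means for snapshots, the toy model below treats a repository as a set of already-stored file names per index and copies only the files that are missing. The `Repository` class, the `seg_N` file names, and the idea of tracking files by name are assumptions made for this sketch; it does not describe how Elasticsearch organises a repository internally.

```python
from collections import defaultdict


class Repository:
    """Toy repository: remembers which files are stored for each index."""

    def __init__(self):
        # index name -> set of file names already held in the repository
        self.stored = defaultdict(set)

    def snapshot(self, index, files):
        """Store files of `index` that are not yet present; return what was copied."""
        new_files = sorted(set(files) - self.stored[index])
        self.stored[index].update(new_files)
        return new_files


repo = Repository()
first = repo.snapshot("logs", ["seg_1", "seg_2"])
second = repo.snapshot("logs", ["seg_1", "seg_2", "seg_3"])
print(first)   # ['seg_1', 'seg_2']
print(second)  # ['seg_3']
```

In this model the second snapshot copies only `seg_3`, which is why frequent snapshots of a slowly changing index stay cheap.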
+ Snapshots can be restored into a running cluster via the restore API. When
+ restoring an index it is possible to alter the name of the restored index as
+ well as some of its settings, allowing a great deal of flexibility in how the
+ snapshot and restore functionality can be used.
+
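
A minimal sketch of a restore request that renames the restored index and overrides one setting might look like the following. It only builds the request and sends nothing. The repository name `my_backup`, the snapshot name `snapshot_1`, the index name, and the helper `build_restore_request` are hypothetical; the body fields (`indices`, `rename_pattern`, `rename_replacement`, `index_settings`) are the restore API parameters as I understand them, so check them against the reference for your version.

```python
import json


def build_restore_request(repository, snapshot, indices,
                          rename_pattern=None, rename_replacement=None,
                          index_settings=None):
    """Return (method, path, body) for a restore call; nothing is sent."""
    body = {"indices": ",".join(indices)}
    # Renaming needs both the pattern and its replacement.
    if rename_pattern is not None and rename_replacement is not None:
        body["rename_pattern"] = rename_pattern
        body["rename_replacement"] = rename_replacement
    # Optional settings to override on the restored index.
    if index_settings:
        body["index_settings"] = index_settings
    path = "/_snapshot/{}/{}/_restore".format(repository, snapshot)
    return "POST", path, body


method, path, body = build_restore_request(
    "my_backup", "snapshot_1", ["index_1"],
    rename_pattern="index_(.+)",
    rename_replacement="restored_index_$1",
    index_settings={"index.number_of_replicas": 0},
)
print(method, path)
print(json.dumps(body, indent=2, sort_keys=True))
```

Restoring under a new name in this way leaves any existing index called `index_1` untouched.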
+ WARNING: It is not possible to back up an Elasticsearch cluster simply by
+ taking a copy of the data directories of all of its nodes. Elasticsearch may be
+ making changes to the contents of its data directories while it is running, and
+ this means that copying its data directories cannot be expected to capture a
+ consistent picture of their contents. Attempts to restore a cluster from such a
+ backup may fail, reporting corruption and/or missing files, or may appear to
+ have succeeded having silently lost some of its data. The only reliable way to
+ back up a cluster is by using the snapshot and restore functionality.
+
+ [float]
+ === Version compatibility
+
+ A snapshot contains a copy of the on-disk data structures that make up an
+ index. This means that snapshots can only be restored to versions of
+ Elasticsearch that can read the indices:

* A snapshot of an index created in 5.x can be restored to 6.x.
* A snapshot of an index created in 2.x can be restored to 5.x.
* A snapshot of an index created in 1.x can be restored to 2.x.

- Conversely, snapshots of indices created in 1.x **cannot** be restored to
- 5.x or 6.x, and snapshots of indices created in 2.x **cannot** be restored
- to 6.x.
+ Conversely, snapshots of indices created in 1.x **cannot** be restored to 5.x
+ or 6.x, and snapshots of indices created in 2.x **cannot** be restored to 6.x.

- Snapshots are incremental and can contain indices created in various
- versions of Elasticsearch. If any indices in a snapshot were created in an
+ Each snapshot can contain indices created in various versions of Elasticsearch,
+ and when restoring a snapshot it must be possible to restore all of the indices
+ into the target cluster. If any indices in a snapshot were created in an
incompatible version, you will not be able to restore the snapshot.
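
The compatibility rules above can be sketched as a small check. The helper names and the `MAJOR_VERSIONS` list are hypothetical, and two points are assumptions rather than statements from this text: that restoring into the same major version is allowed, and that restoring into an older major version is not.

```python
# Major versions mentioned in the text, in release order.
MAJOR_VERSIONS = [1, 2, 5, 6]


def can_restore_index(created_major, target_major):
    """True if an index created in `created_major` can be read by `target_major`.

    Raises ValueError for a major version that is not in MAJOR_VERSIONS.
    """
    created = MAJOR_VERSIONS.index(created_major)
    target = MAJOR_VERSIONS.index(target_major)
    # Assumption: the same major or the immediately following major is readable.
    return target - created in (0, 1)


def can_restore_snapshot(index_created_majors, target_major):
    """A snapshot is restorable only if every index in it is restorable."""
    return all(can_restore_index(m, target_major) for m in index_created_majors)


print(can_restore_index(5, 6))          # True
print(can_restore_index(2, 6))          # False
print(can_restore_snapshot([2, 5], 6))  # False
```

The last call fails because a single 2.x index is enough to make the whole snapshot incompatible with a 6.x cluster.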
IMPORTANT: When backing up your data prior to an upgrade, keep in mind that you
@@ -28,8 +53,8 @@ that is incompatible with the version of the cluster you are currently running,
you can restore it on the latest compatible version and use
<<reindex-from-remote,reindex-from-remote>> to rebuild the index on the current
version. Reindexing from remote is only possible if the original index has
- source enabled. Retrieving and reindexing the data can take significantly longer
- than simply restoring a snapshot. If you have a large amount of data, we
+ source enabled. Retrieving and reindexing the data can take significantly
+ longer than simply restoring a snapshot. If you have a large amount of data, we
recommend testing the reindex from remote process with a subset of your data to
understand the time requirements before proceeding.
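
One way to run such a trial is to add a query to the reindex-from-remote request so that only part of the old index is copied. The sketch below only builds the request body and sends nothing. The host URL, the index names, the `@timestamp` field, and the helper `build_reindex_from_remote` are hypothetical; the `source.remote.host`, `source.index`, `source.query`, and `dest.index` fields follow the reindex API as I understand it.

```python
import json


def build_reindex_from_remote(remote_host, source_index, dest_index, query=None):
    """Return (method, path, body) for a reindex-from-remote call; nothing is sent."""
    body = {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }
    # An optional query restricts the copy to a subset of documents.
    if query is not None:
        body["source"]["query"] = query
    return "POST", "/_reindex", body


method, path, body = build_reindex_from_remote(
    "http://old-cluster.example:9200",
    "logs_old",
    "logs_new",
    query={"range": {"@timestamp": {"gte": "now-1d"}}},
)
print(method, path)
print(json.dumps(body, indent=2, sort_keys=True))
```

Timing this request on one day of data gives a rough basis for estimating the full run. The remote host also has to be allowed by the `reindex.remote.whitelist` setting on the cluster that performs the reindex.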