Commit c783488

Add _source-only snapshot repository (#32844)
This change adds a `_source`-only snapshot repository that can wrap any existing repository as a _backend_ and snapshot only the `_source` part, including live docs markers. Snapshots taken with the `source` repository won't include any indices, doc-values or points. The snapshot is reduced in size and functionality, so it requires full re-indexing after it has been successfully restored. The restore process copies the `_source` data locally and starts a special shard and engine to allow `match_all` scrolls and searches. Any other query or get call will fail with an unsupported operation exception. The restored index is also marked as read-only. This feature is aimed mainly at disaster recovery use-cases where snapshot size is a concern or where time to restore is less of an issue. **NOTE**: The snapshot produced by this repository is still a valid Lucene index. This change doesn't allow for any longer retention policies, which are out of scope for this change.
1 parent 141c6ef commit c783488

25 files changed

+1885
-25
lines changed

docs/reference/modules/snapshots.asciidoc

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -207,6 +207,51 @@ repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydo
207207
URL repositories with `file:` URLs can only point to locations registered in the `path.repo` setting similar to
208208
shared file system repository.
209209

210+
[float]
211+
[role="xpack"]
212+
[testenv="basic"]
213+
===== Source Only Repository
214+
215+
A source repository enables you to create minimal, source-only snapshots that take up to 50% less space on disk.
216+
Source only snapshots contain stored fields and index metadata. They do not include index or doc values structures
217+
and are not searchable when restored. After restoring a source-only snapshot, you must <<docs-reindex,reindex>>
218+
the data into a new index.
219+
220+
Source repositories delegate to another snapshot repository for storage.
221+
222+
223+
[IMPORTANT]
224+
==================================================
225+
226+
Source only snapshots are only supported if the `_source` field is enabled and no source-filtering is applied.
227+
When you restore a source only snapshot:
228+
229+
* The restored index is read-only and can only serve `match_all` search or scroll requests to enable reindexing.
230+
231+
* Queries other than `match_all` and `_get` requests are not supported.
232+
233+
* The mapping of the restored index is empty, but the original mapping is available from the type's
234+
top-level `meta` element.
235+
236+
==================================================
237+
238+
When you create a source repository, you must specify the type and name of the delegate repository
239+
where the snapshots will be stored:
240+
241+
[source,js]
242+
-----------------------------------
243+
PUT _snapshot/my_src_only_repository
244+
{
245+
"type": "source",
246+
"settings": {
247+
"delegate_type": "fs",
248+
"location": "my_backup_location"
249+
}
250+
}
251+
-----------------------------------
252+
// CONSOLE
253+
// TEST[continued]
254+
210255
[float]
211256
===== Repository plugins
212257

libs/core/src/main/java/org/elasticsearch/core/internal/io/IOUtils.java

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
import java.io.Closeable;
2121
import java.io.IOException;
2222
import java.nio.channels.FileChannel;
23+
import java.nio.charset.StandardCharsets;
2324
import java.nio.file.FileVisitResult;
2425
import java.nio.file.FileVisitor;
2526
import java.nio.file.Files;
@@ -36,6 +37,14 @@
3637
*/
3738
public final class IOUtils {
3839

40+
/**
41+
* UTF-8 charset string.
42+
* <p>Where possible, use {@link StandardCharsets#UTF_8} instead,
43+
* as using the String constant may slow things down.
44+
* @see StandardCharsets#UTF_8
45+
*/
46+
public static final String UTF_8 = StandardCharsets.UTF_8.name();
47+
3948
private IOUtils() {
4049
// Static utils methods
4150
}
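
The Javadoc added above recommends `StandardCharsets.UTF_8` over the string constant. A minimal sketch of the difference, using only JDK calls; the class `Utf8ConstantSketch` and its local `UTF_8` field are illustrative stand-ins for `IOUtils.UTF_8`, not part of the commit:

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8ConstantSketch {

    // Stand-in for IOUtils.UTF_8: the canonical charset name as a String.
    static final String UTF_8 = StandardCharsets.UTF_8.name();

    // String-based overload: the charset is resolved from its name and the
    // call forces the caller to handle a checked exception.
    static byte[] encodeByName(String text) {
        try {
            return text.getBytes(UTF_8);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required on every JVM", e);
        }
    }

    // Charset-based overload: no name resolution and no checked exception.
    static byte[] encodeByCharset(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] byName = encodeByName("h\u00e9llo");
        byte[] byCharset = encodeByCharset("h\u00e9llo");
        // Both paths produce the same bytes; only the call shape differs.
        System.out.println(Arrays.equals(byName, byCharset));
    }
}
```

The string form remains useful for APIs that only accept a charset name, which is presumably why the constant exists at all.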

server/src/main/java/org/elasticsearch/index/engine/Engine.java

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1594,7 +1594,7 @@ public static class IndexCommitRef implements Closeable {
15941594
private final CheckedRunnable<IOException> onClose;
15951595
private final IndexCommit indexCommit;
15961596

1597-
IndexCommitRef(IndexCommit indexCommit, CheckedRunnable<IOException> onClose) {
1597+
public IndexCommitRef(IndexCommit indexCommit, CheckedRunnable<IOException> onClose) {
15981598
this.indexCommit = indexCommit;
15991599
this.onClose = onClose;
16001600
}

server/src/main/java/org/elasticsearch/index/engine/EngineFactory.java

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@
2121
/**
2222
* Simple Engine Factory
2323
*/
24+
@FunctionalInterface
2425
public interface EngineFactory {
2526

2627
Engine newReadWriteEngine(EngineConfig config);
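
Annotating `EngineFactory` with `@FunctionalInterface` makes the compiler enforce that it keeps exactly one abstract method, so factories can safely be written as lambdas or method references. A minimal sketch of that usage follows; `EngineConfig` and `Engine` here are hypothetical, heavily simplified stand-ins for the real Elasticsearch types:

```java
final class EngineFactorySketch {

    // Hypothetical stand-in for the real EngineConfig.
    static final class EngineConfig {
        final String shardName;

        EngineConfig(String shardName) {
            this.shardName = shardName;
        }
    }

    // Hypothetical stand-in for the real Engine.
    interface Engine {
        String describe();
    }

    // Mirrors the shape of the annotated interface: one abstract method.
    @FunctionalInterface
    interface EngineFactory {
        Engine newReadWriteEngine(EngineConfig config);
    }

    // A factory written as a lambda; the returned Engine is also a lambda here.
    static final EngineFactory DEMO_FACTORY =
        config -> () -> "demo engine for " + config.shardName;

    public static void main(String[] args) {
        Engine engine = DEMO_FACTORY.newReadWriteEngine(new EngineConfig("shard-0"));
        System.out.println(engine.describe());
    }
}
```

The annotation is not required for lambda use; it only turns an accidental second abstract method into a compile error.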

server/src/main/java/org/elasticsearch/index/seqno/SeqNoStats.java

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -91,5 +91,4 @@ public String toString() {
9191
", globalCheckpoint=" + globalCheckpoint +
9292
'}';
9393
}
94-
9594
}

server/src/main/java/org/elasticsearch/index/shard/AbstractIndexShardComponent.java

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,4 @@ public IndexSettings indexSettings() {
5151
public String nodeName() {
5252
return indexSettings.getNodeName();
5353
}
54-
55-
5654
}

server/src/main/java/org/elasticsearch/index/store/Store.java

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1439,11 +1439,28 @@ public void createEmpty() throws IOException {
14391439
*/
14401440
public void bootstrapNewHistory() throws IOException {
14411441
metadataLock.writeLock().lock();
1442-
try (IndexWriter writer = newIndexWriter(IndexWriterConfig.OpenMode.APPEND, directory, null)) {
1443-
final Map<String, String> userData = getUserData(writer);
1442+
try {
1443+
Map<String, String> userData = readLastCommittedSegmentsInfo().getUserData();
14441444
final long maxSeqNo = Long.parseLong(userData.get(SequenceNumbers.MAX_SEQ_NO));
1445+
bootstrapNewHistory(maxSeqNo);
1446+
} finally {
1447+
metadataLock.writeLock().unlock();
1448+
}
1449+
}
1450+
1451+
/**
1452+
* Marks an existing lucene index with a new history uuid and sets the given maxSeqNo as the local checkpoint
1453+
* as well as the maximum sequence number.
1454+
* This is used to make sure no existing shard will recover from this index using ops based recovery.
1455+
* @see SequenceNumbers#LOCAL_CHECKPOINT_KEY
1456+
* @see SequenceNumbers#MAX_SEQ_NO
1457+
*/
1458+
public void bootstrapNewHistory(long maxSeqNo) throws IOException {
1459+
metadataLock.writeLock().lock();
1460+
try (IndexWriter writer = newIndexWriter(IndexWriterConfig.OpenMode.APPEND, directory, null)) {
14451461
final Map<String, String> map = new HashMap<>();
14461462
map.put(Engine.HISTORY_UUID_KEY, UUIDs.randomBase64UUID());
1463+
map.put(SequenceNumbers.MAX_SEQ_NO, Long.toString(maxSeqNo));
14471464
map.put(SequenceNumbers.LOCAL_CHECKPOINT_KEY, Long.toString(maxSeqNo));
14481465
updateCommitData(writer, map);
14491466
} finally {
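
The commit data written by the new `bootstrapNewHistory(long maxSeqNo)` overload can be illustrated without Lucene. This is a sketch under stated assumptions: the three key strings are assumed values for `Engine.HISTORY_UUID_KEY`, `SequenceNumbers.MAX_SEQ_NO` and `SequenceNumbers.LOCAL_CHECKPOINT_KEY`, and `java.util.UUID` is swapped in for `UUIDs.randomBase64UUID()`:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class BootstrapHistorySketch {

    // Assumed key names; the real constants live in Engine and SequenceNumbers.
    static final String HISTORY_UUID_KEY = "history_uuid";
    static final String MAX_SEQ_NO = "max_seq_no";
    static final String LOCAL_CHECKPOINT_KEY = "local_checkpoint";

    // Builds the user data for a commit that starts a new history:
    // a fresh history id, with the local checkpoint pinned to maxSeqNo.
    static Map<String, String> newHistoryCommitData(long maxSeqNo) {
        Map<String, String> map = new HashMap<>();
        map.put(HISTORY_UUID_KEY, UUID.randomUUID().toString());
        map.put(MAX_SEQ_NO, Long.toString(maxSeqNo));
        map.put(LOCAL_CHECKPOINT_KEY, Long.toString(maxSeqNo));
        return map;
    }

    public static void main(String[] args) {
        System.out.println(newHistoryCommitData(41L));
    }
}
```

Because every call generates a different history id, a shard holding an older copy cannot match it, which is the property the Javadoc relies on to rule out ops based recovery.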

server/src/main/java/org/elasticsearch/indices/IndicesService.java

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -396,7 +396,6 @@ public boolean hasIndex(Index index) {
396396
public IndexService indexService(Index index) {
397397
return indices.get(index.getUUID());
398398
}
399-
400399
/**
401400
* Returns an IndexService for the specified index if exists otherwise a {@link IndexNotFoundException} is thrown.
402401
*/
Lines changed: 167 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,167 @@
1+
/*
2+
* Licensed to Elasticsearch under one or more contributor
3+
* license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright
5+
* ownership. Elasticsearch licenses this file to you under
6+
* the Apache License, Version 2.0 (the "License"); you may
7+
* not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing,
13+
* software distributed under the License is distributed on an
14+
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15+
* KIND, either express or implied. See the License for the
16+
* specific language governing permissions and limitations
17+
* under the License.
18+
*/
19+
package org.elasticsearch.repositories;
20+
21+
import org.apache.lucene.index.IndexCommit;
22+
import org.elasticsearch.Version;
23+
import org.elasticsearch.cluster.metadata.IndexMetaData;
24+
import org.elasticsearch.cluster.metadata.MetaData;
25+
import org.elasticsearch.cluster.metadata.RepositoryMetaData;
26+
import org.elasticsearch.cluster.node.DiscoveryNode;
27+
import org.elasticsearch.common.component.Lifecycle;
28+
import org.elasticsearch.common.component.LifecycleListener;
29+
import org.elasticsearch.index.shard.IndexShard;
30+
import org.elasticsearch.index.shard.ShardId;
31+
import org.elasticsearch.index.snapshots.IndexShardSnapshotStatus;
32+
import org.elasticsearch.index.store.Store;
33+
import org.elasticsearch.indices.recovery.RecoveryState;
34+
import org.elasticsearch.snapshots.SnapshotId;
35+
import org.elasticsearch.snapshots.SnapshotInfo;
36+
import org.elasticsearch.snapshots.SnapshotShardFailure;
37+
38+
import java.io.IOException;
39+
import java.util.List;
40+
41+
public class FilterRepository implements Repository {
42+
43+
private final Repository in;
44+
45+
public FilterRepository(Repository in) {
46+
this.in = in;
47+
}
48+
49+
@Override
50+
public RepositoryMetaData getMetadata() {
51+
return in.getMetadata();
52+
}
53+
54+
@Override
55+
public SnapshotInfo getSnapshotInfo(SnapshotId snapshotId) {
56+
return in.getSnapshotInfo(snapshotId);
57+
}
58+
59+
@Override
60+
public MetaData getSnapshotGlobalMetaData(SnapshotId snapshotId) {
61+
return in.getSnapshotGlobalMetaData(snapshotId);
62+
}
63+
64+
@Override
65+
public IndexMetaData getSnapshotIndexMetaData(SnapshotId snapshotId, IndexId index) throws IOException {
66+
return in.getSnapshotIndexMetaData(snapshotId, index);
67+
}
68+
69+
@Override
70+
public RepositoryData getRepositoryData() {
71+
return in.getRepositoryData();
72+
}
73+
74+
@Override
75+
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData metaData) {
76+
in.initializeSnapshot(snapshotId, indices, metaData);
77+
}
78+
79+
@Override
80+
public SnapshotInfo finalizeSnapshot(SnapshotId snapshotId, List<IndexId> indices, long startTime, String failure, int totalShards,
81+
List<SnapshotShardFailure> shardFailures, long repositoryStateId, boolean includeGlobalState) {
82+
return in.finalizeSnapshot(snapshotId, indices, startTime, failure, totalShards, shardFailures, repositoryStateId,
83+
includeGlobalState);
84+
}
85+
86+
@Override
87+
public void deleteSnapshot(SnapshotId snapshotId, long repositoryStateId) {
88+
in.deleteSnapshot(snapshotId, repositoryStateId);
89+
}
90+
91+
@Override
92+
public long getSnapshotThrottleTimeInNanos() {
93+
return in.getSnapshotThrottleTimeInNanos();
94+
}
95+
96+
@Override
97+
public long getRestoreThrottleTimeInNanos() {
98+
return in.getRestoreThrottleTimeInNanos();
99+
}
100+
101+
@Override
102+
public String startVerification() {
103+
return in.startVerification();
104+
}
105+
106+
@Override
107+
public void endVerification(String verificationToken) {
108+
in.endVerification(verificationToken);
109+
}
110+
111+
@Override
112+
public void verify(String verificationToken, DiscoveryNode localNode) {
113+
in.verify(verificationToken, localNode);
114+
}
115+
116+
@Override
117+
public boolean isReadOnly() {
118+
return in.isReadOnly();
119+
}
120+
121+
@Override
122+
public void snapshotShard(IndexShard shard, Store store, SnapshotId snapshotId, IndexId indexId, IndexCommit snapshotIndexCommit,
123+
IndexShardSnapshotStatus snapshotStatus) {
124+
in.snapshotShard(shard, store, snapshotId, indexId, snapshotIndexCommit, snapshotStatus);
125+
}
126+
127+
@Override
128+
public void restoreShard(IndexShard shard, SnapshotId snapshotId, Version version, IndexId indexId, ShardId snapshotShardId,
129+
RecoveryState recoveryState) {
130+
in.restoreShard(shard, snapshotId, version, indexId, snapshotShardId, recoveryState);
131+
}
132+
133+
@Override
134+
public IndexShardSnapshotStatus getShardSnapshotStatus(SnapshotId snapshotId, Version version, IndexId indexId, ShardId shardId) {
135+
return in.getShardSnapshotStatus(snapshotId, version, indexId, shardId);
136+
}
137+
138+
@Override
139+
public Lifecycle.State lifecycleState() {
140+
return in.lifecycleState();
141+
}
142+
143+
@Override
144+
public void addLifecycleListener(LifecycleListener listener) {
145+
in.addLifecycleListener(listener);
146+
}
147+
148+
@Override
149+
public void removeLifecycleListener(LifecycleListener listener) {
150+
in.removeLifecycleListener(listener);
151+
}
152+
153+
@Override
154+
public void start() {
155+
in.start();
156+
}
157+
158+
@Override
159+
public void stop() {
160+
in.stop();
161+
}
162+
163+
@Override
164+
public void close() {
165+
in.close();
166+
}
167+
}
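
`FilterRepository` forwards every call to the wrapped instance, so a subclass only needs to override the operations it wants to change. A minimal sketch of that pattern, assuming a hypothetical two-method `Repository` interface in place of the real one; `TaggingRepository` and `InMemoryRepository` are illustrative names, not classes from this commit:

```java
final class FilterRepositorySketch {

    // Hypothetical, reduced stand-in for the Repository interface.
    interface Repository {
        boolean isReadOnly();

        String snapshotShard(String shardName);
    }

    // Same idea as FilterRepository in the diff: delegate everything.
    static class FilterRepository implements Repository {
        private final Repository in;

        FilterRepository(Repository in) {
            this.in = in;
        }

        @Override
        public boolean isReadOnly() {
            return in.isReadOnly();
        }

        @Override
        public String snapshotShard(String shardName) {
            return in.snapshotShard(shardName);
        }
    }

    // Overrides only the call it wants to change and inherits the rest.
    static final class TaggingRepository extends FilterRepository {
        TaggingRepository(Repository delegate) {
            super(delegate);
        }

        @Override
        public String snapshotShard(String shardName) {
            return super.snapshotShard(shardName + " (_source only)");
        }
    }

    // Trivial backend used only for this demonstration.
    static final class InMemoryRepository implements Repository {
        @Override
        public boolean isReadOnly() {
            return false;
        }

        @Override
        public String snapshotShard(String shardName) {
            return "stored " + shardName;
        }
    }

    public static void main(String[] args) {
        Repository repository = new TaggingRepository(new InMemoryRepository());
        System.out.println(repository.snapshotShard("shard-0"));
        System.out.println(repository.isReadOnly());
    }
}
```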

server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -398,7 +398,7 @@ private Repository createRepository(RepositoryMetaData repositoryMetaData) {
398398
"repository type [" + repositoryMetaData.type() + "] does not exist");
399399
}
400400
try {
401-
Repository repository = factory.create(repositoryMetaData);
401+
Repository repository = factory.create(repositoryMetaData, typesRegistry::get);
402402
repository.start();
403403
return repository;
404404
} catch (Exception e) {
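
Passing `typesRegistry::get` gives a factory the means to resolve another repository type by name, which a wrapping repository needs in order to build its delegate. The following is a sketch of how such a factory might use the lookup, not the implementation from this commit; all types are hypothetical, simplified stand-ins, and only the `delegate_type` and `location` setting names come from the documentation example in this change:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class DelegateFactorySketch {

    // Hypothetical stand-ins for Repository and RepositoryMetaData.
    interface Repository {
        String describe();
    }

    static final class RepositoryMetaData {
        final String type;
        final Map<String, String> settings;

        RepositoryMetaData(String type, Map<String, String> settings) {
            this.type = type;
            this.settings = settings;
        }
    }

    // Same shape as Repository.Factory after this change.
    interface Factory {
        Repository create(RepositoryMetaData metadata) throws Exception;

        default Repository create(RepositoryMetaData metadata, Function<String, Factory> typeLookup) throws Exception {
            return create(metadata);
        }
    }

    // A wrapping factory that needs the lookup to find its delegate's factory.
    static final class WrappingFactory implements Factory {
        @Override
        public Repository create(RepositoryMetaData metadata) {
            throw new UnsupportedOperationException("a type lookup is required");
        }

        @Override
        public Repository create(RepositoryMetaData metadata, Function<String, Factory> typeLookup) throws Exception {
            String delegateType = metadata.settings.get("delegate_type");
            Factory delegateFactory = typeLookup.apply(delegateType);
            if (delegateFactory == null) {
                throw new IllegalArgumentException("unknown delegate type [" + delegateType + "]");
            }
            Repository delegate = delegateFactory.create(new RepositoryMetaData(delegateType, metadata.settings), typeLookup);
            return () -> "source(" + delegate.describe() + ")";
        }
    }

    // Builds a registry with two types and creates a wrapped repository.
    static Repository demo() {
        Map<String, Factory> typesRegistry = new HashMap<>();
        typesRegistry.put("fs", metadata -> () -> "fs at " + metadata.settings.get("location"));
        typesRegistry.put("source", new WrappingFactory());

        Map<String, String> settings = new HashMap<>();
        settings.put("delegate_type", "fs");
        settings.put("location", "my_backup_location");
        RepositoryMetaData metadata = new RepositoryMetaData("source", settings);
        try {
            return typesRegistry.get(metadata.type).create(metadata, typesRegistry::get);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().describe());
    }
}
```

Existing factories keep working unchanged because the two-argument `create` defaults to the one-argument form.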

server/src/main/java/org/elasticsearch/repositories/Repository.java

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -28,13 +28,15 @@
2828
import org.elasticsearch.index.shard.IndexShard;
2929
import org.elasticsearch.index.shard.ShardId;
3030
import org.elasticsearch.index.snapshots.IndexShardSnapshotStatus;
31+
import org.elasticsearch.index.store.Store;
3132
import org.elasticsearch.indices.recovery.RecoveryState;
3233
import org.elasticsearch.snapshots.SnapshotId;
3334
import org.elasticsearch.snapshots.SnapshotInfo;
3435
import org.elasticsearch.snapshots.SnapshotShardFailure;
3536

3637
import java.io.IOException;
3738
import java.util.List;
39+
import java.util.function.Function;
3840

3941
/**
4042
* An interface for interacting with a repository in snapshot and restore.
@@ -46,7 +48,7 @@
4648
* <ul>
4749
* <li>Master calls {@link #initializeSnapshot(SnapshotId, List, org.elasticsearch.cluster.metadata.MetaData)}
4850
* with list of indices that will be included into the snapshot</li>
49-
* <li>Data nodes call {@link Repository#snapshotShard(IndexShard, SnapshotId, IndexId, IndexCommit, IndexShardSnapshotStatus)}
51+
* <li>Data nodes call {@link Repository#snapshotShard(IndexShard, Store, SnapshotId, IndexId, IndexCommit, IndexShardSnapshotStatus)}
5052
* for each shard</li>
5153
* <li>When all shard calls return master calls {@link #finalizeSnapshot} with possible list of failures</li>
5254
* </ul>
@@ -63,6 +65,10 @@ interface Factory {
6365
* @param metadata metadata for the repository including name and settings
6466
*/
6567
Repository create(RepositoryMetaData metadata) throws Exception;
68+
69+
default Repository create(RepositoryMetaData metaData, Function<String, Repository.Factory> typeLookup) throws Exception {
70+
return create(metaData);
71+
}
6672
}
6773

6874
/**
@@ -188,14 +194,15 @@ SnapshotInfo finalizeSnapshot(SnapshotId snapshotId, List<IndexId> indices, long
188194
* <p>
189195
* As snapshot process progresses, implementation of this method should update {@link IndexShardSnapshotStatus} object and check
190196
* {@link IndexShardSnapshotStatus#isAborted()} to see if the snapshot process should be aborted.
191-
*
192197
* @param shard shard to be snapshotted
198+
* @param store store to be snapshotted
193199
* @param snapshotId snapshot id
194200
* @param indexId id for the index being snapshotted
195201
* @param snapshotIndexCommit commit point
196202
* @param snapshotStatus snapshot status
197203
*/
198-
void snapshotShard(IndexShard shard, SnapshotId snapshotId, IndexId indexId, IndexCommit snapshotIndexCommit, IndexShardSnapshotStatus snapshotStatus);
204+
void snapshotShard(IndexShard shard, Store store, SnapshotId snapshotId, IndexId indexId, IndexCommit snapshotIndexCommit,
205+
IndexShardSnapshotStatus snapshotStatus);
199206

200207
/**
201208
* Restores snapshot of the shard.
