
Commit 60e0b46

Remove Artificially Low Chunk Size Limits from GCS + Azure Blob Stores (#59279)
Removing these limits, as they cause unnecessarily many objects in the blob stores. We do not have to worry about BwC for this change since we do not support any third-party implementations of Azure or GCS. Also, since there is no valid reason to set a maximum chunk size other than the default at this point, the documentation of that maximum (which was incorrect in the case of Azure to begin with) has been removed from the docs. Closes #56018
1 parent 5d6a2dd commit 60e0b46
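
Note: the change itself only widens the bounds on the repository-level `chunk_size` setting; no upload logic changes. As a minimal sketch of how such a bounded byte-size setting behaves, assuming Elasticsearch's `Setting.byteSizeSetting(key, default, min, max)` overload (the class name and standalone wiring are illustrative, not code from this commit):

import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;

public class ChunkSizeBoundsExample {
    private static final ByteSizeValue MIN = new ByteSizeValue(1, ByteSizeUnit.BYTES);
    private static final ByteSizeValue MAX = new ByteSizeValue(5, ByteSizeUnit.TB);

    // Bounded byte-size setting: after this commit the default equals the provider's blob-size cap.
    static final Setting<ByteSizeValue> CHUNK_SIZE = Setting.byteSizeSetting("chunk_size", MAX, MIN, MAX);

    public static void main(String[] args) {
        // Within bounds: parses normally.
        System.out.println(CHUNK_SIZE.get(Settings.builder().put("chunk_size", "10MB").build()));
        // Above the cap: throws IllegalArgumentException with the message asserted in the
        // tests below, e.g. "failed to parse value [6tb] for setting [chunk_size], must be <= [5tb]".
        CHUNK_SIZE.get(Settings.builder().put("chunk_size", "6tb").build());
    }
}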

6 files changed, +29 −13 lines changed

docs/plugins/repository-azure.asciidoc

Lines changed: 8 additions & 4 deletions

@@ -30,7 +30,7 @@ bin/elasticsearch-keystore add azure.client.secondary.sas_token
 ----------------------------------------------------------------
 
 For more information about these settings, see
-<<repository-azure-client-settings>>.
+<<repository-azure-client-settings>>.
 
 [IMPORTANT]
 .Supported Azure Storage Account types
@@ -99,7 +99,7 @@ stored in the keystore are marked as "secure"; the other settings belong in the
 
 `account` ({ref}/secure-settings.html[Secure], {ref}/secure-settings.html#reloadable-secure-settings[reloadable])::
 The Azure account name, which is used by the repository's internal Azure client.
-
+
 `endpoint_suffix`::
 The Azure endpoint suffix to connect to. The default value is
 `core.windows.net`.
@@ -164,9 +164,13 @@ The Azure repository supports following settings:
 
 `chunk_size`::
 
-Big files can be broken down into chunks during snapshotting if needed.
+Big files can be broken down into multiple smaller blobs in the blob store during snapshotting.
+It is not recommended to change this value from its default unless there is an explicit reason for limiting the
+size of blobs in the repository. Setting a value lower than the default can result in an increased number of API
+calls to the Azure blob store during snapshot create as well as restore operations compared to using the default
+value and thus make both operations slower as well as more costly.
 Specify the chunk size as a value and unit, for example:
-`10MB`, `5KB`, `500B`. Defaults to `64MB` (64MB max).
+`10MB`, `5KB`, `500B`. Defaults to the maximum size of a blob in the Azure blob store which is `5TB`.
 
 `compress`::
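
For reference, the human-readable values accepted by `chunk_size` are parsed with Elasticsearch's `ByteSizeValue`, which uses binary units. A small sketch (the class name is made up for illustration):

import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;

public class ChunkSizeParseExample {
    public static void main(String[] args) {
        // The second argument is only used to label parse errors.
        ByteSizeValue chunk = ByteSizeValue.parseBytesSizeValue("10MB", "chunk_size");
        System.out.println(chunk.getBytes()); // 10485760 (binary megabytes)
        System.out.println(new ByteSizeValue(5, ByteSizeUnit.TB)); // 5tb, the new default
    }
}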

docs/plugins/repository-gcs.asciidoc

Lines changed: 6 additions & 2 deletions

@@ -228,9 +228,13 @@ The following settings are supported:
 
 `chunk_size`::
 
-Big files can be broken down into chunks during snapshotting if needed.
+Big files can be broken down into multiple smaller blobs in the blob store during snapshotting.
+It is not recommended to change this value from its default unless there is an explicit reason for limiting the
+size of blobs in the repository. Setting a value lower than the default can result in an increased number of API
+calls to the Google Cloud Storage Service during snapshot create as well as restore operations compared to using
+the default value and thus make both operations slower as well as more costly.
 Specify the chunk size as a value and unit, for example:
-`10MB` or `5KB`. Defaults to `100MB`, which is the maximum permitted.
+`10MB`, `5KB`, `500B`. Defaults to the maximum size of a blob in the Google Cloud Storage Service which is `5TB`.
 
 `compress`::
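
To make the cost argument concrete, a back-of-the-envelope sketch (the numbers are illustrative, not taken from the commit) of how many blobs, and therefore at least how many upload calls, a single large file produces under the old 100MB GCS cap versus the new 5TB default:

public class ChunkCountExample {
    public static void main(String[] args) {
        long fileBytes = 1024L * 1024 * 1024 * 1024;  // a single 1 TiB file in the snapshot
        long oldCap = 100L * 1024 * 1024;             // previous 100MB limit
        long newCap = 5L * 1024 * 1024 * 1024 * 1024; // new 5TB default
        System.out.println((fileBytes + oldCap - 1) / oldCap); // 10486 blobs
        System.out.println((fileBytes + newCap - 1) / newCap); // 1 blob
    }
}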

plugins/repository-azure/src/main/java/org/elasticsearch/repositories/azure/AzureStorageService.java

Lines changed: 4 additions & 2 deletions

@@ -20,6 +20,7 @@
 package org.elasticsearch.repositories.azure;
 
 import com.microsoft.azure.storage.CloudStorageAccount;
+import com.microsoft.azure.storage.Constants;
 import com.microsoft.azure.storage.OperationContext;
 import com.microsoft.azure.storage.RetryPolicy;
 import com.microsoft.azure.storage.RetryPolicyFactory;
@@ -43,10 +44,11 @@
 public class AzureStorageService {
 
     public static final ByteSizeValue MIN_CHUNK_SIZE = new ByteSizeValue(1, ByteSizeUnit.BYTES);
+
     /**
-     * {@link com.microsoft.azure.storage.blob.BlobConstants#MAX_SINGLE_UPLOAD_BLOB_SIZE_IN_BYTES}
+     * Maximum allowed blob size in Azure blob store.
      */
-    public static final ByteSizeValue MAX_CHUNK_SIZE = new ByteSizeValue(256, ByteSizeUnit.MB);
+    public static final ByteSizeValue MAX_CHUNK_SIZE = new ByteSizeValue(Constants.MAX_BLOB_SIZE, ByteSizeUnit.BYTES);
 
     // 'package' for testing
     volatile Map<String, AzureStorageSettings> storageSettings = emptyMap();
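
A design choice worth noting: the Azure cap is read from the SDK's `Constants.MAX_BLOB_SIZE` rather than hard-coded, so the exact byte count need not render as a round `5tb`; that is presumably why the test below compares against `MAX_CHUNK_SIZE.getStringRep()` instead of a string literal. A minimal sketch (not part of the diff; the class name is invented):

import com.microsoft.azure.storage.Constants;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;

public class AzureCapExample {
    public static void main(String[] args) {
        // Render the SDK-provided cap exactly as the setting validator reports it.
        ByteSizeValue max = new ByteSizeValue(Constants.MAX_BLOB_SIZE, ByteSizeUnit.BYTES);
        System.out.println(max.getStringRep());
    }
}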

plugins/repository-azure/src/test/java/org/elasticsearch/repositories/azure/AzureRepositorySettingsTests.java

Lines changed: 3 additions & 2 deletions

@@ -123,8 +123,9 @@ public void testChunkSize() {
 
         // greater than max chunk size not allowed
         e = expectThrows(IllegalArgumentException.class, () ->
-            azureRepository(Settings.builder().put("chunk_size", "257mb").build()));
-        assertEquals("failed to parse value [257mb] for setting [chunk_size], must be <= [256mb]", e.getMessage());
+            azureRepository(Settings.builder().put("chunk_size", "6tb").build()));
+        assertEquals("failed to parse value [6tb] for setting [chunk_size], must be <= ["
+            + AzureStorageService.MAX_CHUNK_SIZE.getStringRep() + "]", e.getMessage());
     }
 
 }

plugins/repository-gcs/src/main/java/org/elasticsearch/repositories/gcs/GoogleCloudStorageRepository.java

Lines changed: 6 additions & 1 deletion

@@ -44,7 +44,12 @@ class GoogleCloudStorageRepository extends BlobStoreRepository {
 
     // package private for testing
     static final ByteSizeValue MIN_CHUNK_SIZE = new ByteSizeValue(1, ByteSizeUnit.BYTES);
-    static final ByteSizeValue MAX_CHUNK_SIZE = new ByteSizeValue(100, ByteSizeUnit.MB);
+
+    /**
+     * Maximum allowed object size in GCS.
+     * @see <a href="https://cloud.google.com/storage/quotas#objects">GCS documentation</a> for details.
+     */
+    static final ByteSizeValue MAX_CHUNK_SIZE = new ByteSizeValue(5, ByteSizeUnit.TB);
 
     static final String TYPE = "gcs";

plugins/repository-gcs/src/test/java/org/elasticsearch/repositories/gcs/GoogleCloudStorageBlobStoreRepositoryTests.java

Lines changed: 2 additions & 2 deletions

@@ -169,10 +169,10 @@ public void testChunkSize() {
         // greater than max chunk size not allowed
         e = expectThrows(IllegalArgumentException.class, () -> {
             final RepositoryMetadata repoMetadata = new RepositoryMetadata("repo", GoogleCloudStorageRepository.TYPE,
-                Settings.builder().put("chunk_size", "101mb").build());
+                Settings.builder().put("chunk_size", "6tb").build());
             GoogleCloudStorageRepository.getSetting(GoogleCloudStorageRepository.CHUNK_SIZE, repoMetadata);
         });
-        assertEquals("failed to parse value [101mb] for setting [chunk_size], must be <= [100mb]", e.getMessage());
+        assertEquals("failed to parse value [6tb] for setting [chunk_size], must be <= [5tb]", e.getMessage());
     }
 
     public void testWriteReadLarge() throws IOException {
