-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Repository GCS plugin new client library #30168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Repository GCS plugin new client library #30168
Conversation
Pinging @elastic/es-distributed |
Thank you Tanguy for the prompt and diligent review! I have addressed it all. These rogue permissions have been removed:
Apologies, I don't recall what the errors were when they came to be. I haven't however adopted the
Just to check I understand, generally speaking we should avoid writing classes inside foreign packages and use reflection to gain access to hidden members. However, this special case ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @albertzaharovits. I did more research today and I think we can get rid of the magic deletes.size < 3
trick using an appropriate solution that does not require hacks in plugin's code to address integration tests issues. I also did many comments I'd like to see addressed.
There are still two things that worry me and I'd like to do more research about them:
writeBlobMultipart()
loads the whole blob in heap before uploading it. This will produce many allocations and we might want to use the deprecated method based on stream or maybe use BytesStreamOutput... wdyt?- the resumable/multipart code in the fixture looks a bit messy, I'd like to see if we can use UTF8 or plain bytes arrays and mutualize code with the batch request handler
Concerning the MockStorage point I've added my opinion and I don't remember any similar situation before. I don't have a strong feeling so I asked for core/infra team opinion :)
Again, I think this is going in the right direction and thanks a lot for all you work!
" \"token_uri\": \"http://${googleCloudStorageFixture.addressAndPort}/o/oauth2/token\",\n" + | ||
' "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",\n' + | ||
' "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/integration_test%40appspot.gserviceaccount.com"\n' + | ||
' "client_id": "123456789101112130594"\n' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the fixture URL is not required anymore, we can remove the dependsOn googleCloudStorageFixture
if (bucket == null) { | ||
return newError(RestStatus.NOT_FOUND, "bucket not found"); | ||
} | ||
if (bucket.objects.put(objectName, EMPTY_BYTE) == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make EMPTY_BYTE final?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, there is some unused code in this class (ctor, handle() method), can you clean this while you are here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done and done
* KEwE3bU4TuyetBgQIghmUw | ||
* --__END_OF_PART__-- | ||
*/ | ||
String boundary = "__END_OF_PART__"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we reuse/mutualize the code with the batch request handler here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pondered it too, but I think they should be kept separate. The "theoretical" reason is that for the multipart/related
content type the entities are tied together. The first entity contains metadata about the data from the second entity. There is an implied order, and the number of entities is fixed. For the multipart/mixed
all entities are homogeneous.
The "practical" reason is that parsing these is intricate and grouping them together will make them more so.
The only common denominator I see is parsing the headers of each part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's keep as it is.
} | ||
} | ||
// Read line by line ?both? parts of the multipart | ||
try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStreamBody, StandardCharsets.ISO_8859_1))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll look at this because I'm a bit surprised.
// Rewrite Object | ||
// | ||
// https://cloud.google.com/storage/docs/json_api/v1/objects/rewrite | ||
handlers.insert("POST " + endpoint + "/storage/v1/b/{srcBucket}/o/{src}/rewriteTo/b/{destBucket}/o/{dest}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is identical to the copyTo
handler, maybe we could use a path pattern like:
/storage/v1/b/{srcBucket}/o/{src}/{action}/b/{destBucket}/o/{dest}
And use the value of {action}
to differentiate the response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done so although, it looks like copyTo
has fallen into disgrace in favor of rewriteTo
but I cannot guarantee it's not used anymore, or that it will not be used again in a minor version update of the lib.
return null; | ||
}); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this cleanup!
* @return a Client instance that can be used to manage Storage objects | ||
*/ | ||
public Storage createClient(final String clientName) throws Exception { | ||
public Storage createClient(final String clientName) throws GeneralSecurityException, IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are OK to only declare Exception instead of declaring all the possible thrown exceptions if they do not require any special handling by the callers.
Storage.Builder storage = new Storage.Builder(transport, JacksonFactory.getDefaultInstance(), requestInitializer); | ||
if (Strings.hasLength(clientSettings.getApplicationName())) { | ||
storage.setApplicationName(clientSettings.getApplicationName()); | ||
final NetHttpTransport netHttpTransport = GoogleNetHttpTransport.newTrustedTransport(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could workaround the Google SDK issue concerning non propagated custom endpoint by doing something like:
final HttpTransport httpTransport = createHttpTransport(clientSettings.getHost());
and add a static method like the following which is in charge of checking and rewriting URLS that do not point to the right endpoint:
static HttpTransport createHttpTransport(final String endpoint) throws Exception {
final NetHttpTransport.Builder builder = new NetHttpTransport.Builder();
builder.trustCertificates(GoogleUtils.getCertificateTrustStore());
if (Strings.hasLength(endpoint)) {
final URL endpointUrl = URI.create(endpoint).toURL();
builder.setConnectionFactory(new DefaultConnectionFactory() {
@Override
public HttpURLConnection openConnection(final URL originalUrl) throws IOException {
URL url = originalUrl;
if ((originalUrl.getHost().equals(endpointUrl.getHost()) == false)
|| (originalUrl.getPort() != endpointUrl.getPort())
|| (originalUrl.getProtocol().equals(endpointUrl.getProtocol())) == false) {
url = new URL(endpointUrl.getProtocol(), endpointUrl.getHost(), endpointUrl.getPort(), originalUrl.getFile());
}
return super.openConnection(url);
}
});
}
return builder.build();
}
This way, production code is not forced to have a magic trick like deletes.size() < 3
. Also, production code should almost always use the default connection factory provided by the Google SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have coded smth similar. I have not copied it verbatim because URL#getFile
is a forbiddenApi and a URI conversion was required.
private static HttpTransport createHttpTransport(final String endpoint) throws Exception {
final NetHttpTransport.Builder builder = new NetHttpTransport.Builder();
// requires java.lang.RuntimePermission "setFactory"
builder.trustCertificates(GoogleUtils.getCertificateTrustStore());
if (Strings.hasLength(endpoint)) {
final URL endpointUrl = URI.create(endpoint).toURL();
builder.setConnectionFactory(new DefaultConnectionFactory() {
@Override
public HttpURLConnection openConnection(final URL originalUrl) throws IOException {
// test if the URL is built correctly, ie following the `host` setting
if (originalUrl.getHost().equals(endpointUrl.getHost()) && originalUrl.getPort() == endpointUrl.getPort()
&& originalUrl.getProtocol().equals(endpointUrl.getProtocol())) {
return super.openConnection(originalUrl);
}
// override connection URLs because some don't follow the config. See
// https://github.com/GoogleCloudPlatform/google-cloud-java/issues/3254 and
// https://github.com/GoogleCloudPlatform/google-cloud-java/issues/3255
URI originalUri;
try {
originalUri = originalUrl.toURI();
} catch (final URISyntaxException e) {
throw new RuntimeException(e);
}
String overridePath = "/";
if (originalUri.getRawPath() != null) {
overridePath = originalUri.getRawPath();
}
if (originalUri.getRawQuery() != null) {
overridePath += "?" + originalUri.getRawQuery();
}
return super.openConnection(
new URL(endpointUrl.getProtocol(), endpointUrl.getHost(), endpointUrl.getPort(), overridePath));
}
});
}
return builder.build();
}
} | ||
}); | ||
|
||
// Insert Object (upload) | ||
// | ||
// https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload | ||
handlers.insert("PUT " + endpoint + "/upload/storage/v1/b/{bucket}/o", (params, headers, body) -> { | ||
String objectId = params.get("upload_id"); | ||
final String objectId = params.get("upload_id"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this handler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is still required when resumable upload is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
optionMethod.setAccessible(true); | ||
assert StorageRpc.Option.PREFIX.equals(optionMethod.invoke(prefixFilter)) : "Only the prefix filter is mocked"; | ||
final Method valueMethod = BlobListOption.class.getSuperclass().getDeclaredMethod("getValue"); | ||
valueMethod.setAccessible(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@albertzaharovits and I talked about this, and we figured out that this is not the only place where the permission is needed but the main code requires it too.
I still think that we should not add a specific permission for tests in this case because it can hide other places where a security exception could be thrown where in fact we should work to remove this permission in the main plugin. I also think that in this case Mockito makes things harder to maintain and to read where we can instead use POJOs to mock the Google Storage but with this option we have to pay the price of adding two utility classed in foreign packages like com.google.storage
. It's like to choose between two bad options but this is an interesting point and I'd like to know @elastic/es-core-infra opinion
@tlrx I have profiled the _snapshot API with w/ & w/o the multipart update trick. I have attached the flight recordings (profile data, in unpretentious lingo). Curious if you spot something on the profiling data! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @albertzaharovits for going through this change!
I left some minor comments I'd like to be addressed before merging but they do not require another review.
docs/plugins/repository-gcs.asciidoc
Outdated
`project_id`:: | ||
|
||
The Google Cloud project id. This will be automatically infered from the credentials file but | ||
can be specified explicitly. For example, it can be used to switch between projects when the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: extra space
plugins/repository-gcs/build.gradle
Outdated
@@ -17,43 +17,214 @@ | |||
* under the License. | |||
*/ | |||
|
|||
import org.elasticsearch.gradle.test.AntFixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you remove this import? It seems to be unused
if (bucket == null) { | ||
return newError(RestStatus.NOT_FOUND, "bucket not found"); | ||
} | ||
if (bucket.objects.put(objectName, EMPTY_BYTE) == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change this to putIfAbsent()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch
* KEwE3bU4TuyetBgQIghmUw | ||
* --__END_OF_PART__-- | ||
*/ | ||
String boundary = "__END_OF_PART__"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's keep as it is.
} | ||
}); | ||
|
||
// Insert Object (upload) | ||
// | ||
// https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload | ||
handlers.insert("PUT " + endpoint + "/upload/storage/v1/b/{bucket}/o", (params, headers, body) -> { | ||
String objectId = params.get("upload_id"); | ||
final String objectId = params.get("upload_id"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
||
/** An override for the Storage endpoint to connect to. */ | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use the same doc formatting as the other settings
@@ -66,14 +80,24 @@ | |||
|
|||
/** Name used by the client when it uses the Google Cloud JSON API. **/ | |||
static final Setting.AffixSetting<String> APPLICATION_NAME_SETTING = Setting.affixKeySetting(PREFIX, "application_name", | |||
key -> new Setting<>(key, "repository-gcs", s -> s, Setting.Property.NodeScope)); | |||
key -> new Setting<>(key, "repository-gcs", Function.identity(), Setting.Property.NodeScope, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you revert this identation?
/** | ||
* The Storage endpoint URL the client should talk to, or null string to use the | ||
* default | ||
**/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: same doc formating?
return endpoint; | ||
} | ||
|
||
public String getProjectId() { | ||
return Strings.hasLength(projectId) ? projectId : (credential != null ? credential.getProjectId() : null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch
@@ -121,6 +136,14 @@ private static GoogleCloudStorageClientSettings randomClient(final String client | |||
endpoint = ENDPOINT_SETTING.getDefault(Settings.EMPTY); | |||
} | |||
|
|||
String projectId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the endpoint
setting should be randomized between valid URLs like http://www.elastic.co
, https://www.elastic.co:443
, https://www.googleapis.com
etc
…ngs-to-true * elastic/master: [DOCS] Restores 7.0.0 release notes and highlights Remove assert statements from field caps documentation. (elastic#30601) Repository GCS plugin new client library (elastic#30168) HLRestClient: Follow-up for put index template api (elastic#30592) Unmute IndexUpgradeIT tests [DOCS] Remove references to changelog and to highlights Side-step pending deletes check (elastic#30571) [DOCS] Remove references to removed changelog Revert "Mute ML upgrade test (elastic#30458)" [ML] Adjust BWC version following backport of elastic#30125 [Docs] Improve section detailing translog usage (elastic#30573) [Tests] Relax allowed delta in extended_stats aggregation (elastic#30569) [ML] Reverse engineer Grok patterns from categorization results (elastic#30125) Update build file due to doc file rename Remove the changelog (elastic#30593) Fix issue with finishing handshake in ssl driver (elastic#30580) Fail if reading from closed KeyStoreWrapper (elastic#30394) Silence IndexUpgradeIT test failures. (elastic#30430)
This does away with the deprecated `com.google.api-client:google-api-client:1.23` and replaces it with `com.google.cloud:google-cloud-storage:1.28.0`. It also changes security permissions for the repository-gcs plugin. Closes: #29259
I looked at the profiling results from @albertzaharovits and I didn't see any strong differences except that the tests using only the resumable upload method show more socket I/O, which is expected because this method is supposed to execute more requests. I didn't see any differences in total execution time when snapshotting a large index using resumable and resumable+multipart using the "new" client. I also compared the snapshot execution time with the "old" client and this "new" client and didn't see any performance change. Of course, we should have real benchmarks for this. |
This does away with the
com.google.api-client:google-api-client:1.23
and replaces it withcom.google.cloud:google-cloud-storage:1.27.0
. It also changes security permissions for this plugin.The API of the new library is indeed easier to use, but the configuration requires uncovering internals which most likely will change in the new versions (because the internals are actually the old library that will go away). This should be easy to change when it will happen, although some configuration options might go away (eg
applicationName
orprojectId
)DISCLAIMER: This is not turnkey ready. The REST integ test
./gradlew :plugins:repository-gcs:integTest
does not work so I have tested the plugin manually with my Google Cloud account. I have tested it using the Service Account JSON file.The reason the integration test does not work is that the Service Account JSON file, that we contrive to direct the cluster to the mock cloud, does not allow to override the
oauth/token
endpoint, which is the first endpoint called in the flow. So, even if the subsequent API calls should be compatible with our existing mock, to obtain the token, there is a call to 'googleapis' which obviously gets 'unauthorized'. I don't think this is intended, just a sloppiness in the transition.Another worry is that the Application Default Credentials used to authenticate when ES runs on a VM provisioned inside google cloud, reads files from various places (ENV vars, user home, ... other heuristics), so a snap-in
security-permission.policy
is not an option. We could mimic what we do with our config dir, but in this case we would have to duplicate their credentials search logic.Closes: #29259