Skip to content

Repository GCS plugin new client library #30168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 60 commits into from
May 15, 2018

Conversation

albertzaharovits
Copy link
Contributor

This does away with the com.google.api-client:google-api-client:1.23 and replaces it with com.google.cloud:google-cloud-storage:1.27.0 . It also changes security permissions for this plugin.

The API of the new library is indeed easier to use, but the configuration requires uncovering internals which most likely will change in the new versions (because the internals are actually the old library that will go away). This should be easy to change when it will happen, although some configuration options might go away (eg applicationName or projectId)

DISCLAIMER: This is not turnkey ready. The REST integ test ./gradlew :plugins:repository-gcs:integTest does not work so I have tested the plugin manually with my Google Cloud account. I have tested it using the Service Account JSON file.

The reason the integration test does not work is that the Service Account JSON file, that we contrive to direct the cluster to the mock cloud, does not allow to override the oauth/token endpoint, which is the first endpoint called in the flow. So, even if the subsequent API calls should be compatible with our existing mock, to obtain the token, there is a call to 'googleapis' which obviously gets 'unauthorized'. I don't think this is intended, just a sloppiness in the transition.

Another worry is that the Application Default Credentials used to authenticate when ES runs on a VM provisioned inside google cloud, reads files from various places (ENV vars, user home, ... other heuristics), so a snap-in security-permission.policy is not an option. We could mimic what we do with our config dir, but in this case we would have to duplicate their credentials search logic.

Closes: #29259

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@albertzaharovits
Copy link
Contributor Author

Thank you Tanguy for the prompt and diligent review! I have addressed it all.

These rogue permissions have been removed:

permission java.lang.RuntimePermission "accessClassInPackage.sun.misc";
permission java.net.URLPermission "https://www.googleapis.com/-", "*:*";
permission java.net.URLPermission "https://accounts.google.com/-", "*:*";

Apologies, I don't recall what the errors were when they came to be.

I haven't however adopted the MockStorage you propose:

I think we can get rid of the specific permission by using a different implementation of MockStorage > that does not use mockito and does not require the additionnal permission.
I implemented it here: tlrx/elasticsearch@1f40ea5, I think this is how MockStorage should be.

Just to check I understand, generally speaking we should avoid writing classes inside foreign packages and use reflection to gain access to hidden members. However, this special case (com.google.cloud.storage.StorageRpcOptionUtils) was mandated by the goal to get rid of the suppressAccessChecks permission, assuming the permission was required only by the test code. Is this correct? If the permission is required by the src/main code, and therefore cannot be removed, are classes written outside the org.elasticsearch considered a "code smell"?

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @albertzaharovits. I did more research today and I think we can get rid of the magic deletes.size < 3 trick using an appropriate solution that does not require hacks in plugin's code to address integration tests issues. I also did many comments I'd like to see addressed.

There are still two things that worry me and I'd like to do more research about them:

  • writeBlobMultipart() loads the whole blob in heap before uploading it. This will produce many allocations and we might want to use the deprecated method based on stream or maybe use BytesStreamOutput... wdyt?
  • the resumable/multipart code in the fixture looks a bit messy, I'd like to see if we can use UTF8 or plain bytes arrays and mutualize code with the batch request handler

Concerning the MockStorage point I've added my opinion and I don't remember any similar situation before. I don't have a strong feeling so I asked for core/infra team opinion :)

Again, I think this is going in the right direction and thanks a lot for all you work!

" \"token_uri\": \"http://${googleCloudStorageFixture.addressAndPort}/o/oauth2/token\",\n" +
' "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",\n' +
' "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/integration_test%40appspot.gserviceaccount.com"\n' +
' "client_id": "123456789101112130594"\n' +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the fixture URL is not required anymore, we can remove the dependsOn googleCloudStorageFixture

if (bucket == null) {
return newError(RestStatus.NOT_FOUND, "bucket not found");
}
if (bucket.objects.put(objectName, EMPTY_BYTE) == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make EMPTY_BYTE final?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there is some unused code in this class (ctor, handle() method), can you clean this while you are here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and done

* KEwE3bU4TuyetBgQIghmUw
* --__END_OF_PART__--
*/
String boundary = "__END_OF_PART__";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we reuse/mutualize the code with the batch request handler here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pondered it too, but I think they should be kept separate. The "theoretical" reason is that for the multipart/related content type the entities are tied together. The first entity contains metadata about the data from the second entity. There is an implied order, and the number of entities is fixed. For the multipart/mixed all entities are homogeneous.
The "practical" reason is that parsing these is intricate and grouping them together will make them more so.

The only common denominator I see is parsing the headers of each part.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep as it is.

}
}
// Read line by line ?both? parts of the multipart
try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStreamBody, StandardCharsets.ISO_8859_1))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I'll look at this because I'm a bit surprised.

// Rewrite Object
//
// https://cloud.google.com/storage/docs/json_api/v1/objects/rewrite
handlers.insert("POST " + endpoint + "/storage/v1/b/{srcBucket}/o/{src}/rewriteTo/b/{destBucket}/o/{dest}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is identical to the copyTo handler, maybe we could use a path pattern like:
/storage/v1/b/{srcBucket}/o/{src}/{action}/b/{destBucket}/o/{dest}

And use the value of {action} to differentiate the response?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done so although, it looks like copyTo has fallen into disgrace in favor of rewriteTo but I cannot guarantee it's not used anymore, or that it will not be used again in a minor version update of the lib.

return null;
});
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this cleanup!

* @return a Client instance that can be used to manage Storage objects
*/
public Storage createClient(final String clientName) throws Exception {
public Storage createClient(final String clientName) throws GeneralSecurityException, IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are OK to only declare Exception instead of declaring all the possible thrown exceptions if they do not require any special handling by the callers.

Storage.Builder storage = new Storage.Builder(transport, JacksonFactory.getDefaultInstance(), requestInitializer);
if (Strings.hasLength(clientSettings.getApplicationName())) {
storage.setApplicationName(clientSettings.getApplicationName());
final NetHttpTransport netHttpTransport = GoogleNetHttpTransport.newTrustedTransport();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could workaround the Google SDK issue concerning non propagated custom endpoint by doing something like:

final HttpTransport httpTransport = createHttpTransport(clientSettings.getHost());

and add a static method like the following which is in charge of checking and rewriting URLS that do not point to the right endpoint:

static HttpTransport createHttpTransport(final String endpoint) throws Exception {
        final NetHttpTransport.Builder builder = new NetHttpTransport.Builder();
        builder.trustCertificates(GoogleUtils.getCertificateTrustStore());
        if (Strings.hasLength(endpoint)) {
            final URL endpointUrl = URI.create(endpoint).toURL();
            builder.setConnectionFactory(new DefaultConnectionFactory() {
                @Override
                public HttpURLConnection openConnection(final URL originalUrl) throws IOException {
                    URL url = originalUrl;
                    if ((originalUrl.getHost().equals(endpointUrl.getHost()) == false)
                        || (originalUrl.getPort() != endpointUrl.getPort())
                        || (originalUrl.getProtocol().equals(endpointUrl.getProtocol())) == false) {
                        url = new URL(endpointUrl.getProtocol(), endpointUrl.getHost(), endpointUrl.getPort(), originalUrl.getFile());
                    }
                    return super.openConnection(url);
                }
            });
        }
        return builder.build();
    }

This way, production code is not forced to have a magic trick like deletes.size() < 3. Also, production code should almost always use the default connection factory provided by the Google SDK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have coded smth similar. I have not copied it verbatim because URL#getFile is a forbiddenApi and a URI conversion was required.

    private static HttpTransport createHttpTransport(final String endpoint) throws Exception {
        final NetHttpTransport.Builder builder = new NetHttpTransport.Builder();
        // requires java.lang.RuntimePermission "setFactory"
        builder.trustCertificates(GoogleUtils.getCertificateTrustStore());
        if (Strings.hasLength(endpoint)) {
            final URL endpointUrl = URI.create(endpoint).toURL();
            builder.setConnectionFactory(new DefaultConnectionFactory() {
                @Override
                public HttpURLConnection openConnection(final URL originalUrl) throws IOException {
                    // test if the URL is built correctly, ie following the `host` setting
                    if (originalUrl.getHost().equals(endpointUrl.getHost()) && originalUrl.getPort() == endpointUrl.getPort()
                            && originalUrl.getProtocol().equals(endpointUrl.getProtocol())) {
                        return super.openConnection(originalUrl);
                    }
                    // override connection URLs because some don't follow the config. See
                    // https://github.com/GoogleCloudPlatform/google-cloud-java/issues/3254 and
                    // https://github.com/GoogleCloudPlatform/google-cloud-java/issues/3255
                    URI originalUri;
                    try {
                        originalUri = originalUrl.toURI();
                    } catch (final URISyntaxException e) {
                        throw new RuntimeException(e);
                    }
                    String overridePath = "/";
                    if (originalUri.getRawPath() != null) {
                        overridePath = originalUri.getRawPath();
                    }
                    if (originalUri.getRawQuery() != null) {
                        overridePath += "?" + originalUri.getRawQuery();
                    }
                    return super.openConnection(
                            new URL(endpointUrl.getProtocol(), endpointUrl.getHost(), endpointUrl.getPort(), overridePath));
                }
            });
        }
        return builder.build();
    }

}
});

// Insert Object (upload)
//
// https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
handlers.insert("PUT " + endpoint + "/upload/storage/v1/b/{bucket}/o", (params, headers, body) -> {
String objectId = params.get("upload_id");
final String objectId = params.get("upload_id");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this handler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is still required when resumable upload is used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

optionMethod.setAccessible(true);
assert StorageRpc.Option.PREFIX.equals(optionMethod.invoke(prefixFilter)) : "Only the prefix filter is mocked";
final Method valueMethod = BlobListOption.class.getSuperclass().getDeclaredMethod("getValue");
valueMethod.setAccessible(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@albertzaharovits and I talked about this, and we figured out that this is not the only place where the permission is needed but the main code requires it too.

I still think that we should not add a specific permission for tests in this case because it can hide other places where a security exception could be thrown where in fact we should work to remove this permission in the main plugin. I also think that in this case Mockito makes things harder to maintain and to read where we can instead use POJOs to mock the Google Storage but with this option we have to pay the price of adding two utility classed in foreign packages like com.google.storage. It's like to choose between two bad options but this is an interesting point and I'd like to know @elastic/es-core-infra opinion

@albertzaharovits
Copy link
Contributor Author

@tlrx I have profiled the _snapshot API with w/ & w/o the multipart update trick. I have attached the flight recordings (profile data, in unpretentious lingo).
resumable & mutipart:
https://drive.google.com/file/d/1KZc4Ua2r8oAttrSMvso7oTD8EirPV1_1/view?usp=sharing
https://drive.google.com/file/d/1wDuTTnqtDYdpZ1TJr3j1sWTaT9BOMnDQ/view?usp=sharing
resumable only:
https://drive.google.com/file/d/1Va_iTVwFrP99Iy96pMGITJrfwxGdxP2a/view?usp=sharing
https://drive.google.com/file/d/1krSQPOiJnpLc4b3bTbXXYHQ3Q3GxtA8n/view?usp=sharing
The VM is looping PUT _snapshot and DELETE _snapshot for the geonames index (from the rally track) with true GCS credentials. I cannot eye any difference between thee two methods. It might be that the bottleneck is outside the VM.
I have collected the timings too, but they seem un-illustrative, they have variance and it does not appear to be any statistical difference between the snapshot times. I can share them if the profile data is not enough.

Curious if you spot something on the profiling data!

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @albertzaharovits for going through this change!

I left some minor comments I'd like to be addressed before merging but they do not require another review.

`project_id`::

The Google Cloud project id. This will be automatically infered from the credentials file but
can be specified explicitly. For example, it can be used to switch between projects when the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra space

@@ -17,43 +17,214 @@
* under the License.
*/

import org.elasticsearch.gradle.test.AntFixture
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can you remove this import? It seems to be unused

if (bucket == null) {
return newError(RestStatus.NOT_FOUND, "bucket not found");
}
if (bucket.objects.put(objectName, EMPTY_BYTE) == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to putIfAbsent()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

* KEwE3bU4TuyetBgQIghmUw
* --__END_OF_PART__--
*/
String boundary = "__END_OF_PART__";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep as it is.

}
});

// Insert Object (upload)
//
// https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
handlers.insert("PUT " + endpoint + "/upload/storage/v1/b/{bucket}/o", (params, headers, body) -> {
String objectId = params.get("upload_id");
final String objectId = params.get("upload_id");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


/** An override for the Storage endpoint to connect to. */
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use the same doc formatting as the other settings

@@ -66,14 +80,24 @@

/** Name used by the client when it uses the Google Cloud JSON API. **/
static final Setting.AffixSetting<String> APPLICATION_NAME_SETTING = Setting.affixKeySetting(PREFIX, "application_name",
key -> new Setting<>(key, "repository-gcs", s -> s, Setting.Property.NodeScope));
key -> new Setting<>(key, "repository-gcs", Function.identity(), Setting.Property.NodeScope,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you revert this identation?

/**
* The Storage endpoint URL the client should talk to, or null string to use the
* default
**/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: same doc formating?

return endpoint;
}

public String getProjectId() {
return Strings.hasLength(projectId) ? projectId : (credential != null ? credential.getProjectId() : null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

@@ -121,6 +136,14 @@ private static GoogleCloudStorageClientSettings randomClient(final String client
endpoint = ENDPOINT_SETTING.getDefault(Settings.EMPTY);
}

String projectId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the endpoint setting should be randomized between valid URLs like http://www.elastic.co, https://www.elastic.co:443, https://www.googleapis.com etc

@albertzaharovits albertzaharovits merged commit 801973f into elastic:master May 15, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 15, 2018
…ngs-to-true

* elastic/master:
  [DOCS] Restores 7.0.0 release notes and highlights
  Remove assert statements from field caps documentation. (elastic#30601)
  Repository GCS plugin new client library (elastic#30168)
  HLRestClient: Follow-up for put index template api (elastic#30592)
  Unmute IndexUpgradeIT tests
  [DOCS] Remove references to changelog and to highlights
  Side-step pending deletes check (elastic#30571)
  [DOCS] Remove references to removed changelog
  Revert "Mute ML upgrade test (elastic#30458)"
  [ML] Adjust BWC version following backport of elastic#30125
  [Docs] Improve section detailing translog usage (elastic#30573)
  [Tests] Relax allowed delta in extended_stats aggregation (elastic#30569)
  [ML] Reverse engineer Grok patterns from categorization results (elastic#30125)
  Update build file due to doc file rename
  Remove the changelog (elastic#30593)
  Fix issue with finishing handshake in ssl driver (elastic#30580)
  Fail if reading from closed KeyStoreWrapper (elastic#30394)
  Silence IndexUpgradeIT test failures. (elastic#30430)
albertzaharovits added a commit that referenced this pull request May 15, 2018
This does away with the deprecated `com.google.api-client:google-api-client:1.23`
and replaces it with `com.google.cloud:google-cloud-storage:1.28.0`.
It also changes security permissions for the repository-gcs plugin.

Closes: #29259
@tlrx
Copy link
Member

tlrx commented May 16, 2018

I looked at the profiling results from @albertzaharovits and I didn't see any strong differences except that the tests using only the resumable upload method show more socket I/O, which is expected because this method is supposed to execute more requests. I didn't see any differences in total execution time when snapshotting a large index using resumable and resumable+multipart using the "new" client.

I also compared the snapshot execution time with the "old" client and this "new" client and didn't see any performance change.

Of course, we should have real benchmarks for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants