Service Accounts - Fleet integration #70724

ywangd · 2021-03-23T12:00:16Z

This PR implements the rest of the pieces needed for Fleet integration, including:

* Getting the service account role descriptor for authorization
* API for creating a service account token and storing it in the security index
* API for listing the tokens of a service account
* New named privilege for managing service accounts
* Mandating HTTP TLS for both service account authentication and the service-account-related APIs
* Tests for API-key-related operations using a service account

PS: I am happy to break this PR into smaller ones according to the dev roadmap if it helps. I raised a single PR because (1) it is helpful to leverage the CI utilities of a PR; (2) the meat of the changes is only a few classes, e.g. IndexServiceAccountsTokenStore. Many of the request/response/action classes are largely boilerplate.

elasticmachine · 2021-03-23T12:00:20Z

Pinging @elastic/es-security (Team:Security)

tvernum

I had a quick look

...va/org/elasticsearch/xpack/core/security/action/service/CreateServiceAccountTokenAction.java

...a/org/elasticsearch/xpack/core/security/action/service/CreateServiceAccountTokenRequest.java

...main/java/org/elasticsearch/xpack/security/authc/service/IndexServiceAccountsTokenStore.java

tvernum · 2021-03-26T03:05:51Z

...main/java/org/elasticsearch/xpack/security/authc/service/IndexServiceAccountsTokenStore.java

+            .field("name", serviceAccountToken.getQualifiedName())
+            .field("creation_time", clock.instant().toEpochMilli())
+            .field("enabled", true)
+            .startObject("creator")


tvernum · 2021-03-26T03:08:37Z

...main/java/org/elasticsearch/xpack/security/authc/service/IndexServiceAccountsTokenStore.java

+            .filter(QueryBuilders.termQuery("doc_type", SERVICE_ACCOUNT_TOKEN_DOC_TYPE))
+            .must(QueryBuilders.prefixQuery("name", accountId.asPrincipal()));
+        final SearchRequest request = client.prepareSearch(SECURITY_MAIN_ALIAS)
+            .setScroll(DEFAULT_KEEPALIVE_SETTING.get(getSettings()))


Do we really want to scroll here, or should we place a limit on how many tokens there can be for a service account?

I was going to say we should replace this, and a few other places, with the current best practice of "PIT + search_after". I need to check the details, but I think a PIT is shareable across different requests. Given that we need it in other places, e.g. users and application privileges, it is not much overhead to have it here as well. It can be a follow-up PR, and I can raise an issue for it.
It is also possible to cap the number of tokens as you suggested, but I am not convinced, because:

It feels arbitrary. Yes, it is unlikely that a service account has many service tokens, but nothing inherently rules it out either.

It introduces extra code complexity, e.g. you need to query the current count before creating a new token. There can also be complicated concurrency issues, e.g. two requests arriving at the same time, one trying to create a token and the other trying to delete one. We can potentially solve these issues, but the effort is not really justified.

The problem is that scrolls are a limited resource, and we overuse them.
What we really want is semantics that only create a scroll (or PIT etc) if there were more documents than could be returned in a single request, but I don't know of anything like that.

A regular PIT could work, but 99% of the time it's heavier than we need.

I wonder whether this could be a feature request for the search team. It cannot be done perfectly unless it is tackled directly at the shard level. Anyway, I think this is worth its own separate task and can wait until the current one is over?

Yes, I think it can wait. But I know @albertzaharovits ran into some cases recently where security was over-using scrolls to the detriment of the overall cluster, so I'm extra conscious of them now.

After discussion, I raised an issue for replacing scroll with PIT + search_after #71032
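
For readers unfamiliar with the search_after idea discussed in this thread: each page request carries the sort value of the last hit from the previous page, and a short page signals that the listing is complete. The following is a minimal in-memory Java sketch of that cursor logic only. It is not Elasticsearch client code and it leaves out the PIT part; the class name, method names and token names are hypothetical, and it assumes the sort values are unique.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory illustration of search_after style paging; hypothetical, not Elasticsearch client code. */
final class SearchAfterSketch {

    /** Returns up to pageSize names that sort strictly after the cursor (null means start from the beginning). */
    static List<String> fetchPage(List<String> sortedNames, String after, int pageSize) {
        List<String> page = new ArrayList<>();
        for (String name : sortedNames) {
            if (after == null || name.compareTo(after) > 0) {
                page.add(name);
                if (page.size() == pageSize) {
                    break;
                }
            }
        }
        return page;
    }

    /** Collects all names page by page; the cursor is the last sort value seen so far. */
    static List<String> fetchAll(List<String> sortedNames, int pageSize) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        while (true) {
            List<String> page = fetchPage(sortedNames, cursor, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // a short (or empty) page means the listing is complete
            }
            cursor = page.get(page.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Hypothetical token names, already sorted.
        List<String> tokens = List.of("token-a", "token-b", "token-c");
        System.out.println(fetchAll(tokens, 2));
    }
}
```

In this shape, a service account whose tokens fit in one page costs a single request and needs no follow-up call, which is the common case described above.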

ywangd · 2021-03-26T11:42:23Z

Added more comprehensive tests. This PR is now ready for review.

tvernum · 2021-03-29T05:11:54Z

...va/org/elasticsearch/xpack/core/security/action/service/GetServiceAccountTokensResponse.java

+        builder.endObject().field("file_tokens").startObject();
+        for (TokenInfo info : tokenInfosBySource.getOrDefault(TokenInfo.TokenSource.FILE, List.of())) {
+            info.toXContent(builder, params);
+        }
+        builder.endObject().endObject();


Do we want to output this section if there aren't any? I would have assumed the "file_tokens" field would only exist if there were file tokens.

I think it is more explicit to show an empty section than to omit it entirely, similar to how the roles field behaves in the GetUser response.

...elasticsearch/xpack/core/security/action/service/CreateServiceAccountTokenResponseTests.java

...rg/elasticsearch/xpack/core/security/action/service/GetServiceAccountTokensRequestTests.java

.../main/java/org/elasticsearch/xpack/security/authc/service/FileServiceAccountsTokenStore.java

tvernum · 2021-03-29T05:30:16Z

.../org/elasticsearch/xpack/core/security/action/service/CreateServiceAccountTokenResponse.java

+
+public class CreateServiceAccountTokenResponse extends ActionResponse implements ToXContentObject {
+
+    private final boolean created;


Do we need this field now?

I thought about removing it but decided to keep it in case we need to update a service account token in the future. But that is a weak reason and I am happy to remove it.

tvernum · 2021-03-29T05:42:57Z

...main/java/org/elasticsearch/xpack/security/authc/service/IndexServiceAccountsTokenStore.java

+            .filter(QueryBuilders.termQuery("doc_type", SERVICE_ACCOUNT_TOKEN_DOC_TYPE))
+            .must(QueryBuilders.prefixQuery("name", accountId.asPrincipal()));
+        final SearchRequest request = client.prepareSearch(SECURITY_MAIN_ALIAS)
+            .setScroll(DEFAULT_KEEPALIVE_SETTING.get(getSettings()))


The problem is that scrolls are a limited resource, and we overuse them.
What we really want is semantics that only create a scroll (or PIT etc) if there were more documents than could be returned in a single request, but I don't know of anything like that.

A regular PIT could work, but 99% of the time it's heavier than we need.

tvernum · 2021-03-29T05:48:02Z

...n/security/src/main/java/org/elasticsearch/xpack/security/authc/support/TlsRuntimeCheck.java

+    public TlsRuntimeCheck(Settings settings) {
+        this.settings = settings;
+        this.httpTlsEnabled = XPackSettings.HTTP_SSL_ENABLED.get(settings);
+        this.transportTlsEnabled = XPackSettings.TRANSPORT_SSL_ENABLED.get(settings);


I don't think we want to check TransportTLS here.

It requires transport TLS even when in dev mode

It means you need to set transport.ssl.enabled even if using single node discovery.

It duplicates checks that are done elsewhere

Since security already has checks for transport TLS (outside of trial / dev mode) I don't think it's worth duplicating them here.

Makes sense. I'll remove it and make it check only the HTTP interface.

When you say "security already has checks for transport TLS", do you mean TLSLicenseBootstrapCheck which errors out if the cluster is in non-trial production? Also I assume the transport TLS check in TransportNodesReloadSecureSettingsAction is redundant as well?

tvernum · 2021-03-29T05:49:16Z

...n/security/src/main/java/org/elasticsearch/xpack/security/authc/support/TlsRuntimeCheck.java

+    }
+
+    public void checkTlsThenExecute(Consumer<Exception> exceptionConsumer, String featureName, Runnable andThen) {
+        if (false == httpTlsEnabled || false == transportTlsEnabled) {


This enforces TLS even when bound to localhost, and if bootstrap checks are off.
I think that's too extreme.

I was thinking of handling the localhost check etc. as a separate task, i.e. the check would be strict at first and relaxed before beta or GA. But we can also tackle it now.

Using the current token service bootstrap check as a reference, I am implementing the logic as follows:

If xpack.security.http.ssl.enabled=true, all good.

If security is disabled, skip check (this should not happen, but added for completeness)

If all binding addresses are local, skip check.

If discovery type is "single-node", skip check.

Otherwise throw error.
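
The rules above can be read as a single predicate. Here is a minimal Java sketch, assuming a hypothetical class name and hypothetical parameter names; it is not the code from this PR, which resolves these values from settings and from the bound transport addresses.

```java
/** Hypothetical sketch of the decision order listed above; not the PR's actual class. */
final class HttpTlsDecisionSketch {

    /**
     * Returns true when a service account feature should be rejected
     * because HTTP TLS is required but not enabled.
     */
    static boolean shouldReject(boolean httpTlsEnabled,
                                boolean securityEnabled,
                                boolean allBoundAddressesLocal,
                                boolean singleNodeDiscovery) {
        if (httpTlsEnabled) {
            return false; // xpack.security.http.ssl.enabled=true: all good
        }
        if (false == securityEnabled) {
            return false; // should not happen, kept for completeness
        }
        if (allBoundAddressesLocal) {
            return false; // only reachable from localhost
        }
        if (singleNodeDiscovery) {
            return false; // discovery type is "single-node"
        }
        return true; // otherwise: error
    }

    public static void main(String[] args) {
        // Security on, HTTP TLS off, non-local binding, multi-node discovery.
        System.out.println(shouldReject(false, true, false, false));
    }
}
```

Writing the skip conditions as early returns keeps the order of the rules visible, so relaxing or tightening one rule later touches a single branch.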

...rg/elasticsearch/xpack/security/rest/action/service/RestCreateServiceAccountTokenAction.java

ywangd · 2021-03-30T10:04:05Z

...curity/src/main/java/org/elasticsearch/xpack/security/authc/support/HttpTlsRuntimeCheck.java

+    public void checkTlsThenExecute(Consumer<Exception> exceptionConsumer, String featureName, Runnable andThen) {
+        // If security is enabled, but TLS is not enabled for the HTTP interface
+        if (securityEnabled && false == httpTlsEnabled) {
+            if (false == initialized.get()) {


An alternative to this initialized AtomicBoolean is org.elasticsearch.common.MemoizedSupplier. But I think AtomicBoolean could be slightly more performant.
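
To make the two options concrete, here is a minimal Java sketch of both shapes: a flag-guarded lazy value and a hand-written memoizing supplier. Both classes are hypothetical stand-ins written for illustration; the second one is not org.elasticsearch.common.MemoizedSupplier, and the sketch makes no claim about which variant is faster.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch contrasting two ways to compute a value once, on first use. */
final class LazyInitSketch {

    /** Variant 1: an AtomicBoolean flag guards a lazily computed boolean. */
    static final class FlagGuarded {
        private final AtomicBoolean initialized = new AtomicBoolean(false);
        private final BooleanSupplier compute;
        private volatile boolean enforce;

        FlagGuarded(BooleanSupplier compute) {
            this.compute = compute;
        }

        boolean get() {
            if (false == initialized.get()) {
                synchronized (this) {
                    if (false == initialized.get()) {
                        enforce = compute.getAsBoolean();
                        initialized.set(true);
                    }
                }
            }
            return enforce;
        }
    }

    /** Variant 2: a generic memoizing supplier that caches the delegate's first result. */
    static final class Memoized<T> implements Supplier<T> {
        private final Supplier<T> delegate;
        private volatile boolean computed;
        private T value;

        Memoized(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public T get() {
            if (false == computed) {
                synchronized (this) {
                    if (false == computed) {
                        value = delegate.get();
                        computed = true; // volatile write publishes value to later readers
                    }
                }
            }
            return value;
        }
    }

    public static void main(String[] args) {
        FlagGuarded guarded = new FlagGuarded(() -> true);
        Memoized<Boolean> memoized = new Memoized<>(() -> true);
        System.out.println(guarded.get() + " " + memoized.get());
    }
}
```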

ywangd · 2021-03-31T00:49:12Z

@tvernum The rename of elastic/fleet-server service account touched quite a few files and bumped the number of changed files (with rather trivial changes). I hope it does not disturb the review process too much. Please let me know if you have any concerns. Thanks!

tvernum · 2021-03-31T01:15:21Z

I think it would have been better not to do that in this PR - but please don't change it now.

This PR is already too big and touches too many things - if it simply added the index based tokens it could have been merged by now but it's been stalled due to unrelated parts like the TLS requirement.

Omnibus PRs almost always slow things down rather than speeding them up.

tvernum · 2021-03-31T01:42:44Z

...curity/src/main/java/org/elasticsearch/xpack/security/authc/support/HttpTlsRuntimeCheck.java

+                final boolean boundToLocal = Arrays.stream(transport.boundAddress().boundAddresses())
+                    .allMatch(b -> b.address().getAddress().isLoopbackAddress())
+                    && transport.boundAddress().publishAddress().address().getAddress().isLoopbackAddress();
+                this.enforce = false == boundToLocal && false == singleNodeDiscovery;


I hate that we need to duplicate this check all over the place, but I don't think we should solve it here.

You are absolutely right about this. It took me a long time to come up with this solution because:

TransportService is not available in createComponents, and I realised much later that I can hook into the TransportFactory logic to get direct access to the Transport.

Having it done once and passed everywhere else requires quite a lot of changes, potentially including changing the createComponents method signature, which would be very ugly for this PR.

I will raise an issue for it.

Raised #71091

tvernum · 2021-03-31T01:45:25Z

...curity/src/main/java/org/elasticsearch/xpack/security/authc/support/HttpTlsRuntimeCheck.java

+                exceptionConsumer.accept(new ElasticsearchException(message.getFormattedMessage()));
+                return;
+            }
+        }


Can you, @jkakavas and I have a chat about this logic, and whether it's really what we want, before we GA service accounts?

I'm conscious that we're making up the detail of the requirements on the fly, and we ought to be more intentional about it.

Yes, we should definitely chat about it. With the security-on-by-default project, we need to have a written-down spec for how it should behave at runtime.

tvernum · 2021-03-31T01:46:58Z

.../org/elasticsearch/xpack/security/rest/action/service/RestGetServiceAccountTokensAction.java

+    @Override
+    public List<Route> routes() {
+        return List.of(
+            new Route(GET, "/_security/service/{namespace}/{service}/credential")


I know you previously mentioned that you preferred /service_account/ and {service-name}.
Do you still? Should we discuss?

I still slightly prefer /service_account/ over /service/ ({service} is fine). The main reason is that I don't know whether we will need a more literal use of /service/ in the future, e.g. configuring some service within the security domain, say PUT /_security/service/builtin_directory/config. But it may not be an issue at all, because that could simply be PUT /_security/builtin_directory/config. So it is a minor personal preference and I am happy to stick to the current design.

* Service Accounts - Initial bootstrap plumbing for essential classes (#70391)

This PR is the initial effort to add the essential classes for service accounts and lay the foundation for future work. The classes are wired in places but are not yet used. The actual credential store implementation is also intentionally left out. It is a good first commit that does not bring in too many changes.

* Service Accounts - New CLI tool for managing file tokens (#70454)

This is the second PR for service accounts. It adds a new CLI tool, elasticsearch-service-tokens, to manage file tokens. The file tokens are stored in the service_tokens file under the config directory. Of the planned create, remove and list sub-commands, this PR implements only create, since it is the most important one. The other two sub-commands will be handled in separate PRs.

* Service Accounts - Authentication with file tokens (#70543)

This is the third PR for service accounts. It adds support for authentication with file tokens. It also adds a cache for performance, so that the expensive pbkdf2 hashing does not have to be performed on every request. Adding a cache comes with its own housekeeping work around invalidation. This PR ensures that the cache is invalidated when the underlying token file changes. It does not implement APIs for active invalidation; that will be handled in a separate PR after the API token is in place.

* [Test] Service Account - fix test assumption

* [Test] Service Accounts - handle token names with leading hyphen (#70983)

The CLI tool needs an option terminator (--) before arguments that begin with a hyphen. Otherwise it errors out with a "not recognized option" message. A service account token name can begin with a hyphen, so we need to use -- in that case. An example of an equivalent command line is ./bin/elasticsearch-service-tokens create elastic/fleet -- -lead-with-hyphen.

* Service Accounts - Fleet integration (#70724)

This PR implements the rest of the pieces needed for Fleet integration, including:

* Getting the service account role descriptor for authorization
* API for creating a service account token and storing it in the security index
* API for listing the tokens of a service account
* New named privilege for managing service accounts
* Mandating HTTP TLS for both service account authentication and the service-account-related APIs
* Tests for API-key-related operations using a service account

* [Test] Service Accounts - Remove colon from invalid token name generator (#71099)

The colon character is interpreted as the separator between the token name and the token secret. So a token name that contains a colon is, in theory, invalid. But the parser takes only the part before the colon as the token name and thus considers it a valid token name. Subsequent authentication will still fail. For tests, however, this generates a different exception and fails the expectation. For simplicity, this PR removes the colon character from the set used to generate invalid token names.

* Fix for 7.x quirks

ywangd added the >enhancement, :Security/Authentication, v8.0.0 and v7.13.0 labels Mar 23, 2021

ywangd requested a review from tvernum March 23, 2021 12:00

elasticmachine added the Team:Security label Mar 23, 2021

ywangd marked this pull request as draft March 23, 2021 12:35

Service Account - fleet integration

92aea02

ywangd force-pushed the service-account-fleet-integration branch from 7f708f5 to 92aea02 Compare March 26, 2021 02:06

tvernum reviewed Mar 26, 2021

ywangd and others added 5 commits March 26, 2021 15:07

[Test] Service Account - fix test assumption

d989ae8

Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/action/service/CreateServiceAccountTokenAction.java

f47a90a

Co-authored-by: Tim Vernum <[email protected]>

Fix tests and remove legacy code

ebd871a

Add more tests

a32bf58

Complete tests

96af0c6

ywangd marked this pull request as ready for review March 26, 2021 11:41

ywangd added 2 commits March 26, 2021 23:00

revert unwanted changes

a0c26ad

fix test

9252cbf

ywangd requested a review from tvernum March 26, 2021 12:48

ywangd added 2 commits March 28, 2021 13:18

extract tls runtime checker

b396716

Merge remote-tracking branch 'origin/master' into service-account-fleet-integration

5b5a3f1

tvernum reviewed Mar 29, 2021

ywangd and others added 3 commits March 29, 2021 17:14

Apply suggestions from code review

f2ce8f7

Co-authored-by: Tim Vernum <[email protected]>

address feedback

5b14f19

improve http tls runtime check

f4af54d

ywangd requested a review from tvernum March 29, 2021 12:08

forbidden API

e1c7580

ywangd commented Mar 30, 2021

Rename elastic/fleet to elastic/fleet-server

a0fa91c

Merge remote-tracking branch 'origin/master' into service-account-fleet-integration

731de18

tvernum approved these changes Mar 31, 2021

ywangd mentioned this pull request Mar 31, 2021

Consolidate the logic for whether enforcing bootstrap or runtime checks #71091

Closed

ywangd merged commit bceb5fb into elastic:master Mar 31, 2021

ywangd added the backport pending label Mar 31, 2021

ywangd mentioned this pull request Apr 8, 2021

Service Accounts - features required for Fleet integration #71514

Merged

ywangd removed the backport pending label Apr 9, 2021

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
