Ensure authz operation overrides transient authz headers #61621

albertzaharovits · 2020-08-27T09:27:04Z

AuthorizationService#authorize uses the thread context to carry the result of the authorization as transient headers.
The listener argument to the authorize method must necessarily observe the header values.

But we've learned that the TransportService carries over the transient headers from the thread context to the locally executed action handlers (unlike the remotely executed action handlers). This becomes problematic when the authorization is invoked multiple times, e.g. because SecurityActionFilter#apply is effectively invoked in the same thread context when a parent action uses the TransportService to execute the child action locally.

The desired outcome is that the authorization transient headers of the child action supersede the ones of the parent action.
This PR is the first step in this direction; it removes a specific transient header (AuthorizationServiceField#INDICES_PERMISSIONS_KEY) before calling AuthorizationService#authorize which would fill in the header with the new value.
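
To make the failure mode concrete, here is a minimal sketch that does not use any Elasticsearch classes. `ToyTransients`, `StaleHeaderDemo` and the `authorize` helper are hypothetical stand-ins; the only thing taken from this thread is the idea that the header is written with "put if non existing" semantics and that a locally executed child action shares the parent's context.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the transient headers of a thread context (hypothetical, not the Elasticsearch class).
final class ToyTransients {
    private final Map<String, Object> transients = new HashMap<>();

    // "Put if non existing": an already present value silently wins over the new one.
    void putIfNonExisting(String key, Object value) {
        transients.putIfAbsent(key, value);
    }

    void remove(String key) {
        transients.remove(key);
    }

    Object get(String key) {
        return transients.get(key);
    }
}

final class StaleHeaderDemo {
    static final String INDICES_PERMISSIONS = "_indices_permissions";

    // Hypothetical authorize step: computes a result and records it as a transient header.
    static void authorize(ToyTransients ctx, String action, boolean removeFirst) {
        if (removeFirst) {
            // The first-step fix described above: drop the old value before authorizing.
            ctx.remove(INDICES_PERMISSIONS);
        }
        ctx.putIfNonExisting(INDICES_PERMISSIONS, "permissions-for-" + action);
    }

    // The parent and the locally executed child share one context.
    static Object runParentThenChild(boolean removeFirst) {
        ToyTransients ctx = new ToyTransients();
        authorize(ctx, "parent", removeFirst);
        authorize(ctx, "child", removeFirst);
        return ctx.get(INDICES_PERMISSIONS);
    }

    public static void main(String[] args) {
        System.out.println(runParentThenChild(false)); // the parent's value survives
        System.out.println(runParentThenChild(true));  // the child's value supersedes it
    }
}
```

Without the removal, the child action ends up observing the parent's permissions, which is the symptom this PR targets.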

Co-authored-by: Tim Vernum [email protected]

elasticmachine · 2020-08-27T15:12:44Z

Pinging @elastic/es-security (:Security/Authorization)

ywangd · 2020-08-31T06:59:39Z

I still need to look closer at the tests. But I think the changes overall look good to me.

I had a slight concern about performance, since IndicesAccessControl is now always discarded before every authorization attempt. But then I realised we are already doing this, just incorrectly: IndicesAccessControl is always calculated for every authorization attempt (so no performance difference), and today the new one is thrown away, which is the source of the recent issues. The changes are hence really beneficial, since nothing is discarded silently. 👍

ywangd

LGTM. I left some suggestions. None of them is critical enough to prevent approval.

A bit more discussion that is not directly related to this PR: I am now a bit skeptical about other usages of putTransientIfNonExisting. I cannot point to any concrete problem. But the fact that we discard a computed result silently is a bit worrying. Maybe we could add debug logging for when it happens.

.../src/test/java/org/elasticsearch/xpack/security/action/filter/SecurityActionFilterTests.java

...security/src/main/java/org/elasticsearch/xpack/security/transport/ServerTransportFilter.java

albertzaharovits · 2020-09-01T17:20:50Z

A bit more discussion that is not directly related to this PR: I am now a bit skeptical about other usages of putTransientIfNonExisting. I cannot point to any concrete problem. But the fact that we discard a computed result silently is a bit worrying. Maybe we could add debug logging for when it happens.

I agree, this is not strictly correct. The approach in this PR is not something that I wholeheartedly endorse. I think it's subtly wrong that only a single authz header takes the value from the latest authz operation, while the other authz headers do not; taken together as a whole, the authz headers do not correctly describe the authorisation outcome.

My personal preference would be to stash all the authorisation headers (i.e. including ORIGINATING_ACTION_KEY and AUTHORIZATION_INFO_KEY) using some form of SecurityContext#stashContext .

But I'm waiting to see what @jaymode thinks is the best option, as it's now gotten into more of a thread context issue (i.e. the lack of a suitable API) than a security issue.

ywangd · 2020-09-01T23:47:05Z

it's now gotten into more of a thread context issue (i.e. the lack of a suitable API) than a security issue.

Yes. The simple remove method puts the responsibility on the caller to back up the original value. Since SecurityActionFilter already stores the threadContext, it is not an issue for now. But it could be prone to programming errors in future usages. It calls for a method that stores the current threadContext and optionally removes certain headers in one go.

albertzaharovits · 2020-09-02T22:34:49Z

This PR is now completely changed by introducing the SecurityContext#stashAuthorizationContext method, that stashes the three transient authorization headers. I've confirmed this approach with @jaymode .
Thanks for reviewing @ywangd , but this now obviously needs a new pass.

jaymode

Do you think we can add a method to ThreadContext that also accepts a list of transient names to filter? So something like newStoredContext(boolean, List<String>) so that we can drop the public removeTransient method?

albertzaharovits · 2020-09-08T22:08:58Z

As discussed with @jaymode , I've redone it such that, upon restore, ALL the headers are reverted (with the possible exception of response headers). The new method is a newStoredContext variant that in addition permits clearing up the specified transient headers.

The previous implementation, with the partial stash, was confusing.
The current implementation is simpler but now authorization reverts non-authorization headers as well (but not for the listener). This is only a theoretical difference, as there's nothing left to do (which would require a certain context) after authorization completed and its listener has been called.
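
The stash-and-clear idea can be sketched with a toy context. Everything below is a hypothetical model: `ToyThreadContext` is not the real `ThreadContext`, it leaves out response headers and the boolean argument of the real method, and its `newStoredContext(Collection<String>)` only imitates the behaviour described in this comment (clear the named transients now, revert everything on close).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model; only transient headers are modelled.
final class ToyThreadContext {
    private Map<String, Object> transients = new HashMap<>();

    interface StoredContext extends AutoCloseable {
        @Override
        void close(); // narrowed so that callers need no catch block
    }

    // Like a strict put: refuses to overwrite an existing transient.
    void putTransient(String key, Object value) {
        if (transients.putIfAbsent(key, value) != null) {
            throw new IllegalArgumentException("value for key [" + key + "] already present");
        }
    }

    Object getTransient(String key) {
        return transients.get(key);
    }

    // Stash the current state, clear the given transients, and restore ALL of them on close.
    StoredContext newStoredContext(Collection<String> transientsToClear) {
        final Map<String, Object> original = transients;
        Map<String, Object> working = new HashMap<>(original);
        working.keySet().removeAll(transientsToClear);
        transients = working;
        return () -> {
            transients = original;
        };
    }
}

final class StashDemo {
    static String run() {
        ToyThreadContext ctx = new ToyThreadContext();
        ctx.putTransient("_indices_permissions", "parent-permissions");
        StringBuilder seen = new StringBuilder();
        try (ToyThreadContext.StoredContext ignore =
                 ctx.newStoredContext(Arrays.asList("_indices_permissions"))) {
            // Would throw without the clearing step, because the parent's value is present.
            ctx.putTransient("_indices_permissions", "child-permissions");
            seen.append(ctx.getTransient("_indices_permissions"));
        }
        // After close, the parent's value is visible again.
        seen.append(" / ").append(ctx.getTransient("_indices_permissions"));
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the whole map is swapped back on close, the caller does not have to remember which keys it touched, which is the simplification over the earlier partial stash.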

jaymode

LGTM. It would be good to also have @ywangd or @tvernum give this a review too

ywangd

I have some theoretical concerns. It is possible that they are complete nonsense. But given how important ThreadContext is, I'll just let them out :)

It's mainly about the two additional authorization headers, ORIGINATING_ACTION and AUTHORIZATION_INFO. Compared to INDICES_PERMISSIONS, they are less specific to the particular request that is being executed.

INDICES_PERMISSIONS is tightly coupled to the current request. In fact, it is erroneous to reuse it across requests. This was the problem and we are trying to fix it. But I am not sure the same thing can be said for ORIGINATING_ACTION and AUTHORIZATION_INFO.

For ORIGINATING_ACTION, it seems to make sense that the same value would apply to multiple subsequent requests, i.e. one request causes another. Its value is also checked in AuthorizationUtils#shouldReplaceUserWithSystem, and I am not sure whether the change would lead to any subtle but important differences there.

For AUTHORIZATION_INFO, the change here makes it mutable across requests. Currently, when the parent action is authorized, the same Role object is applied to any child actions. This is true even when relevant roles are modified in the middle of the requests. With this PR, it is possible that the Role object will be different for child actions. If a child action then gets denied, I am not sure whether this would lead to any inconsistency. With that being said, since transients are cleared across nodes, it may not be an issue at all.

Another theoretical thing is about multiple TransportInterceptor or ActionFilter instances. Currently, any extra interceptors/filters run with the parent transients. With this PR, they will run with child transients. If these extra interceptors/filters somehow rely on parent transients to work, they will have issues. Again, this is purely theoretical and is not something to be worried about today, since no other interceptors/filters check security info.

server/src/test/java/org/elasticsearch/common/util/concurrent/ThreadContextTests.java

ywangd · 2020-09-10T04:15:21Z

...ugin/security/src/main/java/org/elasticsearch/xpack/security/authz/AuthorizationService.java

+         */
+        try (ThreadContext.StoredContext ignore = threadContext.newStoredContext(false, AuthorizationServiceField.ALL_AUTHORIZATION_KEYS)) {
+            // prior to doing any authorization lets set the originating action in the context only
+            threadContext.putTransient(AuthorizationServiceField.ORIGINATING_ACTION_KEY, action);


Is this the right semantic? I understand this is the reason why RestSqlSecurityIT needs to be updated. Technical details aside, if a parent action invokes a child action, should the "originating action" still be the parent action? The change here makes it the child action. If it's always the child action, why does it need to be called the "originating" action?

It's just a naming thing. On the obverse, wouldn't it be odd to store only the parent action name, no matter the child nesting level, but only if the request doesn't cross the node boundary?

albertzaharovits · 2020-09-10T12:05:49Z

Thanks for reviewing @ywangd .

For ORIGINATING_ACTION, it seems to make sense that the same value would apply to multiple subsequent requests, i.e. one request causes another. Its value is also checked in AuthorizationUtils#shouldReplaceUserWithSystem, and I am not sure whether the change would lead to any subtle but important differences there.

AuthorizationUtils#shouldReplaceUserWithSystem is the only place where we make use of this header (apart from the audit logs). The use case for that condition is when a context is marked as a system context. My assessment is that there would be no difference in behaviour, but it's hard to tell for sure.

For AUTHORIZATION_INFO, the change here makes it mutable across requests. Currently, when the parent action is authorized, the same Role object is applied to any child actions. This is true even when relevant roles are modified in the middle of the requests. With this PR, it is possible that the Role object will be different for child actions. If a child action then gets denied, I am not sure whether this would lead to any inconsistency. With that being said, since transients are cleared across nodes, it may not be an issue at all.

Yes this is not an issue, IMO.

albertzaharovits · 2020-09-10T12:11:26Z

Another theoretical thing is about multiple TransportInterceptor or ActionFilter instances. Currently, any extra interceptors/filters run with the parent transients. With this PR, they will run with child transients. If these extra interceptors/filters somehow rely on parent transients to work, they will have issues. Again, this is purely theoretical and is not something to be worried about today, since no other interceptors/filters check security info.

This is only theoretical, agreed. Whenever we make use of these transient headers in another place, we unfortunately need to ensure that the value is set correctly; this is the drawback of "global" variables, like thread locals.

ywangd · 2020-09-10T13:10:21Z

AuthorizationUtils#shouldReplaceUserWithSystem is the only place where we make use of this header (apart from the audit logs). The use case for that condition is when a context is marked as a system context. My assessment is that there would be no difference in behaviour, but it's hard to tell for sure.

I am puzzled by the comments in AuthorizationUtils#shouldReplaceUserWithSystem:

// we have a internal action being executed by a user other than the system user, lets verify that there is a
// originating action that is not a internal action. We verify that there must be a originating action as an
// internal action should never be called by user code from a client

My reading is that the code wants to differentiate the "current action" and "originating action". The logic (simplified) is: If "originating action" is not an internal action, execute the "current action" as system user. So I think the following is a valid scenario since "originating action" is the external_action of step (1):

User invokes external_action (1) -> internal_action (2) -> internal_action (3) executed as system user

With the changes in this PR, the above is no longer valid, since the "originating action" will be "internal_action" at step (2). It will not be a security issue, since the change is more restrictive. But I am not sure whether it could cause subtle failures somewhere. Is it for TransportClient? Or maybe the logic guarantees switching to the system user at step (2), which makes step (3) irrelevant.

Most of its code and comments are from 2016 and 2017, so I lack the context to fully understand them. I am OK with it if it looks fine to @jaymode

jaymode · 2020-09-10T17:12:04Z

I am puzzled by the comments in AuthorizationUtils#shouldReplaceUserWithSystem:

Sorry for the lack of clarity. This is an ugly piece of the code that I'd love for the team to revisit and see if we can improve. If my comments below make it clearer, maybe the comment should be updated in a separate PR. That said, the logic is intended to be:

User invokes external_action (1) -> external_action authorized for user to execute -> external_action execution triggers internal_action (2) -> execute internal_action as system

The following is what we do not want to happen:

User invokes internal_action (1) -> execute internal_action as system

Is it for TransportClient?

Yes a lot of this complexity existed because of the transport client.

I think this could be an issue for cases where we do need to switch based on the originating action. @albertzaharovits I suggest leaving the ORIGINATING_ACTION out of the headers that get removed.
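
The intended rule can be written down as a small predicate. This is a simplified sketch, not the body of `AuthorizationUtils#shouldReplaceUserWithSystem`: the real method has further conditions that are left out here, the `internal:` prefix check is an assumption about how internal actions are recognised, and the action names in `main` are made up for illustration.

```java
final class OriginatingActionSketch {
    // Assumption: internal actions are recognised by a name prefix.
    static boolean isInternalAction(String action) {
        return action.startsWith("internal:");
    }

    // Run as the system user only when an internal action was triggered by a non-internal one.
    static boolean shouldRunAsSystem(String action, String originatingAction) {
        if (!isInternalAction(action)) {
            return false;
        }
        return originatingAction != null && !isInternalAction(originatingAction);
    }

    public static void main(String[] args) {
        String external = "indices:data/read/search"; // hypothetical user-facing action
        String internal = "internal:example/step";    // hypothetical internal action

        // User invokes external_action, which triggers internal_action: switch to system.
        System.out.println(shouldRunAsSystem(internal, external));
        // User invokes internal_action directly: do not switch.
        System.out.println(shouldRunAsSystem(internal, internal));
    }
}
```

Under this reading, overwriting the originating action with the child's own name turns the first case into the second, which is why the header is better left out of the set that gets cleared.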

albertzaharovits · 2020-09-14T18:17:20Z

I think this could be an issue for cases where we do need to switch based on the originating action. @albertzaharovits I suggest leaving the ORIGINATING_ACTION out of the headers that get removed.

Thank you @ywangd and @jaymode ! I must admit that I've just now realised that the method shouldReplaceUserWithSystem has both action and originatingAction as two variables...
Until now, every time I looked at the method, I thought they were the same thing.

As much as it hurts me to revert to the behaviour of "leaking" of the ORIGINATING_ACTION header across action contexts, I believe this is the correct behaviour 😢

Thank you @ywangd for the vigilance 👍

AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]

Done

5e627f7

albertzaharovits added >bug v8.0.0 v7.10.0 labels Aug 27, 2020

albertzaharovits self-assigned this Aug 27, 2020

albertzaharovits added 3 commits August 27, 2020 12:36

Unused import

1939750

Remove indices permissions inn transport filter as well

ba47ac0

Tests fix

5236998

albertzaharovits requested a review from tvernum August 27, 2020 15:12

albertzaharovits added the :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC label Aug 27, 2020

elasticmachine added the Team:Security Meta label for security team label Aug 27, 2020

albertzaharovits requested a review from ywangd August 27, 2020 18:49

ywangd approved these changes Sep 1, 2020

albertzaharovits added 5 commits September 2, 2020 19:19

Merge branch 'master' into authz_action_overrides_privs

cdb7ced

Maybe

a92e349

Test midway

ddefebb

Tests done

6a1fdcb

Checkstyle

6eb08ee

albertzaharovits requested review from ywangd and jaymode September 2, 2020 22:34

albertzaharovits changed the title ~~Authz overrides index permission in thread context~~ Ensure authz operation overrides transient authz headers Sep 2, 2020

Checkstyle

3afb0ac

jaymode requested changes Sep 3, 2020

albertzaharovits added 2 commits September 4, 2020 10:10

Merge branch 'master' into authz_action_overrides_privs

0777da6

Jay's review but with cruft

64d0404

albertzaharovits requested a review from ywangd September 8, 2020 22:15

albertzaharovits added 2 commits September 9, 2020 15:36

Merge response headers

9dc43fd

Checkstyle

cdc45aa

jaymode approved these changes Sep 9, 2020

ywangd reviewed Sep 10, 2020

Merge branch 'master' into authz_action_overrides_privs

9d05999

Nit

3bb28ef

albertzaharovits added 3 commits September 14, 2020 18:15

Merge branch 'master' into authz_action_overrides_privs

be4c2f8

Do not override ORIGINATING_ACTION

06ba2e8

Merge branch 'master' into authz_action_overrides_privs

169f680

albertzaharovits merged commit 4b7160d into elastic:master Sep 15, 2020

albertzaharovits deleted the authz_action_overrides_privs branch September 15, 2020 10:55

albertzaharovits mentioned this pull request Sep 15, 2020

BACKPORT Ensure authz operation overrides transient authz headers #62371

Merged

albertzaharovits added the backport pending label Sep 15, 2020

albertzaharovits mentioned this pull request Sep 15, 2020

BACKPORT 7.9 Ensure authz operation overrides transient authz headers (#61621) #62384

Merged

albertzaharovits mentioned this pull request Sep 15, 2020

Ensure authz operation overrides transient authz headers (#61621) #62401

Merged

albertzaharovits removed the backport pending label Sep 15, 2020

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
