-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Ensure authz operation overrides transient authz headers #61621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure authz operation overrides transient authz headers #61621
Conversation
Pinging @elastic/es-security (:Security/Authorization) |
I still need look closer at the tests. But I think the changes overall look good to me. I had a slight concern on performance since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left some suggestions. None of them is critical enough to prevent approval.
A bit more discussion that is not directly related to this PR: I am now a bit skeptical about other usages of putTransientIfNonExisting
. I cannot tell any problems. But the fact we discard computed result silently is a bit worring. Maybe we could use a debug logging for when it happens.
.../src/test/java/org/elasticsearch/xpack/security/action/filter/SecurityActionFilterTests.java
Show resolved
Hide resolved
...security/src/main/java/org/elasticsearch/xpack/security/transport/ServerTransportFilter.java
Outdated
Show resolved
Hide resolved
...security/src/main/java/org/elasticsearch/xpack/security/transport/ServerTransportFilter.java
Outdated
Show resolved
Hide resolved
I agree, this is not strictly correct. The approach in this PR is not something that I wholeheartedly endorse. I think it's subtly wrong that only a single authz header takes the value from the latest authz operation, while the other authz headers do not; taken together as a whole, the authz headers do not correctly describe the authorisation outcome. My personal preference would be to stash all the authorisation headers (i.e. including But I'm waiting to see what @jaymode thinks is the best option, as it's now gotten into more of a thread context issue (i.e. the lack of a suitable API) than a security issue. |
Yes. The simple remove method puts responsibility to the caller to backup the original value. Since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can add a method to ThreadContext
that also accepts a list of transient names to filter? So something like newStoredContext(boolean, List<String>)
so that we can drop the public removeTransient
method?
As discussed with @jaymode , I've redone it such that, upon restore, ALL the headers are reverted (with the possible exception of response headers). The new method is a The previous implementation, with the partial stash, was confusing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some theoretical concerns. It is possible that they are completely nonsense. But given how important ThreadContext
is, I'll just let them out :)
It's mainly about the two additional authorization headers, ORIGINATING_ACTION
and AUTHORIZATION_INFO
. Compare to INDICES_PERMISSIONS
, they are less specific to the particular request that is being executed.
INDICES_PERMISSIONS
is tightly coupled to the current request. In fact, it is erroneous to reuse it cross requests. This was the problem and we are trying to fix it. But I am not sure the same thing can be said for ORIGINATING_ACTION
and AUTHORIZATION_INFO
.
For ORIGINATING_ACTION
, it seems to make sense that the same value would apply to multiple subsequent requests, i.e. one request causes another. Its value is also checked in AuthorizationUtils#shouldReplaceUserWithSystem
, which I am not sure whether the change would lead to any subtle but important differences.
For AUTHORIZATION_INFO
, the change here makes it mutable across requests. Currently, when the parent action is authorized, the same Role
object is applied to any child actions. This is true even when relevant roles are modified in the middle of the requests. With this PR, it is possible that the Role
object will be different for child actions. If it is then gets denied, not sure this would lead to any inconsistency. With that being said, since tranisents are cleared across nodes, it may not be an issue at all.
Another theoretical thing is about multiple TransportInterceptor
or ActionFilter
. Currently, any extra interceptors/filters run with the parent transients. With this PR, they will run with child transients. If these extra interceptors/filters somehow rely on parent transients to work, it will have issues. Again, this is purely theoretical and is not something to be worried today since no other interceptors/filters check security infos.
server/src/test/java/org/elasticsearch/common/util/concurrent/ThreadContextTests.java
Outdated
Show resolved
Hide resolved
*/ | ||
try (ThreadContext.StoredContext ignore = threadContext.newStoredContext(false, AuthorizationServiceField.ALL_AUTHORIZATION_KEYS)) { | ||
// prior to doing any authorization lets set the originating action in the context only | ||
threadContext.putTransient(AuthorizationServiceField.ORIGINATING_ACTION_KEY, action); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the right semantic? I understand this is the reason why RestSqlSecurityIT
needs to be updated. Technical details aside, if a parent action invokes a child action, should the "originating action" still be the parent action? The change here makes it to be the child action. If it's always the child action, why does it need to be called "originating" action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a naming thing. On the obverse, wouldn't it be odd to only just store the parent action name, no matter the child nesting level, but only if the request doesn't cross thee node boundary?
Thanks for reviewing @ywangd .
Yes this is not an issue, IMO. |
This is only theoretical, agreeed, whenever we make use of these transient headers in another place, we unfortunately need to ensure that the value is set correctly; this is the drawback of "global" variables, like thread locals. |
I am puzzled by the comments in
My reading is that the code wants to differentiate the "current action" and "originating action". The logic (simplified) is: If "originating action" is not an internal action, execute the "current action" as system user. So I think the following is a valid scenario since "originating action" is the external_action of step (1):
With changes of this PR, above is no longer valid, since the "originating action" will be "internal_action" at step (2). It will not be a security issue, since the change is more restrictive. But I am not sure whether it could cause subtle failures somewhere. Is it for TransportClient? Or maybe the logic guarantees switching to system user at step (2), which makes step (3) irrelevant. Most of its code and comments are from 2016 and 2017. So I lack the context to fully understand them. I am OK with it if it looks fine with @jaymode |
Sorry for the lack of clarity. This is a ugly piece of the code that I'd love for the team to revisit and see if we can improve it. If my comments below make it clearer maybe the comment should be updated in a separate PR. That said the logic is intended to be:
The following is what we do not want to happen:
Yes a lot of this complexity existed because of the transport client. I think this could be an issue for cases where we do need to switch based on the originating action. @albertzaharovits I suggest leaving the |
Thank you @ywangd and @jaymode ! I must admit that I've just now realised that the method As much as it hurts me to revert to the behaviour of "leaking" of the ORIGINATING_ACTION header across action contexts, I believe this is the correct behaviour 😢 Thank you @ywangd for the vigilance 👍 |
AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]
AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]
AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]
AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]
AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum [email protected]
AuthorizationService#authorize
uses the thread context to carry the result of the authorization as transient headers.The listener argument to the authorize method must necessarily observe the header values.
But we've learned that the
TransportService
carries over the transient headers from the thread context to the locally executed action handlers (unlike the remotely executed action handlers). This becomes problematic when theauthorization
is invoked multiple times, eg. becauseSecurityActionFilter#apply
is effectively invoked in the same thread context when a parent action uses theTransportService
to execute the child action locally.The desired outcome is that the authorization transient headers of the child action supersede the ones of the parent action.
This PR is the first step in this direction; it removes a specific transient header (
AuthorizationServiceField#INDICES_PERMISSIONS_KEY
) before callingAuthorizationService#authorize
which would fill in the header with the new value.Co-authored-by: Tim Vernum [email protected]