Allow users to get status of own async search tasks #106638

Merged
merged 9 commits into from
Mar 28, 2024

Conversation

tvernum
Contributor

@tvernum tvernum commented Mar 22, 2024

This consists of 3 changes:

  1. Refactoring the code so that all the security logic in the async search code is moved to AsyncSearchSecurity
  2. Changing TransportGetAsyncStatusAction to check for ownership if the user does not have explicit access to the GetAsyncStatusAction (if they have such access it means that they can get the status of all async searches)
  3. In RBACEngine, if a user does not have permission to GetAsyncStatusAction but does have permission to submit async searches, then let them run the action, relying on point 2 above.
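The interaction between points 2 and 3 can be sketched as a two-stage check. This is a minimal, self-contained illustration, not the code in this PR: the class name, the privilege strings, and both method names are hypothetical, and only the order of the decisions follows the description above.

```java
import java.util.Set;

// Illustrative sketch only; none of these names exist in Elasticsearch.
final class AsyncStatusAccessSketch {
    enum Decision { ALLOW_ANY, ALLOW_OWNED_ONLY, DENY }

    // Hypothetical privilege labels standing in for the real action names.
    static final String GET_STATUS = "get_async_status";
    static final String SUBMIT = "submit_async_search";

    // Stage 1 (point 3): action-level decision made from the user's privileges.
    static Decision authorize(Set<String> privileges) {
        if (privileges.contains(GET_STATUS)) {
            return Decision.ALLOW_ANY;        // explicit access: status of all searches
        }
        if (privileges.contains(SUBMIT)) {
            return Decision.ALLOW_OWNED_ONLY; // defer to the ownership check
        }
        return Decision.DENY;
    }

    // Stage 2 (point 2): ownership check applied when access is not explicit.
    static boolean canSeeStatus(Decision decision, String requester, String owner) {
        switch (decision) {
            case ALLOW_ANY:
                return true;
            case ALLOW_OWNED_ONLY:
                return requester.equals(owner);
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        Decision d = authorize(Set.of(SUBMIT));
        System.out.println(d + " own=" + canSeeStatus(d, "alice", "alice")
            + " other=" + canSeeStatus(d, "alice", "bob"));
    }
}
```

Splitting the check this way keeps the RBAC layer unaware of who owns a given search; it only decides whether the ownership check is needed at all.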

This commit moves code from AsyncTaskIndexService and
DeleteAsyncResultsService into a new AsyncSearchSecurity so that all
security code is centralised.
@tvernum tvernum changed the title from "Allow users to get status of own asyc search tasks" to "Allow users to get status of own async search tasks" Mar 22, 2024
This changes the RBAC engine so users with access to submit async searches
also have permission to retrieve async search status. It relies on the
fact that async search will check that the user owns the async search
task.
@quux00
Contributor

quux00 commented Mar 22, 2024

Manual testing results, comparing code on main to this branch.

Summary of testing results

I found two changes in this branch compared to main:

  • First, the core goal of this ticket was achieved. A user with only index:read permissions (and not cluster:monitor) can now access the _async_search/status endpoint for their own searches.

  • Second, the error that a user with only index:read permissions (and not cluster:monitor) gets when trying to see the status of another user's search has changed. In main it is 403 Forbidden. In this branch it is 404 NotFound. Could this be considered a regression or breaking change?

Details

I created an async-search with the user mp_user, which has only read permissions and not cluster:monitor. Then I queried the _async_search/status endpoint with 3 different users:

|---------+-----------------------------|
| role    | privileges                  |
|---------+-----------------------------|
| mp_role | cluster:none, index:read    |
| yy_role | cluster:monitor, index:read |
|---------+-----------------------------|
|---------+---------|
| user    | role    |
|---------+---------|
| mp_user | mp_role |
| xx_user | mp_role |
|---------+---------|
| yy_user | yy_role |
|---------+---------|

The overall test was run twice on each branch to ensure I captured the data correctly. Test outcome:

|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| action                                    | search status | user    | behavior in main | Tim's PR     | comments      |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| create async-search                       | N/A           | mp_user | works            | works        |               |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | running       | mp_user | 200 OK           | 200 OK       |               |
| GET async-search/status/:id               | running       | mp_user | 403 Forbidden    | 200 OK       | Fixed!        |
| GET async-search/status/:id&keep_alive=1m | running       | mp_user | 403 Forbidden    | 200 OK       | Fixed!        |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | completed     | mp_user | 200 OK           | 200 OK       |               |
| GET async-search/status/:id               | completed     | mp_user | 403 Forbidden    | 200 OK       | Fixed!        |
| GET async-search/status/:id&keep_alive=1m | completed     | mp_user | 403 Forbidden    | 200 OK       | Fixed!        |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | running       | xx_user | 404 NotFound     | 404 NotFound |               |
| GET async-search/status/:id               | running       | xx_user | 403 Forbidden    | 404 NotFound | Regression?   |
| GET async-search/status/:id&keep_alive=1m | running       | xx_user | 403 Forbidden    | 404 NotFound | Regression?   |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | completed     | xx_user | 404 NotFound     | 404 NotFound |               |
| GET async-search/status/:id               | completed     | xx_user | 403 Forbidden    | 404 NotFound | Regression?   |
| GET async-search/status/:id&keep_alive=1m | completed     | xx_user | 403 Forbidden    | 404 NotFound | Regression?   |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | running       | yy_user | 404 NotFound     | 404 NotFound |               |
| GET async-search/status/:id               | running       | yy_user | 200 OK           | 200 OK       |               |
| GET async-search/status/:id&keep_alive=1m | running       | yy_user | 404 NotFound     | 404 NotFound | Expected?     |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|
| GET async-search/:id                      | completed     | yy_user | 404 NotFound     | 404 NotFound |               |
| GET async-search/status/:id               | completed     | yy_user | 200 OK           | 200 OK       |               |
| GET async-search/status/:id&keep_alive=1m | completed     | yy_user | 200 OK           | 200 OK       | Existing bug? |
|-------------------------------------------+---------------+---------+------------------+--------------+---------------|

@quux00 quux00 marked this pull request as ready for review March 26, 2024 12:28
@quux00 quux00 added :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team labels Mar 26, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-security (Team:Security)

@albertzaharovits
Contributor

I will look into reviewing this tomorrow morning.

@elasticsearchmachine
Collaborator

Hi @tvernum, I've created a changelog YAML for you.

@tvernum
Contributor Author

tvernum commented Mar 27, 2024

@elasticmachine update branch

Contributor

@albertzaharovits albertzaharovits left a comment


LGTM, very neat! Thank you @tvernum! 👍

Also thank you @quux00 for the testing from #106638 (comment) ! This was very useful to me!


Re the 403 vs 404 issue. Previously, the 403 was returned because /async-search/status was simply not allowed for users without the monitor cluster privilege, for any async search id.
But the premise of this PR is that we want that request to succeed (200) if the same user submitted the async search whose status is being queried. For the other, non-owned async searches, 404 is clearly preferable to 403. Some reasons are: consistency with the async-search (result) endpoint, and consistency with the /async-search/status endpoint when Security is disabled.
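As a rough model of the status codes discussed here, the sketch below encodes the rows of the test matrix for an existing async search queried without keep_alive. It is an illustration under assumptions: the class and method names are hypothetical, every user is assumed to be able to submit async searches, and the keep_alive rows are deliberately left out.

```java
// Illustrative model of the observed responses; not Elasticsearch code.
final class StatusCodeSketch {
    // Behaviour on main: only the monitor cluster privilege matters.
    static int statusOnMain(boolean hasMonitor) {
        return hasMonitor ? 200 : 403;
    }

    // Behaviour on this branch: owners succeed, non-owners see 404.
    static int statusOnBranch(boolean hasMonitor, boolean isOwner) {
        return (hasMonitor || isOwner) ? 200 : 404;
    }

    public static void main(String[] args) {
        // mp_user: owner without monitor
        System.out.println(statusOnMain(false) + " -> " + statusOnBranch(false, true));
        // xx_user: neither owner nor monitor
        System.out.println(statusOnMain(false) + " -> " + statusOnBranch(false, false));
        // yy_user: monitor, not owner
        System.out.println(statusOnMain(true) + " -> " + statusOnBranch(true, false));
    }
}
```

Returning 404 for a non-owned search means the response does not reveal whether an id belonging to another user exists.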

@quux00 Re the separate bug that you've identified, when the user only has the monitor cluster privilege and is issuing a /async-search/status request with the keep_alive parameter, I believe a quick fix would look like:

diff --git a/x-pack/plugin/async-search/src/main/java/org/elasticsearch/xpack/search/TransportGetAsyncStatusAction.java b/x-pack/plugin/async-search/src/main/java/org/elasticsearch/xpack/search/TransportGetAsyncStatusAction.java
index cc5cd797f3f..cc27e82a693 100644
--- a/x-pack/plugin/async-search/src/main/java/org/elasticsearch/xpack/search/TransportGetAsyncStatusAction.java
+++ b/x-pack/plugin/async-search/src/main/java/org/elasticsearch/xpack/search/TransportGetAsyncStatusAction.java
@@ -35,6 +35,7 @@ import java.util.Objects;

 import static org.elasticsearch.core.Strings.format;
 import static org.elasticsearch.xpack.core.ClientHelper.ASYNC_SEARCH_ORIGIN;
+import static org.elasticsearch.xpack.core.async.AsyncTaskIndexService.getTask;

 public class TransportGetAsyncStatusAction extends HandledTransportAction<GetAsyncStatusRequest, AsyncStatusResponse> {
     private final TransportService transportService;
@@ -76,7 +77,7 @@ public class TransportGetAsyncStatusAction extends HandledTransportAction<GetAsy
             if (request.getKeepAlive() != null && request.getKeepAlive().getMillis() > 0) {
                 long expirationTime = System.currentTimeMillis() + request.getKeepAlive().getMillis();
                 store.updateExpirationTime(searchId.getDocId(), expirationTime, ActionListener.wrap(p -> {
-                    AsyncSearchTask asyncSearchTask = store.getTaskAndCheckAuthentication(taskManager, searchId, AsyncSearchTask.class);
+                    AsyncSearchTask asyncSearchTask = getTask(taskManager, searchId, AsyncSearchTask.class);
                     if (asyncSearchTask != null) {
                         asyncSearchTask.setExpirationTime(expirationTime);
                     }
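
The effect of swapping the lookup can be illustrated with a small standalone model. Everything in it is hypothetical (the class, the in-memory map, the method names); it assumes that the ownership-checking lookup rejects a non-owner, which would be consistent with the 404 seen for yy_user on a running search in the test matrix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Standalone model of the keep_alive path; not the real task manager.
final class KeepAliveSketch {
    static final class Task {
        final String owner;
        long expirationTime;
        Task(String owner) { this.owner = owner; }
    }

    // Searches that are still running, keyed by a hypothetical search id.
    final Map<String, Task> running = new HashMap<>();

    // Lookup that enforces ownership (assumed to fail for non-owners).
    Task getTaskAndCheckOwner(String id, String requester) {
        Task task = running.get(id);
        if (task != null && !task.owner.equals(requester)) {
            throw new NoSuchElementException(id);
        }
        return task;
    }

    // Plain lookup, for callers that were already authorized upstream.
    Task getTask(String id) {
        return running.get(id);
    }

    // Extends the expiration of a running task; returns false if none is running.
    boolean extend(String id, String requester, long nowMillis, long keepAliveMillis, boolean checkOwner) {
        long expirationTime = nowMillis + keepAliveMillis;
        Task task = checkOwner ? getTaskAndCheckOwner(id, requester) : getTask(id);
        if (task != null) {
            task.expirationTime = expirationTime;
            return true;
        }
        return false;
    }
}
```

In this model a completed search is no longer in the map, so the ownership check is never reached, which may explain why the completed rows returned 200 while the running rows did not.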

Would you be available to handle this in a separate fix PR?

@quux00
Contributor

quux00 commented Mar 27, 2024

Thanks @albertzaharovits for the review and for the proposed fix to the existing bug. I will open a new PR with the patch you provided.

@quux00
Contributor

quux00 commented Mar 27, 2024

I'd like to merge this, but two of the Hdfs tests under fips conditions are failing. I'm able to reproduce one locally, but it also fails on main so seems unrelated to this PR.

./gradlew ':x-pack:plugin:searchable-snapshots:qa:hdfs:javaRestTest' --tests "org.elasticsearch.xpack.searchablesnapshots.hdfs.HdfsSearchableSnapshotsIT" -Dtests.seed=1CD8FD71D0A779E5 -Dtests.locale=en -Dtests.timezone=Etc/UTC -Druntime.java=22 -Dtests.fips.enabled=true

Stack trace of failure:

java.security.InvalidParameterException: Key size for HMAC must be at least 112 bits in approved mode: SHA-1/HMAC
	at __randomizedtesting.SeedInfo.seed([1CD8FD71D0A779E5]:0)
	at org.bouncycastle.jcajce.provider.BaseKeyGenerator.engineInit(Unknown Source)
	at java.base/javax.crypto.KeyGenerator.init(KeyGenerator.java:535)
	at java.base/javax.crypto.KeyGenerator.init(KeyGenerator.java:512)
	at fixture.hdfs2.org.apache.hadoop.security.token.SecretManager.<init>(SecretManager.java:144)
	at fixture.hdfs2.org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.<init>(AbstractDelegationTokenSecretManager.java:120)
	at fixture.hdfs2.org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.<init>(DelegationTokenSecretManager.java:95)
	at fixture.hdfs2.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.createDelegationTokenSecretManager(FSNamesystem.java:5868)
	at fixture.hdfs2.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:870)
	at fixture.hdfs2.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:724)
	at fixture.hdfs2.org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1103)
	at fixture.hdfs2.org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:376)
	at fixture.hdfs2.org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:233)
	at fixture.hdfs2.org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1027)
	at fixture.hdfs2.org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:830)
	at fixture.hdfs2.org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:485)
	at fixture.hdfs2.org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
	at org.elasticsearch.test.fixtures.hdfs.HdfsFixture.tryStartingHdfs(HdfsFixture.java:255)
	at org.elasticsearch.test.fixtures.hdfs.HdfsFixture.startMinHdfs(HdfsFixture.java:193)
	at org.elasticsearch.test.fixtures.hdfs.HdfsFixture.before(HdfsFixture.java:85)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:50)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
	at java.base/java.lang.Thread.run(Thread.java:1570)

Issue filed here: #106845
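
For context on the exception message, the sketch below reproduces the rule it states (HMAC keys of at least 112 bits in approved mode) on a stock JDK. It is only an illustration: the class, the `approvedMode` flag, and the explicit size check are assumptions that stand in for what the FIPS provider does internally, and the key sizes used are examples rather than values taken from the HDFS fixture.

```java
import java.security.InvalidParameterException;
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Illustrative only: mimics the approved-mode key size rule from the error message.
final class HmacKeySizeSketch {
    // Minimum size quoted in the exception message above.
    static final int MIN_APPROVED_BITS = 112;

    static SecretKey generate(int bits, boolean approvedMode) {
        if (approvedMode && bits < MIN_APPROVED_BITS) {
            // Hypothetical stand-in for the provider's own check.
            throw new InvalidParameterException(
                "Key size for HMAC must be at least " + MIN_APPROVED_BITS + " bits in approved mode");
        }
        try {
            KeyGenerator generator = KeyGenerator.getInstance("HmacSHA1");
            generator.init(bits);
            return generator.generateKey();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("HmacSHA1 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(generate(64, false).getEncoded().length + " bytes without the check");
        System.out.println(generate(128, true).getEncoded().length + " bytes with the check");
    }
}
```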

@breskeby
Contributor

I'd like to merge this, but two of the Hdfs tests under fips conditions are failing. I'm able to reproduce one locally, but it also fails on main so seems unrelated to this PR.

This should be fixed on main (we ignore FIPS for HDFS). Rebasing should fix the HDFS issue for you.

@tvernum
Contributor Author

tvernum commented Mar 27, 2024

@elasticmachine update branch

Labels
>enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team v8.14.0

6 participants