Skip to content

Security: improve exact index matching performance #36017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 29, 2018

Conversation

jaymode
Copy link
Member

@jaymode jaymode commented Nov 28, 2018

This commit improves the efficiency of exact index name matching by
separating exact matches from those that include wildcards or regular
expressions. Internally, exact matching is done using a HashSet instead
of adding the exact matches to the automata. For the wildcard and
regular expression matches, the underlying implementation has not
changed.

This commit improves the efficiency of exact index name matching by
separating exact matches from those that include wildcards or regular
expressions. Internally, exact matching is done using a HashSet instead
of adding the exact matches to the automata. For the wildcard and
regular expression matches, the underlying implementation has not
changed.
@jaymode jaymode added >enhancement v7.0.0 :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v6.6.0 labels Nov 28, 2018
@jaymode jaymode requested a review from tvernum November 28, 2018 19:33
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security

Copy link
Contributor

@tvernum tvernum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Set<String> exactMatch = new HashSet<>();
List<String> nonExactMatch = new ArrayList<>();
for (String indexPattern : indices) {
if (indexPattern.startsWith("/")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing Automatons code throws an exception if a pattern starts with / but doesn't end with /.
So we could simplify all this checking down to a single if/else and still be compatible with existing behaviour.

if (indexPattern.startsWith("/") || indexPattern.contains("*") || indexPattern.contains("?")) {
    nonExactMatch.add(indexPattern);
} else {
    exactMatch.add(indexPattern);
}

I don't care strongly, but shorter and simpler seems nicer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ thanks for noticing that. I misread the if statement in Automatons

@tvernum
Copy link
Contributor

tvernum commented Nov 29, 2018

I think as a follow up, when we have more time, we should:

  1. Centralise this behaviour, and use it whereever we do pattern matching. It might mean putting a new Patterns class in front of Automatons that handles this logic.
  2. Do some benchmarking about whether "*xyz*" is more efficient to run as str.contains("xyz") (and similarly startsWith and endsWith) and include that into the hypothetical Patterns utility class. I suspect, without testing that we pay a high price for using Automaton for trivial patterns.

@jaymode jaymode merged commit ba3ee98 into elastic:master Nov 29, 2018
@jaymode jaymode deleted the indices_exact_match_perf branch November 29, 2018 18:37
jaymode added a commit that referenced this pull request Nov 29, 2018
This commit improves the efficiency of exact index name matching by
separating exact matches from those that include wildcards or regular
expressions. Internally, exact matching is done using a HashSet instead
of adding the exact matches to the automata. For the wildcard and
regular expression matches, the underlying implementation has not
changed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants