Skip to content

Expose API key metadata for ingest processors #71024

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ywangd opened this issue Mar 30, 2021 · 8 comments · Fixed by #72137
Closed

Expose API key metadata for ingest processors #71024

ywangd opened this issue Mar 30, 2021 · 8 comments · Fixed by #72137
Assignees
Labels
>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team

Comments

@ywangd
Copy link
Member

ywangd commented Mar 30, 2021

API keys can now be created with metadata since #70292. This information can be retrieved with the GetApiKey request, but otherwise is not available anywhere else. It is desirable that the metadata is exposed in the ingest processors, specifically the SetSecurityUser processor. The processor already has a api_key property that exposes the API key name and ID. One option is to expand the existing property and add the metadata field.

@ywangd ywangd added >enhancement :Security/Security Security issues without another label labels Mar 30, 2021
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Mar 30, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@ywangd ywangd self-assigned this Apr 19, 2021
@ywangd
Copy link
Member Author

ywangd commented Apr 19, 2021

EDIT: updated to incorporate Tim's comment

@elastic/es-security As discussed in team meeting, here is my proposal on how to expose the API key metadata information in ingest processor.

Current State

The set security user processor is where authentication information is attached to the incoming document. It is therefore the right choice for exposing the API key metadata information. Today, it already has a api_key property to add the API key name and id when the authentication is an API key. An enriched document would have the following structure (assuming API key authentication and set_security_user property is user):

{
  ...,
  "_source": {
    ...,
    "user": {
      "api_key": {
        "name": "k1",
        "id": "0UB86HgB96240ypp1EqC"
      },
      "realm": {
        "name": "file1",
        "type": "file"
      },
      "authentication_type": "API_KEY",
      "username": "elastic-admin"
    }
  }
}

Recommendations

Three options are considered for adding the extra metadata information:

  1. Add metadata as part of the existing api_key property and whether it is shown is also controlled by the api_key property. So the input is:
"set_security_user": {
  "field": "user",
  "properties": ["api_key"]
}

And the output is as the follows:

      "api_key": {
        "name": "k1",
        "id": "0UB86HgB96240ypp1EqC",
        "metadata": {
          "foo": "bar",
          "value": "42"
        }
      }

For minimal user visiable impact, the metadata field will not show if the API key has no metadata (either null or an empty map).

  1. Add metadata as a new top level field api_key_metadata, i.e. at the same level of the existing api_key field. Whether it is shown is controlled by a new property with the same name of api_key_metadata. So the input is:
"set_security_user": {
  "field": "user",
  "properties": ["api_key", "api_key_metadata"]
}

And the output is:

      "api_key": {
        "name": "k1",
        "id": "0UB86HgB96240ypp1EqC"
      },
      "api_key_metadata": {
        "foo": "bar",
        "value": "42"
      }

For minimal user visiable impact, the api_key_metadata field will not show if the API key has no metadata (either null or an empty map).

  1. This one is kinda of mix of Options 1 and 2: Add the metadata field under existing api_key, but control whether it is shown with a new property api_key_metadata. So the output is identical to Option 1, but the input is the same as Option 2.

(We could think of more options, but I feel they would be too complex to be worthwhile.)

I propose we go with Option 1.

Analysis

There are pros and cons for all three approaches, but overall Option 1 is simpler and more effective.

Option 1

Pros

  • Minimal code level change
  • Simpler configuration on user side, i.e. no new property configuration is needed, and easier to document
  • Better clarity. For new users, it is likely counter-intuitive to have a field named api_key which does not include metadata.
  • The structure (id, name and metadata at the same level) is consistent with how these information is specified and stored for API keys.
  • The existing api_key name is also metadata. So it makes sense that the new metadata is added alongside it.

Cons

  • Coarser grained, the existing api_key fields and the metadata will always show or hide together. It could be quite verbose if the metadata object is large.

Option 2

The Pros and Cons of Option 2 is basically the opposite to those of Option 1 plus:
Cons

  • If we add API key metadata without the existing id field, there is a chance that this information could be mis-used because API key metadata is not unique or reserved. By itself, it cannot be used to identify an API key. Hence showing only metadata without id is problematic and could be misleading. This means we have to enforce api_key if api_key_metadata is specified, which introduces extra complexity and leniency.

Option 3

Pros

  • Same as Option 1: Consistent structure to how information is specified and stored for API keys
  • Same as Option 1: The existing api_key name is also metadata. So it makes sense that the new metadata is added alongside it.
  • Same as Option 2: Finer grained

Cons

  • Same as Option 2: We likely need to automatically add api_key if api_key_metadata if specified (implies complexity and leniency)
  • It is somewhat a deviation from existing properties whose fields are at top level of the SetSecurityUser field.

Overall, the cons of Option 2 or 3 weigh more than those of Option 1. Also some more reasonings are:

  • +1 for Option 1 and 3: We may want to show the API key metadata in authenticate response as well. Based on the discussion about recent work of service account, we could see having a new top level api_key field in the authenticate response. It is likely that API key metadata will be nested under it. So Option 1 and 3 would have better consistency.
  • -1 for Option 2 and 3: If the metadata is indeed large, it is likely that a top level switch for the entire object is not granular enough. This makes Option 2 or 3 less attractive. In addition, we have remove processor which is probably more flexible for metadata filtering (thanks to @colings86).
  • -1 for Option 2 and 3: The minimal configuration of SetSecurityUser processor is to not explicitly specify any properties. This defaults to show all properties. I suspect this is a common case. If so, having a new api_key_metadata property does not really help with verbosity since by default users do not specify any properties.

@bytebilly
Copy link
Contributor

Thanks @ywangd, I'd prefer option 1 that looks clearer and simpler from a user perspective.

@tvernum
Copy link
Contributor

tvernum commented Apr 19, 2021

Your proposal seems to conflate 2 different design questions:

  1. What determines whether to include API key metadata in the set of fields that the processor adds
  2. What is the JSON structure when the metadata is shown.

Your examples describe question (2), but some of your pro/cons are about question (1).
I think that means there are actually 4 options (though some are not very sensible):

a. Use the existing API_KEY Property to control this, and add a metadata field into the api_key object of the user object. (Your option 1)
b. Use the existing API_KEY Property to control this, and add a new api_key_metadata field to the user object. (Not a sensible option)
c. Add new API_KEY_METADATA Property to control this, and if set, add a metadata field into the api_key object of the user object. (A new, possible option)
d. Add new API_KEY_METADATA Property to control this, and if set, add a new api_key_metadata field to the user object. (Your option 2)

I think option (b) has no redeeming qualities, but I do think you should consider (c) as a potential option.

I think I prefer (a), over (c) or (d). However, I don't think (d) is naturally better than (c), so I think it's worth being clear about which of the pros/cons of Option 2 are actually about (c) or (d).

TL;DR: I like your option 1, but I think there's 2 different ways you could do option 2, and that might change your (or other people's) analysis/recommendation.

@colings86
Copy link
Contributor

+1 for option 1 (or option a 😄 ). I can see a case for Tim's option c but I think we need to see how metadata is used, if metadata becomes large it might be an option to go with but I could equally see users wanting to filter the metadata and we might need to think about whether thats best done in this processor or in a subsequent filtering processor (I'm currently leaning to the latter)

@ywangd
Copy link
Member Author

ywangd commented Apr 19, 2021

Your proposal seems to conflate 2 different design questions

@tvernum You are right that a viable option is missed out. The existing fields got me to assume that every property should correspond to a top level field. Apparently, this does not have to be the case. Though it does mean deviation from existing pattern, it is at least worth for discussion.

I updated my original post to reflect your comment (a new Option 3 is added, i.e. your option (c)). I hope it is easier for people to consume when everything is in a single post. Please note that my recommendation remains the same. Thanks!

@jkakavas
Copy link
Member

Thanks for the thorough analysis Yang and for laying this out nicely for consideration. I'll add my +1 to 1/a as I dont like 2(because I think it makes sense for metadata to be a property of the api key and not a high level one ) and I dont see the need for the granularity c offers ,especially in the presence of other filtering solutions that are available

@ywangd
Copy link
Member Author

ywangd commented Apr 23, 2021

Thanks all for the feedback. The consensus is that overall we prefer Option 1, the simplest approach. I will raise a PR for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants