
[FR] Better way to work with non-ecs fields #3262


Closed
stevengoossensB opened this issue Nov 8, 2023 · 10 comments

@stevengoossensB

I'm always frustrated when we use a non-ecs field in our custom rules and I need to manually update the non-ecs-schema.json to include that field.

I would like to be able to add a field with all of its subfields, or even a full index, to the schema so that those fields are not checked for ECS compliance. That way we could exclude certain indices from ECS validation, or put all non-ECS fields as subfields under one field.

@stevengoossensB stevengoossensB added the enhancement New feature or request label Nov 8, 2023
@Mikaayenson
Contributor

👋 Hey @stevengoossensB, we hear you! We started reviewing this file a while back (see #1776 for context). TL;DR: the main issue at the time was that integration fields sometimes changed between stack versions, and since we backport, we were forced to use this file in some cases.

We also explored migrating these fields to the integrations themselves (by opening PRs, e.g. elastic/integrations#5115), which is still a good idea long term (provided the fields match the older stacks that we support).

Can you provide an example of a field you're missing and of the full index you'd like added to the schema? The other options you mentioned may be viable ways to solve your issue with subfields.

@stevengoossensB
Author

stevengoossensB commented Nov 8, 2023

My specific example required manually adding the field "panw.panos.action" to the log stream "logs-panw.panos-*". However, I anticipate this issue arising in other scenarios as well.

To accommodate the diverse data sources and compliance levels encountered in customer environments, perhaps the tool could introduce an option that loosens ECS compliance requirements. This would enable the tool's validation and deployment capabilities to be utilized more effectively.

@Mikaayenson
Contributor

We're planning some future work that takes these types of things into account. In short, we plan to make it easy to bypass some of these checks (e.g. potentially via environment variables). As we get more issues similar to this one, we bump the priority, so thank you for raising the issue.

Would something like an environment variable bypass help in your situation?

@stevengoossensB
Author

A bypass that skips that validation would work, as would returning warnings rather than errors (it is still good to know where we are not ECS compliant and where we could improve in the future).
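For illustration, the "warnings rather than errors" behaviour could look roughly like the sketch below. This is not how detection-rules implements anything: the environment variable name `DR_SOFT_ECS_VALIDATION` and the function `report_unknown_fields` are hypothetical, made up for this example.

```python
import os
import warnings

# Hypothetical variable name, chosen for illustration only.
BYPASS_ENV_VAR = "DR_SOFT_ECS_VALIDATION"


def report_unknown_fields(unknown_fields, environ=None):
    """Raise on unknown fields, or only warn when the bypass variable is set.

    Returns the sorted list of unknown fields so callers can still log them.
    """
    environ = os.environ if environ is None else environ
    fields = sorted(unknown_fields)
    if not fields:
        return fields
    message = "Fields not found in ECS or the non-ECS schema: " + ", ".join(fields)
    if environ.get(BYPASS_ENV_VAR, "").lower() in ("1", "true", "yes"):
        # Soft mode: keep the information visible without failing the run.
        warnings.warn(message, UserWarning)
        return fields
    raise ValueError(message)
```

With the variable unset, a call such as `report_unknown_fields({"panw.panos.action"})` raises; with it set to `1`, the same call emits a warning and returns the field list.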

@slawomirbabicz

What I am doing personally as a hack is:

1). Created file: non-ecs-schema_extension.json

{
  "logs*": {
    "github.repo": "keyword",
    "github.org": "keyword",
    "github.repository_public": "boolean",
    "github.category": "keyword",
    "github.permission": "keyword",
    "json.action.type": "keyword"
  }
}

2). I am appending that file to: /detection-rules/detection_rules/etc/non-ecs-schema.json
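Since concatenating two JSON documents does not yield valid JSON, the "append" step presumably means merging the extension's keys into the existing file. A minimal sketch of such a merge follows, assuming both files are JSON objects keyed by index pattern; `deep_merge` and `merge_schema_files` are hypothetical helper names, not part of detection-rules.

```python
import json
from pathlib import Path


def deep_merge(base, extension):
    """Return a new dict with `extension` layered over `base`.

    Nested dicts are merged key by key; any other value in `extension`
    replaces the value in `base`. Neither input is modified.
    """
    merged = dict(base)
    for key, value in extension.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_schema_files(base_path, extension_path):
    """Merge the extension file into the base file, rewriting the base file."""
    base_path = Path(base_path)
    base = json.loads(base_path.read_text(encoding="utf-8"))
    extension = json.loads(Path(extension_path).read_text(encoding="utf-8"))
    merged = deep_merge(base, extension)
    base_path.write_text(json.dumps(merged, indent=2, sort_keys=True) + "\n", encoding="utf-8")
```

Usage would be along the lines of `merge_schema_files("detection_rules/etc/non-ecs-schema.json", "non-ecs-schema_extension.json")`, run after each upstream pull so the local additions are reapplied.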

@stevengoossensB
Author

I'm doing something similar, by overwriting the non-ecs-schema.json file with my own version. However, in an environment with plenty of index patterns, different custom data sources, and custom fields that are used in multiple index patterns, it becomes a bit of a mess.

Maybe there could be a way to just add the exception field names, regardless of the index, to a list (by specifying them in the * index?).
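As a rough sketch of that idea: a lookup could collect fields from every schema entry whose pattern matches the rule's index, so a catch-all `*` entry acts as the index-independent list. This assumes a flat schema of dotted field names per index pattern (as in the extension file shown earlier in this thread); `fields_for_index` is a hypothetical name, and this is not how detection-rules resolves fields today.

```python
from fnmatch import fnmatchcase


def fields_for_index(schema, index):
    """Collect field types from every schema entry whose pattern matches `index`.

    Entries are applied from the shortest pattern to the longest, so a
    catch-all "*" entry supplies defaults and longer patterns override it.
    """
    fields = {}
    for pattern in sorted(schema, key=len):
        if fnmatchcase(index, pattern):
            fields.update(schema[pattern])
    return fields
```

With `{"*": {"my.custom.field": "keyword"}, "logs-panw.panos-*": {"panw.panos.action": "keyword"}}`, a rule on `logs-panw.panos-*` would see both fields, while a rule on any other index would see only `my.custom.field`.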

@SHolzhauer
Contributor

Hi,

We have hundreds of custom fields, which we manually add to the non-ecs-schema file whenever we use them in detection rules.
This is not the only "issue" with the current approach though.
If we use sub datastreams, for example logs-aws*, as a means of reducing load on our cluster, we also have to specify the ECS fields.
We don't want to bypass the test though; validating that a query works against the expected field types is definitely a good practice.

There are a couple of possible improvements, I think:

  1. Enable sub indices to work automagically (e.g. logs-aws* should behave the same as logs-* from an ECS perspective)
  2. Read in an additional [email protected] file which users can create themselves (to prevent git version conflicts etc).
  3. Enable "flattened" mappings, e.g. my.custom.*: "keyword" should result in all matching fields being treated as a keyword.

IMHO this will improve the experience, introduce a bit of flexibility, and still support the usage of the test case.
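To make the "flattened" mapping suggestion concrete, here is a minimal sketch of how a wildcard entry could resolve to a type while exact entries still take precedence. It assumes a flat mapping of dotted field names (or patterns) to types; `resolve_field_type` is a hypothetical helper, not an existing detection-rules function.

```python
from fnmatch import fnmatchcase


def resolve_field_type(field, mappings):
    """Return the type for `field`, preferring an exact entry over wildcards.

    Among wildcard entries, the longest (most specific) matching pattern wins.
    Returns None when nothing matches.
    """
    if field in mappings:
        return mappings[field]
    matches = [p for p in mappings if "*" in p and fnmatchcase(field, p)]
    if not matches:
        return None
    return mappings[max(matches, key=len)]
```

Given `{"my.custom.*": "keyword", "my.custom.count": "long"}`, the field `my.custom.name` would resolve to `keyword` and `my.custom.count` to `long`, so type checking of queries keeps working without listing every subfield.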

@botelastic

botelastic bot commented Jan 26, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@eric-forte-elastic
Contributor

eric-forte-elastic commented May 2, 2024

This will be addressed through the DAC feature branch; see "Bring Your Own Schema support" (#3618).

@eric-forte-elastic
Contributor

With the new feature from #3889, we now support custom schemas and the auto generation of custom schemas (see A11 from our FAQ for an example). If you run into any issues with custom schemas or auto generation, please feel free to re-open this issue or create a new one.
