Top level domain extract ingest processor in the default and final pipeline #65722

Closed
mbudge opened this issue Dec 2, 2020 · 2 comments
Labels
:Data Management/Ingest Node, >enhancement

mbudge commented Dec 2, 2020

Hi,

Security use cases require operators to reliably find every relevant log when searching for a domain during incident response. However, in some feeds, such as web proxy logs, the domain may be prefixed with "www.", whereas in others, such as DNS logs, it is not. Because domain.name and dns.question.name are exact-match keyword fields, operators need to run multiple searches to make sure they find all the hits.

Packetbeat performs top-level domain (TLD) extraction to set the following ECS fields. This is more difficult for non-Beats feeds that are ingested via Logstash or a custom importer.

  • dns.question.registered_domain
  • dns.question.subdomain
  • dns.question.top_level_domain
  • dns.question.domain

There is a Logstash plugin, but it's a bit of a mess. As noted in this GitHub issue, the documentation looks like it's for a different plugin.
logstash-plugins/logstash-filter-tld#11
https://www.elastic.co/guide/en/logstash/current/plugins-filters-tld.html#plugins-filters-tld-periodic_flush

It would be good to have an ingest processor that could do TLD extraction in the default or final pipeline. This would make it much easier to normalise domains in non-Beats feeds, and would allow operators to find all hits through the *.registered_domain fields.

The processor would probably have to use the Public Suffix List to do the TLD extraction.

https://publicsuffix.org/

The rules for doing TLD extraction with the Public Suffix List can be found here:
https://publicsuffix.org/list/
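
To illustrate the kind of extraction being requested, here is a minimal Python sketch of the Public Suffix List matching rules (exception rules first, then the longest normal or wildcard rule, then the default of one label). It is not how Packetbeat or any Elasticsearch processor is implemented. `SAMPLE_RULES` is a tiny hand-picked subset of rules chosen for the example, and `public_suffix_length` and `extract` are hypothetical helper names; a real implementation would load the full list from publicsuffix.org and handle internationalised names.

```python
# Assumption: a tiny hand-picked subset of Public Suffix List rules.
# A real implementation would load the complete list from publicsuffix.org.
SAMPLE_RULES = {"com", "org", "uk", "co.uk", "*.ck", "!www.ck"}


def public_suffix_length(labels, rules):
    """Return how many trailing labels form the public suffix."""
    # Candidate suffixes, longest first: a.b.c -> [a,b,c], [b,c], [c]
    candidates = [labels[i:] for i in range(len(labels))]

    # Exception rules ("!rule") take priority over every other rule.
    # The public suffix is the exception rule minus its leftmost label.
    for cand in candidates:
        if "!" + ".".join(cand) in rules:
            return len(cand) - 1

    # Otherwise the longest matching normal or wildcard rule wins.
    for cand in candidates:
        exact = ".".join(cand)
        wildcard = ".".join(["*"] + cand[1:])
        if exact in rules or wildcard in rules:
            return len(cand)

    # No rule matched: the default rule "*" applies (one label).
    return 1


def extract(hostname, rules=SAMPLE_RULES):
    """Split a hostname into ECS-style domain parts (hypothetical helper)."""
    labels = hostname.lower().rstrip(".").split(".")
    n = public_suffix_length(labels, rules)
    top_level_domain = ".".join(labels[-n:])

    # The hostname is itself a public suffix: nothing is registrable.
    if len(labels) <= n:
        return {
            "top_level_domain": top_level_domain,
            "registered_domain": None,
            "subdomain": None,
        }

    # Registered domain = public suffix plus one more label to the left.
    registered_domain = ".".join(labels[-(n + 1):])
    subdomain = ".".join(labels[:-(n + 1)]) or None
    return {
        "top_level_domain": top_level_domain,
        "registered_domain": registered_domain,
        "subdomain": subdomain,
    }


if __name__ == "__main__":
    # Both spellings normalise to the same registered_domain.
    print(extract("www.example.co.uk"))
    print(extract("example.co.uk"))
```

With this, a web proxy entry for "www.example.co.uk" and a DNS entry for "example.co.uk" both yield the registered domain "example.co.uk", so a single search on a *.registered_domain field would find both.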

@mbudge added the >enhancement and needs:triage labels Dec 2, 2020
@pgomulka added the :Data Management/Ingest Node label Dec 4, 2020
@elasticmachine added the Team:Data Management label Dec 4, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@pgomulka removed the Team:Data Management and needs:triage labels Dec 4, 2020
@danhermann
Contributor

@mbudge, I'm closing this as a duplicate of #57476. Please comment if that one does not cover your use case.
