Top level domain extract ingest processor in the default and final pipeline #65722

mbudge · 2020-12-02T10:33:40Z

Hi,

Security use cases require operators to reliably find every matching log when searching for a domain during incident response. However, in some feeds, such as web proxy logs, the domain may be prefixed with "www.", whereas in others, such as DNS, it isn't. Because fields such as domain.name and dns.question.name are exact-match keyword fields, operators have to run multiple searches to be sure they find all the hits.

Packetbeat does top level domain (TLD) extraction to set the following ECS fields. This is harder for non-Beats feeds that are ingested via Logstash or a custom importer.

dns.question.registered_domain
dns.question.subdomain
dns.question.top_level_domain
dns.question.domain

There is a Logstash plugin, but it's a bit of a mess. As noted in this GitHub issue, the documentation looks like it's for a different plugin:
logstash-plugins/logstash-filter-tld#11
https://www.elastic.co/guide/en/logstash/current/plugins-filters-tld.html#plugins-filters-tld-periodic_flush

It would be good to have an ingest processor that could do TLD extraction in the default or final pipeline. This would make it much easier to normalise domains in non-Beats feeds, and would allow operators to find all hits through the *.registered_domain fields.

The processor would probably have to use the public suffix list to do the TLD extraction.

https://publicsuffix.org/

The rules for doing TLD extraction with the public suffix list can be found here:
https://publicsuffix.org/list/
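
To illustrate what I mean, here is a rough Python sketch of the matching rules as I understand them from the public suffix list page (longest matching rule wins, `*` is a wildcard label, `!` marks an exception rule). It is not how Packetbeat or any existing plugin does it. `SAMPLE_RULES` is a tiny hand-picked subset of rules and `split_domain` is a hypothetical helper name; a real processor would load the full list.

```python
from typing import Dict, List, Optional

# Assumption: a tiny hand-picked subset of rules, for illustration only.
# A real implementation would load the complete public suffix list.
SAMPLE_RULES = ["com", "uk", "co.uk", "*.ck", "!www.ck"]


def _matches(rule_labels: List[str], domain_labels: List[str]) -> bool:
    """Return True if the rule matches the rightmost labels of the domain."""
    if len(rule_labels) > len(domain_labels):
        return False
    tail = domain_labels[-len(rule_labels):]
    return all(r == "*" or r == d for r, d in zip(rule_labels, tail))


def split_domain(name: str, rules: List[str] = SAMPLE_RULES) -> Optional[Dict[str, Optional[str]]]:
    """Split a host name into registered domain, subdomain and public suffix.

    Hypothetical helper. Returns None when the name is itself a public
    suffix and therefore has no registrable domain.
    """
    labels = name.lower().rstrip(".").split(".")

    suffix_len = 1        # if no rule matches, the implicit rule "*" applies
    exception_len = None  # set when an exception ("!") rule matches

    for rule in rules:
        is_exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        if not _matches(rule_labels, labels):
            continue
        if is_exception:
            # An exception rule wins; its suffix is the rule minus its leftmost label.
            exception_len = len(rule_labels) - 1
        else:
            # Otherwise the rule with the most labels wins.
            suffix_len = max(suffix_len, len(rule_labels))

    if exception_len is not None:
        suffix_len = exception_len

    if len(labels) <= suffix_len:
        return None

    # The registered domain is the public suffix plus one more label.
    return {
        "registered_domain": ".".join(labels[-(suffix_len + 1):]),
        "subdomain": ".".join(labels[:-(suffix_len + 1)]) or None,
        "top_level_domain": ".".join(labels[-suffix_len:]),
    }
```

With these sample rules, `split_domain("www.example.co.uk")` and `split_domain("example.co.uk")` both give a registered domain of `example.co.uk`, which is what would let a single search on *.registered_domain find hits from both the web proxy and DNS feeds.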

elasticmachine · 2020-12-04T12:17:27Z

Pinging @elastic/es-core-features (Team:Core/Features)

danhermann · 2020-12-04T14:02:31Z

@mbudge, I'm closing this as a duplicate of #57476. Please comment if that one does not cover your use case.

mbudge added the >enhancement and needs:triage labels Dec 2, 2020

pgomulka added the :Data Management/Ingest Node label Dec 4, 2020

elasticmachine added the Team:Data Management label Dec 4, 2020

pgomulka removed the Team:Data Management and needs:triage labels Dec 4, 2020

danhermann closed this as completed Dec 4, 2020
