Copy normalisers for keyword fields to rollup indexes #30996

Closed
colings86 opened this issue May 31, 2018 · 3 comments
Labels
>enhancement :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@colings86
Contributor

At the moment, rollups do not copy normalisers for keyword fields over to the rollup index. This means that with the introduction of #30973, which enables the match query to be used against rollup indexes, the match query will behave differently on live indexes than on rollup indexes. A match query against a keyword field in a live index has the configured normaliser applied to the query text and will match documents, but the same query against a rollup index may not match documents because the normaliser is not applied to the query text.

For example:
If I have a field that has a lowercasing normaliser on the live index and do a rollup search across live and rollup indexes for mySQL, it will find documents in the live index and return them but it will look like there are no matching documents in the rollup index because the same normalisation was not applied on the rollup index.
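To make the mismatch concrete, here is a minimal toy model in Python. It is not Elasticsearch code: `KeywordField` and `lowercase_normaliser` are hypothetical names, and it assumes the rollup index ends up holding the already-normalised terms produced by the live index while its own keyword field has no normaliser configured.

```python
def lowercase_normaliser(value: str) -> str:
    """Stand-in for a lowercasing normaliser on a keyword field."""
    return value.lower()


class KeywordField:
    """Toy keyword field: one exact term per value, optionally normalised."""

    def __init__(self, normaliser=None):
        self.normaliser = normaliser
        self.terms = set()

    def _normalise(self, text: str) -> str:
        # The same normaliser is applied at index time and at query time.
        return self.normaliser(text) if self.normaliser else text

    def index(self, value: str) -> None:
        self.terms.add(self._normalise(value))

    def matches(self, query_text: str) -> bool:
        return self._normalise(query_text) in self.terms


# Live index: the keyword field has a lowercasing normaliser.
live = KeywordField(normaliser=lowercase_normaliser)
live.index("MySQL")

# Rollup index (assumption): receives the normalised terms from the live
# index, but the normaliser itself was not copied over.
rollup = KeywordField(normaliser=None)
for term in live.terms:
    rollup.index(term)

# Rollup index with the normaliser copied over, as this issue proposes.
rollup_fixed = KeywordField(normaliser=lowercase_normaliser)
for term in live.terms:
    rollup_fixed.index(term)

live_hit = live.matches("mySQL")
rollup_hit = rollup.matches("mySQL")
rollup_fixed_hit = rollup_fixed.matches("mySQL")
```

In this sketch the live field matches `mySQL`, the rollup field without a normaliser does not, and the rollup field with the normaliser copied over matches again.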

We should change the index side of rollups to copy over any normalisers that are configured on the keyword fields we are using in the rollups.

@colings86 colings86 added >enhancement blocker v7.0.0 :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data v6.4.0 labels May 31, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@jimczi
Contributor

jimczi commented Jul 31, 2018

We've decided not to allow match queries in rollups at the moment. Since text fields are not handled in rollups, the need for this type of query is still unclear. We still need to handle normalizers in rollups for keyword fields, since the term query uses them. I removed the blocker label, but we should at least document the limitation.

@lcawl lcawl added v6.4.1 and removed v6.4.0 labels Aug 23, 2018
@jasontedor jasontedor added v8.0.0 and removed v7.0.0 labels Feb 6, 2019
@rjernst rjernst added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 4, 2020
@jimczi jimczi removed their assignment Dec 16, 2020
@polyfractal polyfractal removed their assignment Mar 18, 2021
@wchaparro
Member

With the 8.7 release of Elasticsearch, we have made a new downsampling capability, associated with the new time series data streams functionality, generally available (GA). This capability had been in tech preview in ILM since 8.5. Downsampling provides a method to reduce the footprint of your time series data by storing it at reduced granularity. The downsampling process rolls up documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the min, max, sum, value_count, and average for each metric. Data stream time series dimensions are stored unchanged.
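The fixed-interval summarisation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch implements downsampling: the `downsample` function, the `(timestamp_seconds, value)` input shape, and the 60-second interval are all assumptions, and grouping by time series dimensions is left out for brevity.

```python
from collections import defaultdict


def downsample(points, interval_seconds):
    """Summarise (timestamp_seconds, value) pairs per fixed time interval."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Round the timestamp down to the start of its interval.
        bucket_start = timestamp - (timestamp % interval_seconds)
        buckets[bucket_start].append(value)

    # One summary document per interval, holding the statistics named above.
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "value_count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical metric samples for a single time series.
points = [(0, 1.0), (30, 3.0), (60, 2.0), (119, 6.0), (120, 5.0)]
summary = downsample(points, interval_seconds=60)

# The average can be derived from the stored sum and value_count.
first_avg = summary[0]["sum"] / summary[0]["value_count"]
```

Five raw samples collapse into three summary entries here, one per 60-second interval.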

Downsampling is superior to rollup because:

  • Downsampled indices are searched through the _search API
  • It is possible to query multiple downsampled indices together with raw data indices
  • The pre-aggregation is based on the metrics and time series definitions in the index mapping, so very little configuration is required (i.e. it is much easier to add new time series)
  • Downsampling is managed as an action in ILM
  • It is possible to downsample a downsampled index, and reduce granularity as the index ages
  • The performance of the pre-aggregation process is superior in downsampling, as it builds on the time_series index mode infrastructure

Because of the introduction of this new capability, we are deprecating the rollups functionality, which never left the Tech Preview/Experimental status, in favor of downsampling and thus we are closing this issue. We encourage you to migrate your solution to downsampling and take advantage of the new TSDB functionality.

@wchaparro wchaparro closed this as not planned Jun 23, 2023