Skip to content

add exclude_keys option to KeyValueProcessor #24876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 5, 2017

Conversation

talevy
Copy link
Contributor

@talevy talevy commented May 24, 2017

adds exclude_keys option, just like Logstash allows. Where, if included, a field will not show up in the resulting document. Attempts to preserve exact behavior.

Closes #23856

@talevy talevy added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v6.0.0 >enhancement labels May 24, 2017
@talevy talevy force-pushed the kv-exclude-keys branch from 1166b4c to 6362eb3 Compare May 25, 2017 00:11
@talevy talevy added the review label May 25, 2017
@talevy talevy requested a review from martijnvg May 25, 2017 16:21
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about the data-structure to use to look up keys.

.filter((p) -> includeKeys == null || includeKeys.contains(p[0]))
.filter((p) ->
(includeKeys == null || includeKeys.contains(p[0])) &&
(excludeKeys == null || excludeKeys.contains(p[0]) == false))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those contains calls have a linear complexity due to the fact we are using lists, let's use hash sets instead? (and maybe even automata as a follow-up?)

Copy link
Contributor Author

@talevy talevy May 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I suppose I assumed the cost is minimal since this list is not expected to be more than just a few items... but that is being multiplied by millions of documents, so good call

updated

@talevy talevy force-pushed the kv-exclude-keys branch from 6362eb3 to c077c5a Compare May 30, 2017 17:07
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
@talevy talevy force-pushed the kv-exclude-keys branch from c077c5a to c8142d9 Compare May 30, 2017 17:07
@talevy
Copy link
Contributor Author

talevy commented May 31, 2017

@jpountz thanks for the review! what do you think about the follow-up changes?

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@talevy talevy added the v5.5.0 label Jun 5, 2017
@talevy talevy merged commit e512460 into elastic:master Jun 5, 2017
@talevy talevy deleted the kv-exclude-keys branch June 5, 2017 21:12
talevy added a commit that referenced this pull request Jun 5, 2017
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 6, 2017
* master:
  Add support for clear scroll to high level REST client (elastic#25038)
  Tiny correction in inner-hits.asciidoc (elastic#25066)
  Added release notes for 6.0.0-alpha2
  Expand index expressions against indices only when managing aliases (elastic#23997)
  Collapse inner hits rest test should not skip 5.x
  Settings: Fix secure settings by prefix (elastic#25064)
  add `exclude_keys` option to KeyValueProcessor (elastic#24876)
  Test: update missing body tests to run against versions >= 5.5.0
  Track EWMA[1] of task execution time in search threadpool executor
  Removes an invalid assert in resizing big arrays which does not always hold (resizing can result in a smaller size than the current size, while the assert attempted to verify the new size is always greater than the current).
  Fixed NPEs caused by requests without content. (elastic#23497)
  Plugins can register pre-configured char filters (elastic#25000)
  Build: Allow preserving shared dir (elastic#24962)
  Tests: Make secure settings available from settings builder for tests (elastic#25037)
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 6, 2017
* master: (1210 commits)
  Add support for clear scroll to high level REST client (elastic#25038)
  Tiny correction in inner-hits.asciidoc (elastic#25066)
  Added release notes for 6.0.0-alpha2
  Expand index expressions against indices only when managing aliases (elastic#23997)
  Collapse inner hits rest test should not skip 5.x
  Settings: Fix secure settings by prefix (elastic#25064)
  add `exclude_keys` option to KeyValueProcessor (elastic#24876)
  Test: update missing body tests to run against versions >= 5.5.0
  Track EWMA[1] of task execution time in search threadpool executor
  Removes an invalid assert in resizing big arrays which does not always hold (resizing can result in a smaller size than the current size, while the assert attempted to verify the new size is always greater than the current).
  Fixed NPEs caused by requests without content. (elastic#23497)
  Plugins can register pre-configured char filters (elastic#25000)
  Build: Allow preserving shared dir (elastic#24962)
  Tests: Make secure settings available from settings builder for tests (elastic#25037)
  [TEST] Skip wildcard expansion test due to breaking change
  Test that gradle and Java version types match (elastic#24943)
  Include duplicate jar when jarhell check fails
  Change ScriptContexts to use needs instead of uses$. (elastic#25036)
  Change `has_child`, `has_parent` queries and `childen` aggregation to work with the new join field type and at the same time maintaining support for the `_parent` meta field type.
  Remove comma-separated feature parsing for GetIndicesAction
  ...
jasontedor added a commit to zeha/elasticsearch that referenced this pull request Jun 6, 2017
* master: (619 commits)
  Add support for clear scroll to high level REST client (elastic#25038)
  Tiny correction in inner-hits.asciidoc (elastic#25066)
  Added release notes for 6.0.0-alpha2
  Expand index expressions against indices only when managing aliases (elastic#23997)
  Collapse inner hits rest test should not skip 5.x
  Settings: Fix secure settings by prefix (elastic#25064)
  add `exclude_keys` option to KeyValueProcessor (elastic#24876)
  Test: update missing body tests to run against versions >= 5.5.0
  Track EWMA[1] of task execution time in search threadpool executor
  Removes an invalid assert in resizing big arrays which does not always hold (resizing can result in a smaller size than the current size, while the assert attempted to verify the new size is always greater than the current).
  Fixed NPEs caused by requests without content. (elastic#23497)
  Plugins can register pre-configured char filters (elastic#25000)
  Build: Allow preserving shared dir (elastic#24962)
  Tests: Make secure settings available from settings builder for tests (elastic#25037)
  [TEST] Skip wildcard expansion test due to breaking change
  Test that gradle and Java version types match (elastic#24943)
  Include duplicate jar when jarhell check fails
  Change ScriptContexts to use needs instead of uses$. (elastic#25036)
  Change `has_child`, `has_parent` queries and `childen` aggregation to work with the new join field type and at the same time maintaining support for the `_parent` meta field type.
  Remove comma-separated feature parsing for GetIndicesAction
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement v5.5.0 v6.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants