Add enrich source field mapper. #42423
The enrich source field mapper stores the source of a document as binary doc values. This is useful when retrieval speed matters more than compact storage (compact storage is what SourceFieldMapper optimizes for), which is the case for the enrich processor. Prior to this change, the enrich processor used the _source stored field to fetch the enrich document that is used to enrich the document being ingested. When creating the enrich index, the enrich policy runner disables the _source meta field and enables the _enrich_source meta field. The enrich source field mapper is an internal field that is only meant to be used by the enrich feature. Relates to elastic#41521 and elastic#32789
Pinging @elastic/es-core-features
Running a benchmark using this track with the new meta field type already shows good performance improvements:
(the baseline is what is in the enrich branch, and the contender is what is in this PR)
It looks good. One thing I'm wondering is whether we can prevent users from using this field on their regular indices.
@Override
public SortedBinaryDocValues getBytesValues() {
    return new SortedBinaryDocValues() {
you can use FieldData#singleton(values)
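The suggestion above is to replace the hand-written anonymous SortedBinaryDocValues with a singleton wrapper around a single-valued source. The sketch below illustrates that idea only. It deliberately uses stand-in interfaces (SingleBinaryValues, MultiBinaryValues) and a hypothetical in-memory source (fromArray) instead of the real Lucene and Elasticsearch classes, so none of these names should be read as the actual API used in this PR.

```java
import java.nio.charset.StandardCharsets;

public class SingletonDocValuesSketch {

    /** Stand-in for a single-valued, per-document binary source (hypothetical, not Lucene's BinaryDocValues). */
    interface SingleBinaryValues {
        boolean advanceExact(int docId);
        byte[] binaryValue();
    }

    /** Stand-in for a multi-valued view (hypothetical, not Elasticsearch's SortedBinaryDocValues). */
    interface MultiBinaryValues {
        boolean advanceExact(int docId);
        int docValueCount();
        byte[] nextValue();
    }

    /** Wraps a single-valued source so callers see exactly one value for each document that has one. */
    static MultiBinaryValues singleton(SingleBinaryValues in) {
        return new MultiBinaryValues() {
            @Override
            public boolean advanceExact(int docId) {
                return in.advanceExact(docId);
            }

            @Override
            public int docValueCount() {
                return 1;
            }

            @Override
            public byte[] nextValue() {
                return in.binaryValue();
            }
        };
    }

    /** Hypothetical in-memory source backed by an array; null marks a document without a value. */
    static SingleBinaryValues fromArray(byte[][] perDoc) {
        return new SingleBinaryValues() {
            private int current = -1;

            @Override
            public boolean advanceExact(int docId) {
                current = docId;
                return docId >= 0 && docId < perDoc.length && perDoc[docId] != null;
            }

            @Override
            public byte[] binaryValue() {
                return perDoc[current];
            }
        };
    }

    public static void main(String[] args) {
        byte[][] docs = { "{\"a\":1}".getBytes(StandardCharsets.UTF_8), null };
        MultiBinaryValues values = singleton(fromArray(docs));
        if (values.advanceExact(0)) {
            // Document 0 has a stored source, so its single value can be read back.
            System.out.println(new String(values.nextValue(), StandardCharsets.UTF_8));
        }
        // Document 1 has no value, so advanceExact reports false.
        System.out.println(values.advanceExact(1));
    }
}
```

With a helper like this, the field data implementation only needs to supply the single-valued source, and the multi-valued boilerplate lives in one place.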
Thanks for reviewing @jpountz!
Perhaps enforcing that the index name should start with
👍
This change is no longer relevant.