Fix realtime get of nested fields with synthetic source (#119575) #120247

Merged (1 commit) on Jan 16, 2025

@dnhatn (Member) commented Jan 16, 2025

Today, get-from-translog operations reindex only the root document into an in-memory Lucene index, because the stored _source lives in the root document and is sufficient. However, synthesizing the source for nested fields requires both the root document and its child documents. As a result, with synthetic source, realtime-get operations (as well as update and update-by-query operations) miss nested fields.
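To make the failure mode concrete, here is a minimal toy model in Python. It is not Elasticsearch code: `LuceneDoc`, `to_lucene_block`, and `synthesize_source` are hypothetical names, and the layout (child documents followed by the root) is an assumption made for illustration. It only shows why a source rebuilt from the root document alone loses nested fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LuceneDoc:
    """Toy stand-in for one indexed document: a flat bag of field values."""
    fields: dict
    nested_path: Optional[str] = None  # None marks the root document


def to_lucene_block(source: dict, nested_paths: set) -> list:
    """Split a source into child documents followed by the root document."""
    root_fields = {}
    children = []
    for key, value in source.items():
        if key in nested_paths:
            # Each nested object becomes its own child document.
            children.extend(LuceneDoc(dict(obj), nested_path=key) for obj in value)
        else:
            root_fields[key] = value
    return children + [LuceneDoc(root_fields)]


def synthesize_source(docs: list) -> dict:
    """Rebuild a source from whichever documents were actually indexed."""
    source = {}
    for doc in docs:
        if doc.nested_path is None:
            source.update(doc.fields)
        else:
            source.setdefault(doc.nested_path, []).append(dict(doc.fields))
    return source


original = {"title": "post", "comments": [{"user": "a"}, {"user": "b"}]}
block = to_lucene_block(original, nested_paths={"comments"})

root_only = synthesize_source(block[-1:])  # only the root was reindexed
full_block = synthesize_source(block)      # root and children were reindexed
```

In this sketch `root_only` contains only `title`, while `full_block` matches the original document, which mirrors the difference between reindexing the root alone and reindexing the root together with its children.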

Another issue is that the translog operation is reindexed lazily during get-from-translog operations. As a result, two realtime-get operations can return slightly different outputs: one reading from the translog and the other from Lucene.
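The inconsistency can be sketched with another toy model, again not Elasticsearch code. `ToyEngine`, its `eager` flag, and `synthesize` are hypothetical, and the normalization applied by `synthesize` (sorting and deduplicating list values) is an assumption chosen only to make the two read paths visibly differ; the description above does not say which differences occur in practice.

```python
def synthesize(source: dict) -> dict:
    """Toy synthetic source: keys sorted, list values sorted and deduplicated."""
    out = {}
    for key in sorted(source):
        value = source[key]
        out[key] = sorted(set(value)) if isinstance(value, list) else value
    return out


class ToyEngine:
    """Hypothetical engine with a translog and a refreshed index."""

    def __init__(self, eager: bool):
        self.eager = eager
        self.translog = {}  # doc id -> source exactly as written
        self.lucene = {}    # doc id -> source as seen after a refresh

    def index(self, doc_id: str, source: dict) -> None:
        self.translog[doc_id] = source

    def refresh(self) -> None:
        # Move pending operations into the index, synthesizing their source.
        for doc_id, source in self.translog.items():
            self.lucene[doc_id] = synthesize(source)
        self.translog.clear()

    def realtime_get(self, doc_id: str) -> dict:
        if doc_id in self.translog:
            raw = self.translog[doc_id]
            # Lazy mode returns the translog source untouched; eager mode
            # reindexes first, so the result matches what a refresh yields.
            return synthesize(raw) if self.eager else raw
        return self.lucene[doc_id]


def get_before_and_after_refresh(eager: bool):
    engine = ToyEngine(eager=eager)
    engine.index("1", {"tags": ["b", "a", "b"]})
    before = engine.realtime_get("1")
    engine.refresh()
    after = engine.realtime_get("1")
    return before, after


lazy_before, lazy_after = get_before_and_after_refresh(eager=False)
eager_before, eager_after = get_before_and_after_refresh(eager=True)
```

In the lazy case the two reads differ; in the eager case they agree, at the cost of doing the synthesis on every read served from the translog.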

This change resolves both issues. However, addressing the second issue can degrade the performance of realtime-get and update operations. If slight inconsistencies are acceptable, the translog operation should be reindexed lazily instead.

Closes #119553

@dnhatn dnhatn added :StorageEngine/Mapping The storage related side of mappings >bug backport labels Jan 16, 2025
@dnhatn dnhatn marked this pull request as ready for review January 16, 2025 02:07
@dnhatn dnhatn merged commit 0d6eec8 into elastic:8.x Jan 16, 2025
15 checks passed
@dnhatn dnhatn deleted the 8.x-119575 branch January 16, 2025 02:07
Labels
backport >bug :StorageEngine/Mapping The storage related side of mappings v8.18.0