Skip to content

Provide a getFromTranslog method in ShardGetService #95736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

pxsalehi
Copy link
Member

@pxsalehi pxsalehi commented May 2, 2023

  • AgetFromTranslog in ShardGetService that shortcuts a get once not found in the translog.
  • Track the last unsafe segment generation in InternalEngine.

These two together would be used in a TransportGetFromTranslogAction in the real-time get PR. The aim is to return either a result from the index shard's translog, or if not found, return the minimum segment generation that the search shard to wait for before handling the get request locally. Since the index shard might not have a result for the get in its translog due to being in the append-only mode (which the get request would temporarily disable), we need to return the generation where we have switched to a safe live version map.

Relates ES-5999

@pxsalehi pxsalehi added WIP :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. labels May 2, 2023
@pxsalehi pxsalehi requested review from henningandersen and tlrx May 2, 2023 13:06
@pxsalehi
Copy link
Member Author

pxsalehi commented May 2, 2023

I would like to get some initial feedback if this is how we'd like the approach to look like, before polishing the code or adding tests.
This could be potentially broken down into two PRs, if desired.

@pxsalehi pxsalehi requested a review from Tim-Brooks May 2, 2023 14:55
@pxsalehi pxsalehi removed the WIP label May 3, 2023
@pxsalehi pxsalehi marked this pull request as ready for review May 3, 2023 10:05
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 3, 2023
Copy link
Member Author

@pxsalehi pxsalehi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen @Tim-Brooks @tlrx This is now cleaned up and I've added test too. Ready for review! :-)

@@ -754,59 +757,90 @@ public GetResult get(
try (ReleasableLock ignored = readLock.acquire()) {
ensureOpen();
if (get.realtime()) {
final VersionValue versionValue;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I've done here is to break this down into two pieces. Alternative would be probably a flag to shortcut this, but I'm not sure it makes this more readable. Open for suggestions.

// we need to lock here to access the version map to do this truly in RT
versionValue = getVersionFromMap(get.uid().bytes());
var result = realtimeGetUnderLock(get, mappingLookup, documentParser, searcherWrapper);
if (result != null) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normal get, as before, never returns null.

}
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be fine to return an Optional.empty() here and in upstream methods; this would maybe help to distinguish the case when the doc is not found in the translog vs when it's found but deleted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional would be also an option! I just wanted to reduce non-essential changes to the Engine interface since that would change the return type in the upstream methods. The null return value is not affecting the normal Engine.get path. I will add some asserts for it, to make sure. (Alternatively, we could change the return value of get and getFromTranslog to an Optional).

@pxsalehi pxsalehi requested a review from tlrx May 4, 2023 09:59
}
assert versionValue.seqNo >= 0 : versionValue;
refreshIfNeeded("realtime_get", versionValue.seqNo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this refresh needed when the caller is getFromTranslog? I have the impression that we could avoid it but maybe I am missing a point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is needed since we read from the internal searcher and we need to reload the reader so we can serve the get. The branching is a bit awkward there, I admit. I'll try to improve that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlrx I've pushed 8a973f9. Maybe that makes this clearer.

@pxsalehi pxsalehi requested a review from tlrx May 4, 2023 14:14
Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pxsalehi
Copy link
Member Author

pxsalehi commented May 5, 2023

@elasticmachine update branch

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@pxsalehi pxsalehi added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label May 8, 2023
@elasticsearchmachine elasticsearchmachine merged commit f02e9f9 into elastic:main May 8, 2023
@pxsalehi pxsalehi deleted the ps230502-sharGetFromTranslog branch May 8, 2023 12:53
elasticsearchmachine pushed a commit that referenced this pull request May 16, 2023
This is a change broken off from the ongoing real-time GET PR
(#93976).  It is just an
action that can be used to invoke the new
`ShardGetService.getFromTranslog`  in
#95736. It will be used on
the search shards as a first step to handle a real-time get.

Relates #93976, ES-5537
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants