Optimize translog writes by moving the older file deletes in trimUnreferencedReaders call outside the write lock. #55530

Closed
itiyamas opened this issue Apr 21, 2020 · 4 comments
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard.

Comments

@itiyamas

The trimUnreferencedReaders call closes and removes old reader files and then updates the checkpoint of the current writer to change the minimum translog generation. Because the reader references change, we make this call under a writeLock to maintain consistency. However, the actual file deletes and file channel closes need not happen inside the write lock once the references are switched and the checkpoint is updated. recoverFromFiles can take care of any leftovers by deleting the older translog files. Also, the sync can be done once for a batch of files instead of multiple times. Together, these changes reduce the total time the writeLock is held.
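
The idea above can be sketched roughly as follows. This is a minimal illustration, not Elasticsearch's actual `Translog` code: `DeferredTrimSketch`, `Reader`, `minGeneration`, and `checkpointSyncs` are hypothetical stand-ins, and setting a `closed` flag takes the place of the real channel close and file delete.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a translog; not the real Elasticsearch class. */
class DeferredTrimSketch {

    /** Hypothetical reader: one per translog generation. */
    static final class Reader {
        final long generation;
        boolean closed = false; // stands in for channel close + file delete

        Reader(long generation) {
            this.generation = generation;
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Reader> readers = new ArrayList<>();
    long minGeneration = 0;  // stands in for the value written to the checkpoint
    int checkpointSyncs = 0; // counts simulated checkpoint syncs

    void addReader(long generation) {
        lock.writeLock().lock();
        try {
            readers.add(new Reader(generation));
        } finally {
            lock.writeLock().unlock();
        }
    }

    int liveReaderCount() {
        lock.readLock().lock();
        try {
            return readers.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Detaches readers below minReferencedGen under the lock, releases them after. */
    List<Reader> trimUnreferencedReaders(long minReferencedGen) {
        List<Reader> toRelease = new ArrayList<>();
        lock.writeLock().lock();
        try {
            // Only the reference switch happens while the write lock is held.
            for (Iterator<Reader> it = readers.iterator(); it.hasNext(); ) {
                Reader reader = it.next();
                if (reader.generation < minReferencedGen) {
                    it.remove();
                    toRelease.add(reader);
                }
            }
            if (!toRelease.isEmpty()) {
                // One checkpoint update and one sync for the whole batch.
                minGeneration = minReferencedGen;
                checkpointSyncs++;
            }
        } finally {
            lock.writeLock().unlock();
        }
        // Slow I/O (close, delete) runs after the lock is released.
        for (Reader reader : toRelease) {
            reader.closed = true;
        }
        return toRelease;
    }

    public static void main(String[] args) {
        DeferredTrimSketch translog = new DeferredTrimSketch();
        for (long gen = 1; gen <= 5; gen++) {
            translog.addReader(gen);
        }
        int released = translog.trimUnreferencedReaders(4).size();
        System.out.println("released=" + released + " syncs=" + translog.checkpointSyncs);
    }
}
```

In this sketch, a crash between the lock release and the deferred cleanup would leave stale files on disk, which is why the proposal relies on recovery to remove them.
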
I ran an experiment on ES version 7.4 on a single-node i3.8xl. I used the nyc_taxis rally benchmark with a refresh interval of 30 seconds and 50 bulk clients.

Here are the results:
The total writeLock hold time dropped from roughly 800-1000 ms to roughly 20 ms per minute. This is on an i3.8xl instance, which has a fast disk; I expect the improvement to be larger on slower disks.

Overall indexing throughput improved by close to 1% just from moving the delete and channel close calls outside of the lock. The sync was also done just once instead of multiple times, since we are within the writeLock anyway.

Let me know if this makes sense.

@itiyamas
Author

This is the change that I tested. I have not made changes to the recovery part, though: https://gist.github.com/itiyamas/0568a4af2b97ac5870cc5f9c3dd7679a

@dnhatn
Member

dnhatn commented Apr 21, 2020

Hi @itiyamas,

Thank you for your interest in Elasticsearch.

We discussed this idea before and decided not to pursue it because the cleanup logic will be more complicated and less safe.

Relates to #46203

@dnhatn dnhatn added the :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. label Apr 21, 2020
@dnhatn dnhatn closed this as completed Apr 21, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Engine)

@itiyamas
Author

Sorry for not going through it earlier, but this proposal is a bit different from the earlier one.

Here, the sync is not moved outside the write lock; it is done just once within the writeLock. The sync is followed by file deletes outside the write lock. Executing the sync outside the writeLock is less safe and more complicated because you need to handle a lot of edge cases, but here you just delete a contiguous range of translog files from minGen, which is not complicated in my opinion.
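
To illustrate the cleanup step, here is a rough sketch of deleting stale generation files once the minimum generation is known. It assumes files named `translog-<gen>.tlog` and `translog-<gen>.ckp`, and reads "from minGen" as "every generation below minGen"; both are assumptions made for this example. `StaleTranslogCleanupSketch` and `deleteGenerationsBelow` are hypothetical names, not part of Elasticsearch or of the linked gist.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; the file naming scheme is an assumption for illustration. */
class StaleTranslogCleanupSketch {

    // Matches names such as translog-12.tlog or translog-12.ckp.
    private static final Pattern GEN_FILE = Pattern.compile("translog-(\\d+)\\.(tlog|ckp)");

    /** Deletes every generation file below minGen and returns how many were removed. */
    static int deleteGenerationsBelow(Path dir, long minGen) throws IOException {
        // Collect candidates first so nothing is deleted while the directory is open.
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "translog-*")) {
            for (Path file : stream) {
                Matcher m = GEN_FILE.matcher(file.getFileName().toString());
                if (m.matches() && Long.parseLong(m.group(1)) < minGen) {
                    stale.add(file);
                }
            }
        }
        int deleted = 0;
        for (Path file : stale) {
            // deleteIfExists keeps the cleanup idempotent if it is retried after a crash.
            if (Files.deleteIfExists(file)) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Because the helper only needs the directory and the minimum generation, the same routine could in principle run both right after the lock is released and again during recovery.
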


3 participants