Inconsistent segment stats with soft-deletes #43103

Closed
ywelsch opened this issue Jun 11, 2019 · 4 comments
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement

Comments

@ywelsch
Contributor

ywelsch commented Jun 11, 2019

Segment stats are currently inconsistent when it comes to (soft-)deletes. The number of deletes is counted differently for non-searchable committed segment infos than for searchable ones; for example, soft-deletes are not treated as deletes in committed segment infos.

We should make sure to have segment infos record the number of soft deletes as well, so that we return consistent numbers, and preferably also return information regarding the count of soft-deletes vs hard-deletes in the API.
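To make the inconsistency concrete, here is a minimal sketch of the two counting styles described above. It is not Elasticsearch or Lucene code: the `SegmentCounts` class, its fields, and both helper methods are hypothetical names introduced only for illustration, and the "searchable" branch assumes that the searchable view already counts soft-deleted documents as deleted, as the description implies.

```java
import java.util.Arrays;
import java.util.List;

public class SegmentDeleteCounts {

    // Hypothetical per-segment counters (illustrative only, not a Lucene/Elasticsearch type).
    static final class SegmentCounts {
        final String name;
        final int maxDoc;       // all documents in the segment, live or deleted
        final int hardDeletes;  // documents removed via hard deletes
        final int softDeletes;  // documents marked as soft-deleted

        SegmentCounts(String name, int maxDoc, int hardDeletes, int softDeletes) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.hardDeletes = hardDeletes;
            this.softDeletes = softDeletes;
        }
    }

    // Inconsistent style: the committed (non-searchable) view ignores soft-deletes,
    // while the searchable view counts them.
    static int inconsistentDeleted(SegmentCounts s, boolean searchable) {
        return searchable ? s.hardDeletes + s.softDeletes : s.hardDeletes;
    }

    // Consistent style: both views count hard-deletes plus soft-deletes.
    static int consistentDeleted(SegmentCounts s) {
        return s.hardDeletes + s.softDeletes;
    }

    public static void main(String[] args) {
        List<SegmentCounts> segments = Arrays.asList(
            new SegmentCounts("_0", 10, 0, 3),
            new SegmentCounts("_1", 5, 1, 0));
        for (SegmentCounts s : segments) {
            System.out.println(s.name
                + " committed(old)=" + inconsistentDeleted(s, false)
                + " searchable(old)=" + inconsistentDeleted(s, true)
                + " consistent=" + consistentDeleted(s));
        }
    }
}
```

With the consistent variant, the same segment reports the same delete count whether it is read from the committed or the searchable view, and keeping the two counters separate leaves room to report them individually.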

@ywelsch ywelsch added >enhancement :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. labels Jun 11, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@dnhatn
Member

dnhatn commented Jun 11, 2019

I opened #43126 to propose the fix for the delete count issue.

@dnhatn
Member

dnhatn commented Jun 13, 2019

> and preferably also return information regarding the count of soft-deletes vs hard-deletes in the API.

At first glance, these numbers would be useful. However, soft-deletes and hard-deletes are mutually exclusive. If soft-deletes are enabled, we won't have hard-deletes (except in situations where we hit non-abortable exceptions during indexing), and vice versa.

dnhatn added a commit that referenced this issue Jun 14, 2019
This change fixes the delete count issue in segment stats where we don't
account for soft-deleted documents from committed segments.

Relates #43103
@ywelsch
Contributor Author

ywelsch commented Jun 17, 2019

ok, sounds good.

@ywelsch ywelsch closed this as completed Jun 17, 2019