Empty scroll contexts don't count #86407

Open
nik9000 opened this issue May 3, 2022 · 5 comments
Labels
>enhancement, :Search Foundations/Search (Catch all for Search Foundations), Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)

Comments

@nik9000
Member

nik9000 commented May 3, 2022

Description

Right now, if you have one thousand shards and hit them all with _scroll, it'll always bump into the scroll limit, even if most of those shards don't have any matching documents. It'd be lovely if we could "not count" shards without any data. I don't think we need to keep any state on those shards.

@nik9000 nik9000 added >enhancement :Search/Search Search-related issues that do not fall into other categories team-discuss labels May 3, 2022
@elasticmachine elasticmachine added the Team:Search Meta label for search team label May 3, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@ywelsch
Contributor

ywelsch commented Sep 23, 2022

With PIT being favored over scroll these days, I'm wondering whether this is still worth addressing.

@nik9000
Member Author

nik9000 commented Sep 23, 2022

If we're actively working to move off of scroll that's probably fine. Do we have similar PIT limits? I do think we'd have to, say, migrate reindex off of scroll and onto PIT before I'd feel good about ignoring this.

@hchargois
Contributor

I've noticed a behavior that I think is linked to what is described in this issue. If I'm mistaken, sorry, feel free to disregard or move to a new issue.

When scrolling with slices, it seems that the number of created contexts is num_slices x num_shards. For example if an index has 10 shards and we scroll with 10 slices, then we get 100 open contexts.

This is surprising to me since the documentation says that slices are first distributed among shards; so as long as num_slices <= num_shards I would expect that each slice should only need to keep a context for the shard(s) that it targets and not for any other; so that overall only num_shards contexts are actually useful.

I don't have any actual knowledge of the internals of slices and scroll contexts, so this is mostly intuition; maybe I'm completely wrong and all the contexts are actually required.

But anyway, even if my understanding is wrong, the effects are very real. If I have an index with 100 shards and I want to scroll it with 100 slices (as it seems logical to do), then 10k contexts are created on the cluster, and even with 20 nodes that averages 500 contexts per node, which hits the default limit of 500 open contexts/node (and goes over it as soon as the shards are unevenly spread or any other scroll is open).
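To make the arithmetic concrete, here is a rough Python sketch of the counts described above. It only models the behavior as observed (one context per slice per shard) against the intuition that `num_shards` contexts should be enough when `num_slices <= num_shards`; it says nothing about how Elasticsearch tracks contexts internally. The function names are made up for illustration, the even spread of shards across nodes is an assumption, and the limit of 500 is the per-node default quoted in this comment.

```python
import math

# Per-node default quoted in the comment above (assumed unchanged on the cluster).
DEFAULT_MAX_OPEN_CONTEXTS_PER_NODE = 500


def observed_contexts(num_shards: int, num_slices: int) -> int:
    """Contexts opened as observed: every slice keeps a context on every shard."""
    return num_shards * num_slices


def intuitive_contexts(num_shards: int, num_slices: int) -> int:
    """Contexts one would expect if each slice only touched its own shard(s).

    Only meaningful for num_slices <= num_shards, the case discussed above.
    """
    if num_slices > num_shards:
        raise ValueError("sketch only covers num_slices <= num_shards")
    return num_shards


def contexts_on_busiest_node(total_contexts: int, num_nodes: int) -> int:
    """Lower bound for the busiest node, assuming an even spread across nodes."""
    return math.ceil(total_contexts / num_nodes)


if __name__ == "__main__":
    for shards, slices, nodes in [(10, 10, 1), (100, 100, 20)]:
        total = observed_contexts(shards, slices)
        per_node = contexts_on_busiest_node(total, nodes)
        print(
            f"shards={shards} slices={slices} nodes={nodes}: "
            f"observed={total}, intuitive={intuitive_contexts(shards, slices)}, "
            f"per node>={per_node}, "
            f"at or over limit={per_node >= DEFAULT_MAX_OPEN_CONTEXTS_PER_NODE}"
        )
```

With an even spread, the 100 x 100 case lands exactly on the limit, so any imbalance between nodes or any other open scroll is enough to push a node over it.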

@javanna javanna added :Search Foundations/Search Catch all for Search Foundations and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 17, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)


6 participants