Search: Expose Lucene's searchAfter in the search API #8192

Closed
jpountz opened this issue Oct 22, 2014 · 2 comments
Labels
>feature :Search/Search Search-related issues that do not fall into other categories v5.0.0-alpha1

Comments

@jpountz
Contributor

jpountz commented Oct 22, 2014

We already integrated IndexSearcher.searchAfter in #4940 to make deep pagination more efficient with the scroll API. However, in quite a number of cases the scroll API is not an option: it is heavy, since a scroll context must be associated with each search request and cleared when it is no longer needed.

So if you have a user-facing application that needs deep pagination, you are stuck: performance is terrible because of the deep pagination, and you cannot really use the scroll API, since scroll contexts are costly and users typically don't explicitly tell the application when they no longer need the context.

A middle ground could be to allow configuring an array of sort values, and we would only search after these sort values. Compared to the scroll API, the downside is that requests would not always see the same point-in-time view of a shard, so you can miss documents because of deletes or see documents twice because of insertions, but you already have this issue when paginating using from/size. On the other hand, performance could be much better, since each shard would only need to manage a smaller priority queue.
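To illustrate the priority-queue argument, here is a minimal in-memory sketch in Python. It is a simulation only, not Elasticsearch or Lucene code: the `shards` data, the `uid` and `price` fields, and all function names are hypothetical. With from/size each shard has to collect `from + size` hits, while with search-after each shard only collects `size` hits that sort strictly after the last hit of the previous page.

```python
import heapq

def sort_key(doc):
    # Hypothetical sort specification: price first, then a unique id as tie-breaker.
    return (doc["price"], doc["uid"])

def shard_from_size(shard, from_, size):
    # from/size: the shard cannot know which hits the coordinator will skip,
    # so it must return its top (from_ + size) hits.
    return heapq.nsmallest(from_ + size, shard, key=sort_key)

def shard_search_after(shard, after, size):
    # search-after: only hits sorting strictly after `after` compete,
    # so the shard only needs to keep `size` hits.
    candidates = (d for d in shard if after is None or sort_key(d) > after)
    return heapq.nsmallest(size, candidates, key=sort_key)

def page_from_size(shards, from_, size):
    hits = [d for s in shards for d in shard_from_size(s, from_, size)]
    return sorted(hits, key=sort_key)[from_:from_ + size]

def page_search_after(shards, after, size):
    hits = [d for s in shards for d in shard_search_after(s, after, size)]
    return sorted(hits, key=sort_key)[:size]

# Two made-up shards.
shards = [
    [{"uid": "a", "price": 5}, {"uid": "b", "price": 1}, {"uid": "c", "price": 3}],
    [{"uid": "d", "price": 3}, {"uid": "e", "price": 2}, {"uid": "f", "price": 8}],
]
size = 2

# Third page via from/size: every shard collects from_ + size = 6 hits.
third_page_deep = page_from_size(shards, from_=4, size=size)

# Third page via search-after: every shard collects at most 2 hits per request.
after = None
for _ in range(3):
    page = page_search_after(shards, after, size)
    after = sort_key(page[-1])  # sort values of the last hit feed the next request
third_page_after = page
```

On static data both approaches return the same third page. The difference is the per-shard work, which stays proportional to `size` for search-after instead of growing with the page depth.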

NOTE: in order for this feature to work well with pagination, the _uid should be used as the last element of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. (Lucene defines it using doc ids, but this doesn't work with Elasticsearch because we do not always query the same shard, and if a merge happens between two requests, doc ids could be reordered.)
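The need for a unique tie-breaker can be seen in a small Python sketch. Again this is an in-memory simulation under assumptions, not Elasticsearch code: the `docs` list, the `uid` field (standing in for _uid), and the helper names are hypothetical. When two documents share the same sort value and a page boundary falls between them, searching strictly after that value skips the second document unless a unique last sort element distinguishes them.

```python
def next_page(docs, sort_key, after, size):
    # Keep only documents that sort strictly after the previous page's last hit.
    remaining = [d for d in docs if after is None or sort_key(d) > after]
    return sorted(remaining, key=sort_key)[:size]

def by_price(doc):
    # Sort values without a tie-breaker.
    return (doc["price"],)

def by_price_uid(doc):
    # Same sort, with a unique id as the last element.
    return (doc["price"], doc["uid"])

def all_uids(docs, sort_key, size):
    # Walk through every page and record the ids that were returned.
    seen, after = [], None
    while True:
        page = next_page(docs, sort_key, after, size)
        if not page:
            return seen
        seen += [d["uid"] for d in page]
        after = sort_key(page[-1])

# Made-up documents; "c" and "d" share the same price.
docs = [
    {"uid": "b", "price": 1},
    {"uid": "e", "price": 2},
    {"uid": "c", "price": 3},
    {"uid": "d", "price": 3},
    {"uid": "a", "price": 5},
]

without_tiebreak = all_uids(docs, by_price, size=3)   # "d" is never returned
with_tiebreak = all_uids(docs, by_price_uid, size=3)  # every document is returned once
```

In this sketch the first page ends on "c" with price 3. Without the tie-breaker the next request asks for prices strictly greater than 3, so "d" is lost.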

@nik9000
Member

nik9000 commented Oct 22, 2014

Neat!

@clintongormley
Contributor

Related to #7881
