search_after unexpected/undocumented behaviour #34232
Comments
Pinging @elastic/es-search-aggs
@jimczi, could you have a look at this?
This is the expected behavior. The sort values are used to filter documents that compare smaller (or greater depending on the order of the sort). If you want to ensure that the response starts exactly at the
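To illustrate the comparison-based behaviour described in the reply above, here is a minimal sketch in plain Python. It is not Elasticsearch code: the function name `search_after`, the sample ids, and the use of `bisect` on an ascending list of strings are all assumptions made for illustration. It only models the idea that the supplied value acts as a cursor in the sort order rather than as an id to look up.

```python
from bisect import bisect_right


def search_after(sorted_ids, after_value, size=10):
    """Return up to `size` ids that sort strictly after `after_value`.

    Hypothetical model of an ascending sort on a single string field:
    `after_value` does not have to exist in `sorted_ids`.
    """
    start = bisect_right(sorted_ids, after_value)
    return sorted_ids[start:start + size]


# Illustrative ids only, sorted ascending as a string sort would order them.
ids = sorted(["test12345", "test45678", "zebra", "apple"])

# "test" is not an existing id, yet it is still a valid cursor position.
page_from_prefix = search_after(ids, "test")

# An exact id skips that document and everything that sorts before it.
page_from_exact = search_after(ids, "test12345")
```

Under this model a value that matches no document still returns a page, which is consistent with a partial `_id` appearing to "match" in the report.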
Sorry for the late reply! And thank you for answering so quickly.
Yes, the reason I was encountering confusion and/or issues is that the same module is being included into multiple classes. It wasn't an issue as long as the models we were using were only relying on a primary key. However, it became one as soon as we introduced an Elasticsearch index for a model using a composite key (DynamoDB).
Let's say those keys are called The common module we were using was running the All tests were passing since it was correctly paginating even though the actual Not long after I noticed reviewing the code that I was using In fact it was because the So if we had a document like:
and we ran a I'm not saying this behaviour is wrong, but it would have saved me so much time if it had been documented. It doesn't strike me as a straightforward and obvious way of working, so I think there should be a couple of words about it in the documentation. Overall I think you did a great job with Elasticsearch, and I intend to help you improve it, even by a small margin, if possible. Also, sorry if I sound repetitive and overly specific; I'm just trying to give you the best explanation of the situation to help you understand. Thank you and let me know!
Thanks, your help is welcome! I agree that we could add a small note regarding how we handle the provided sort values in
I think the most fitting place for such information would be this doc page. Do you think it would be possible? However, if we are talking about this repo I'll create a PR in the next couple of days if I find a fitting place. Thank you
Yes, that's possible. The doc page is in this repo; you can find it here and create a PR to modify it.
I created a PR. Let me know.
Elasticsearch version (curl localhost:9200):
"version" : {
  "number" : "6.2.4",
  "build_hash" : "ccec39f",
  "build_date" : "2018-04-12T20:37:28.497551Z",
  "build_snapshot" : false,
  "lucene_version" : "7.2.1",
  "minimum_wire_compatibility_version" : "5.6.0",
  "minimum_index_compatibility_version" : "5.0.0"
}
Plugins installed: []
JVM version (java -version):
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
OS version (uname -a if on a Unix-like system):
Linux ********* 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
When using the search_after API, it doesn't look for an exact match on _id but for a partial one.
More specific example:
I've set a query like so (ruby):
So I'd expect Elasticsearch to look for a record with an _id of fd5e06f3-ded0-4cc1-8dc5-f798f3165ca2.
However using:
OR
(respectively the first part and the last part of the UUID), the same result is returned, at least for my data.
Basically, what I understand is that the matching between the _id and the value I provided for the search_after API is not an exact (boolean) match, but rather a partial one. If the string I'm passing fits completely into an _id of one of the records, then the results after that specific record get yielded.
The expected behaviour is exact (boolean) matching rather than partial matching. Therefore only the record whose _id is exactly the value I'm passing should be used as a starting point.
Steps to reproduce:
Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.
_id value being equal; and the same timestamp. For example "test12345" and "test45678" with a timestamp of your choice.
You'll see the yielded results start right after one of the documents whose _id starts with "test"; however, none of the documents exactly matches the given _id.
This has been quite a hassle for me. It might not even be undesired behaviour, but it is surely not documented, or at least I couldn't find it.
Provide logs (if relevant):
I'm not sure if this is really an issue but I'd like to know if this is an expected behaviour or rather a lacking documentation issue. Either way, let me know.
Thank you