Search preference custom string is ignored when request is sent to a non-client node #53054
Thanks for the report @raphaelchampeimont. This does indeed sound like surprising behaviour that disagrees with the documentation. You can use the profile API to determine exactly which shard copies are used by a particular search. I think this might help you come up with a way to reproduce this on a smaller index without needing to rely on discrepancies in the results. The expectation is that the same preference value always selects the same shard copies, regardless of which node coordinates the search. Do all the nodes have the same configuration?
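For reference, a minimal sketch of such a check with the profile API; the host, index name and preference value below are assumptions, not taken from the reporter's setup:

```python
# Run one search with a custom preference string and profiling enabled, then
# print which shard copies actually executed it.
import requests

ES = "http://localhost:9200"      # assumed coordinating node address
INDEX = "my-index"                # assumed index name
PREFERENCE = "5d2a1ee677059bc926f7a39492f11a47"  # the custom string under test

resp = requests.post(
    f"{ES}/{INDEX}/_search",
    params={"preference": PREFERENCE},
    json={"profile": True, "query": {"match_all": {}}},
)
resp.raise_for_status()

# Each profiled shard id has the form "[nodeId][indexName][shardNumber]",
# which identifies the exact shard copy that ran the query.
for shard in resp.json()["profile"]["shards"]:
    print(shard["id"])
```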
Pinging @elastic/es-search (:Search/Search)
Almost. The amount of memory allocated is different. Here is the Ansible configuration we use:
I will try to reproduce this on a minimal cluster.
Thanks, I see the issue. Coordinating nodes behave differently depending on whether allocation awareness attributes are configured; see server/src/main/java/org/elasticsearch/cluster/routing/OperationRouting.java, lines 245 to 249 at commit 99ac2c0.

The search respects the custom preference string only when the coordinating node has no allocation awareness attributes configured. As a workaround in earlier versions, if you always use coordinating nodes that do not have those awareness attributes set, the custom preference string is respected.
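A rough way to observe this is to send the same profiled search to two different coordinating nodes and compare which shard copies executed it; the node addresses and index name in this sketch are assumptions:

```python
# The node that receives the HTTP request coordinates the search, so comparing
# profiled shard ids across nodes shows whether the same preference string
# maps to the same shard copies on both.
import requests

NODES = ["http://node-zone-a:9200", "http://node-zone-b:9200"]  # assumed addresses
INDEX = "my-index"
PREFERENCE = "same-custom-string-for-both-requests"

def copies_used(node_url: str) -> set:
    resp = requests.post(
        f"{node_url}/{INDEX}/_search",
        params={"preference": PREFERENCE},
        json={"profile": True, "query": {"match_all": {}}},
    )
    resp.raise_for_status()
    # Profile shard ids look like "[nodeId][index][shard]".
    return {shard["id"] for shard in resp.json()["profile"]["shards"]}

copies_a, copies_b = (copies_used(n) for n in NODES)
if copies_a == copies_b:
    print("both coordinating nodes picked the same shard copies")
else:
    print("different shard copies were picked:", copies_a ^ copies_b)
```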
Pinging @elastic/es-docs (>docs)
I opened #74856 to backport the fix that was applied in 7.x, but I was told that adding new cluster settings is not something that's typically accepted. Unfortunately, our setup does not have client nodes, and adding them just for this workaround would not be feasible. We are currently relying on the custom preference string, and given that it turned out there is no ES6 replacement for the 7.x fix, we are still looking for a workable option.
Unless I misunderstood something, #46375 is merged in ES 7.5, so you could upgrade to ES7, set es.routing.search_ignore_awareness_attributes to true, and use a custom preference string based on a unique ID for each search as I explained in my first post (or any ID which remains identical for a given search but still balances the load on average, like a user ID, session ID, IP address...). And if you later upgrade to ES8, you can get rid of es.routing.search_ignore_awareness_attributes since that becomes the default behavior.
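A sketch of that approach with made-up names (host, index, session identifier are all assumptions): derive the preference from an identifier that stays constant for a given user search, so repeated requests of that search hit the same shard copies while different sessions still spread the load.

```python
import hashlib
import requests

ES = "http://localhost:9200"   # assumed coordinating node
INDEX = "my-index"             # assumed index name

def search(query: dict, session_id: str) -> dict:
    # Any stable string works as a custom preference; hashing merely normalises it.
    preference = hashlib.sha1(session_id.encode("utf-8")).hexdigest()
    resp = requests.post(
        f"{ES}/{INDEX}/_search",
        params={"preference": preference},
        json={"query": query},
    )
    resp.raise_for_status()
    return resp.json()

# Same session id -> same preference -> same routing (once awareness attributes
# no longer override the custom string, i.e. on ES8 or with the 7.x workaround).
results = search({"match": {"title": "appeal"}}, session_id="user-42-session-7")
print([hit["_id"] for hit in results["hits"]["hits"]])
```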
The issue is that we cannot upgrade to ES7 while still relying on the custom preference string. To add some more nuance, we are also using cross-cluster searches, and there will be a period of time in which both ES6 and ES7 versions will be used at the same time.
If you use zone-aware shard allocation and have exactly as many zones as shard copies (for instance 2 zones and replica=1 for all your indices), you could use _prefer_nodes, which exists in both ES6 and ES7, and list all nodes from one zone. This would force searches to use the shard copies in that zone, while keeping failover to the other copies if a preferred one fails. It would however put all the load on half of your cluster until you finish the migration to ES7, so it might not be acceptable from a performance point of view.

If you don't use zone-based shard allocation, I think the best option is to turn some of your nodes (2 for redundancy, I would advise) into client nodes and send all requests to their IP addresses. If you have never used client nodes, don't worry: the CPU load on them will be low (the hard work is performed by the data nodes), so you don't need as many of them as data nodes. To give you an order of magnitude, on our cluster the client nodes represent less than 10% of the CPU cores in the cluster.

Of course, a last and rather obvious option would be to set replica = 0 on all your indices, deploy code in production that uses a preference string, then upgrade to ES7 and set replica back to 1 (or more). But unlike the other solutions above, this means temporarily losing redundancy, so it is probably not acceptable for you.
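For completeness, a sketch of the _prefer_nodes variant described above; the node ids and index name are placeholders, not values from this thread:

```python
# Searches prefer shard copies on the listed nodes (e.g. all nodes of one zone)
# and fall back to other copies only if a preferred copy is unavailable.
import requests

ES = "http://localhost:9200"
INDEX = "my-index"
ZONE_A_NODE_IDS = ["nodeId1", "nodeId2", "nodeId3"]   # node ids, not node names

resp = requests.post(
    f"{ES}/{INDEX}/_search",
    params={"preference": "_prefer_nodes:" + ",".join(ZONE_A_NODE_IDS)},
    json={"query": {"match_all": {}}},
)
resp.raise_for_status()
print([hit["_id"] for hit in resp.json()["hits"]["hits"]])
```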
When selecting replicas for a search, the coordinating node prefers nodes with the same shard allocation awareness attributes as itself. So even if a search specifies the same custom preference value, different coordinating nodes may route it differently because of their awareness attributes. In 8.0, allocation awareness attributes no longer influence search replica selection. Although this is a bug, we do not intend to fix it in 7.x or 6.x; instead, we document the behavior with a warning and mention a system property that can be used to disable it. Addresses #53054.
I updated the documentation in 6.8 and 7.x to warn about this issue (#83818). I'm going to close this out because we don't have further work planned. Thank you again for reporting this and for your flexibility in exploring workarounds.
Elasticsearch version (bin/elasticsearch --version): 6.6.1

Plugins installed:

JVM version (java -version):

OS version (uname -a if on a Unix-like system):

Description of the problem including expected versus actual behavior:
We have developed a legal search engine based on Elasticsearch and we are using the "preference" parameter (https://www.elastic.co/guide/en/elasticsearch/reference/6.6/search-request-preference.html) to ensure that the order of the results does not change.
Our cluster is composed of 2 zones (called A and B), each zone having its own set of nodes.
The Elasticsearch cluster is configured with allocation awareness for the zones (which correspond to AWS availability zones).
Our index has 12 primary shards and 1 replica, so each shard has exactly one copy in zone A and one copy in zone B (to be able to survive a complete zone outage).
We use a custom string as a preference (it is basically a UUID generated for each unique user search, e.g. 5d2a1ee677059bc926f7a39492f11a47).
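For illustration, a minimal sketch of this usage pattern (the host, index name and query are made up): one UUID is generated per user search and reused for every request belonging to that search, so pagination keeps hitting the same shard copies and the ordering stays stable.

```python
import uuid
import requests

ES = "http://localhost:9200"
INDEX = "my-index"

search_id = uuid.uuid4().hex   # e.g. "5d2a1ee677059bc926f7a39492f11a47"

for page in range(3):          # every page of the same search reuses search_id
    resp = requests.post(
        f"{ES}/{INDEX}/_search",
        params={"preference": search_id},
        json={"from": page * 10, "size": 10, "query": {"match": {"text": "contract"}}},
    )
    resp.raise_for_status()
    print([hit["_id"] for hit in resp.json()["hits"]["hits"]])
```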
Steps to reproduce:
When sending the query to client nodes, we get the expected behavior: the preference custom string is taken into account, and changing it yields different results. On the other hand, sending the request to the client node in zone A or to the one in zone B yields the same results as long as the preference custom string is the same. This is the expected behavior AFAIK.

However, when sending the request to a DATA node, we get unexpected behavior that seems undocumented in https://www.elastic.co/guide/en/elasticsearch/reference/6.6/search-request-preference.html. Changing the preference custom string has no effect on the result. On the other hand, the result differs depending on whether the node we send the query to is in zone A or zone B (but we get the same result for any node in zone A, and likewise for any node in zone B).

Hypothesis: it seems that Elasticsearch ignores the preference custom string when the node to which we send the request is a data node, in which case it chooses the shard copies that belong to the same zone as that node.
Note that this happens only with a custom string. When using "_primary" for instance, the behavior is the same regardless of the type of node to which we send the query.
This was an issue for us because we send requests to 2 nodes (one in zone A and one in zone B) from our application (using the built-in mechanism of the Elasticsearch Node.js library to specify several hosts), and the results were inconsistent depending on which node received the request (probably round-robin?). To solve the issue, we added 2 client nodes to the cluster and now send all our requests to them, and the preference custom string is properly taken into account. However, the fact that a custom preference string only works when the request is sent to CLIENT nodes does not seem to be documented in https://www.elastic.co/guide/en/elasticsearch/reference/6.6/search-request-preference.html.

So it seems to us that either this is a bug, or something is missing in the documentation.

I am sorry that I don't have a simple reproducible scenario, but I don't know how to build an index for which I can immediately see differences between shard copies. In our case it took several weeks for this issue to reveal itself, as a "new" index returns the same results from all shard copies anyway.
Provide logs (if relevant):