Occassionally, queries return results from the wrong index #198

clintongormley · 2010-05-28T08:36:24Z

Hiya

See the query below - I query the alias ia_object which points to the index ia_object_1274409377, for objects of type notice, but the resuts returned include results from my other index iannounce_object_1274399922 of types other than notice.

I'm wondering if one node has dispatched the query, then picked up the wrong resultset?

curl -XGET 'http://idb2:9200/ia_object/notice/_search?search_type=dfs_query_then_fetch'  -d '
{
   "sort" : [
      {
         "score" : "desc"
      }
   ],
   "fields" : [],
   "from" : 0,
   "query" : {
      "filtered" : {
         "filter" : {
            "bool" : {
               "must" : [
                  {
                     "term" : {
                        "sub_type" : "engagement"
                     }
                  },
                  {
                     "term" : {
                        "status" : "active"
                     }
                  },
                  {
                     "term" : {
                        "location_id" : "40"
                     }
                  }
               ]
            }
         },
         "query" : {
            "dis_max" : {
               "queries" : [
                  {
                     "query_string" : {
                        "fields" : [
                           "name"
                        ],
                        "boost" : 1,
                        "query" : "\"David Mcgeoch\"~7"
                     }
                  },
                  {
                     "filtered" : {
                        "filter" : {
                           "term" : {
                              "has_name" : "0"
                           }
                        },
                        "query" : {
                           "query_string" : {
                              "fields" : [
                                 "text"
                              ],
                              "boost" : 1.3,
                              "query" : "\"David Mcgeoch\"~7"
                           }
                        }
                     }
                  },
                  {
                     "query_string" : {
                        "boost" : 1,
                        "query" : "David Mcgeoch"
                     }
                  }
               ],
               "tie_breaker" : 0.7
            }
         }
      }
   },
   "size" : 300
}
'
# {
#    "hits" : {
#       "hits" : [
#          {
#             "_index" : "iannounce_object_1274399922",
#             "_id" : "3686897",
#             "_type" : "memorial"
#          },
#          {
#             "_index" : "ia_object_1274409377",
#             "_id" : "1336845",
#             "_type" : "notice"
#          },
#          {
#             "_index" : "iannounce_object_1274399922",
#             "_id" : "2081719",
#             "_type" : "memorial"
#          },
#          {
#             "_index" : "iannounce_object_1274399922",
#             "_id" : "384424",
#             "_type" : "memorial"
#          },
#          {
#             "_index" : "iannounce_object_1274399922",
#             "_id" : "234160",
#             "_type" : "memorial"
#          },
#          {
#             "_index" : "iannounce_object_1274399922",
#             "_id" : "238852",
#             "_type" : "memorial"
#          },

[ SNIPPED ]

#       ],
#       "total" : 7390
#    },
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    }
# }

There were some errors in the ES log around that time:

[07:14:35,985][DEBUG][action.search.type       ] [Ringleader] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Hulkling][inet[/192.168.10.50:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [9785964], timed out
        at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
        at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
[07:14:35,986][DEBUG][action.search.type       ] [Ringleader] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Hulkling][inet[/192.168.10.50:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [9785961], timed out
        at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
        at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
[07:14:35,986][DEBUG][action.search.type       ] [Ringleader] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Hulkling][inet[/192.168.10.50:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [9785965], timed out
        at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
        at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
[07:14:35,991][DEBUG][action.search.type       ] [Ringleader] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Hulkling][inet[/192.168.10.50:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [9785963], timed out
        at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
        at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
[07:14:36,212][DEBUG][action.search.type       ] [Ringleader] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [4594371], timed out
        at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
        at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
        at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
        at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
        at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

The text was updated successfully, but these errors were encountered:

kimchy · 2010-05-28T15:08:54Z

Hey,

Very, very strange. Lets start with the second failures. Is there a chance that you get a long search requests or load that is higher than 5 minutes? The default timeout between a query phase and a fetch phase is 5 minutes (or controlled by setting the keep alive parameter).

Second, regarding the first problem. This is very very strange. I think your assumption is correct, and for some reason, requests got mixed up. I don't see where it can happen in elasticsearch, but I will investigate further. It must be in the HTTP layer, and thats pretty straightforward. It might also be on the http client side..., trying to think how best to nail down where and why this happens.... (well, it can also be in the logging layer :) )

clintongormley · 2010-05-28T15:15:42Z

Re long search requests - I don't honestly know. I haven't seen that happening, but my app times out requests after 15 seconds, then retries, so it may well be hidden.

I'm certainly not seeing anything unusual regarding CPU or memory use in jvisualvm.

I think you'll need to add some more logging to nail it down. Also, this is v 0.7.1 - I haven't upgraded yet.

kimchy · 2010-05-28T20:34:12Z

When I said client side, I meant the perl http client, which might mix messages up, if that was not obvious :). I am trying to think of the best logging to add. I can generate a unique id for each rest request when it is received, and log the request, and then, log the response sent under that unique id, what do you think?

clintongormley · 2010-05-29T08:50:54Z

I think it is highly unlikely that it is happening on the Perl side. I'm using a core module that has been stable for many years - something as obvious as this would have been noticed before. Also, it is not an async request, so nothing to mix up there.

Re logging - you can add that if you like, but i think you're probably looking in the wrong place. The reason I posted the errors was that they happened at the same time as the bad results (i can't be certain that they were generated by the same request or not). I think it is likelier that the problem is node communication getting mixed up, while retrieving results from the other shards.

but what do I know :)

kimchy · 2010-05-29T14:09:39Z

I agree with you, the perl side is probably not to blame.

So, some background. You execute a dfs query then fetch. Thats three phases, first, the dfs, that goes to all shards. Then, query, that goes to all shards with the dfs result. Last, the fetch, that goes to the relevant shards.

A certain shard communication might have got messes up, which I am trying to understand if can happen. In your response (if you still have all of it) does it look like some of the response are correct, while others are wrong?

If the response looks like simply a response for a another query, then its more high level then that, if its a mixed result, with some good and some bad, then its a messed up specific shard communication in the search.

When do you plan to upgrade to 0.8? It will be easier to help with that version.

clintongormley · 2010-05-29T14:33:42Z

I don't have the full results any more, but looking at the above, the second hit matches the query correctly, but all the others (i think it was all) were from the wrong index.

Hopefully I'll be on 0.8 next weekend. It won't be this weekend.

kimchy · 2010-05-29T17:04:58Z

Yea, missed the second hit. Ok, I will look into this more try and see why this might happen. Ping me when you upgrade to 0.8.

clintongormley · 2010-06-11T09:11:55Z

Bad news, kimchy - i'm still getting this error on v 0.8

kimchy · 2010-06-11T19:57:32Z

Do you still get the exceptions in the logs?

clintongormley · 2010-06-12T11:33:31Z

There are still exceptions in the logs, but the times don't coincide, so I'm not sure that they're related:

[02:20:53,165][DEBUG][action.search.type       ] [Fontanelle] [12798594] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [12798594], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[02:41:38,114][DEBUG][action.search.type       ] [Fontanelle] [6167654] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6167654], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[02:41:38,114][DEBUG][action.search.type       ] [Fontanelle] [6167652] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6167652], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[02:41:38,115][DEBUG][action.search.type       ] [Fontanelle] [6167653] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6167653], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[02:41:38,160][DEBUG][action.search.type       ] [Fontanelle] [12820997] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [12820997], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[03:38:57,733][DEBUG][action.search.type       ] [Fontanelle] [12872042] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: java.lang.NullPointerException
[03:40:05,941][DEBUG][action.search.type       ] [Fontanelle] [6193567] Failed to execute query phase
org.elasticsearch.search.query.QueryPhaseExecutionException: [iannounce_object_1275333493][0]: query[filtered((name:"pamela goody"~7 | filtered(short_text:"pamela goody"~7^1.3)->FilterCacheFilterWrapper(org.elasticsearch.util.lucene.search.TermFilter@3e3f4f7) | (_all:pamela _all:goody))~0.7)->FilterCacheFilterWrapper(BooleanFilter( +FilterCacheFilterWrapper(org.elasticsearch.util.lucene.search.TermFilter@33d4352b) +FilterCacheFilterWrapper(org.elasticsearch.util.lucene.search.TermFilter@3a1bb115) +FilterCacheFilterWrapper(org.elasticsearch.util.lucene.search.TermFilter@40ea8cf6) +FilterCacheFilterWrapper(parent_id:[1 TO 1])))],from[0],size[300],sort[<score>]: Query Failed []
    at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:103)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:193)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:159)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:141)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$200(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$1.run(TransportSearchDfsQueryThenFetchAction.java:114)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: df for term pamela not available
    at org.elasticsearch.search.dfs.CachedDfSource.docFreq(CachedDfSource.java:51)
    at org.apache.lucene.search.Similarity.idfExplain(Similarity.java:769)
    at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:122)
    at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:264)
    at org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.<init>(DisjunctionMaxQuery.java:107)
    at org.apache.lucene.search.DisjunctionMaxQuery.createWeight(DisjunctionMaxQuery.java:184)
    at org.apache.lucene.search.FilteredQuery.createWeight(FilteredQuery.java:63)
    at org.apache.lucene.search.FilteredQuery.createWeight(FilteredQuery.java:63)
    at org.apache.lucene.search.Query.weight(Query.java:101)
    at org.elasticsearch.search.internal.ContextIndexSearcher.createWeight(ContextIndexSearcher.java:78)
    at org.apache.lucene.search.Searcher.search(Searcher.java:49)
    at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:97)
    ... 8 more
[03:52:03,125][DEBUG][action.search.type       ] [Fontanelle] [12881061] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: java.lang.ArrayIndexOutOfBoundsException
[03:52:03,142][DEBUG][action.search.type       ] [Fontanelle] [12881063] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [12881063], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[13:15:55,961][DEBUG][index.gateway            ] [Fontanelle][iannounce_object_1275333493][3] Snapshotting on close ...
[13:15:56,470][DEBUG][index.shard.recovery     ] [Fontanelle][iannounce_object_1275333493][3] Starting recovery from [Blue Bullet][b6b87b7e-89da-4644-b17a-2fb638b16041][inet[/192.168.10.41:9300]]
[13:16:50,518][DEBUG][index.shard.recovery     ] [Fontanelle][iannounce_object_1275333493][3] Recovery completed from [Blue Bullet][b6b87b7e-89da-4644-b17a-2fb638b16041][inet[/192.168.10.41:9300]], took[54s], throttling_wait [0s]
   Phase1: recovered [27] files with total size of [1.2g], took [53.5s], throttling_wait [0s]
   Phase2: recovered [1644] transaction log operations, took [390ms]
   Phase3: recovered [0] transaction log operations, took [65ms]
[22:34:33,612][DEBUG][action.search.type       ] [Fontanelle] [14161052] Failed to execute query phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/query/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [14161052], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:185)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:388)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:379)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[22:34:33,630][DEBUG][action.search.type       ] [Fontanelle] [6837580] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6837580], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[22:34:33,630][DEBUG][action.search.type       ] [Fontanelle] [6837581] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6837581], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[22:34:33,631][DEBUG][action.search.type       ] [Fontanelle] [6837579] Failed to execute fetch phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [6837579], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:309)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$700(TransportSearchDfsQueryThenFetchAction.java:64)
    at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$4.run(TransportSearchDfsQueryThenFetchAction.java:199)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[22:34:33,720][DEBUG][action.search.type       ] [Fontanelle] [14161053] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [14161053], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[22:34:33,722][DEBUG][action.search.type       ] [Fontanelle] [14161054] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Blue Bullet][inet[/192.168.10.41:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [14161054], timed out
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:274)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:257)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:444)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:435)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

clintongormley · 2011-05-20T14:26:04Z

In 0.16.2, an occurrence of this problem (ie ES returning the wrong results) coincided with this error in the logs:

Caught exception while handling client http traffic, closing connection
java.lang.IllegalStateException: cannot send more responses than requests

Full logs https://gist.github.com/85a707a6edf8e8534b4a

clintongormley · 2011-07-27T16:59:15Z

Fixed in issue #1152

…198) This PR adds the basic infrastructure for turning a physical plan into a list of drivers that can (locally) execute the given physical plan. It adds the PlanNode class to represent a physical plan (which is just a tree / digraph of PlanNode objects). The PR assumes that this physical plan makes sense (i.e. it does not do any kind of extra validation). It then implements a LocalExecutionPlanner to turn the given plan into a list of drivers, allowing parallel execution of the given plan (in-so-far as parallelism has been designed into the plan). It covers all the parallel executions explored as part of #189, showing the flexibility of the planner.

Since February 7 we noticed an increase in the 100th percentile service time of the `country_agg_cached` query in geonames when the trial license is enabled. In order to dig whether this is related to JVM activity (GC, safepoints) we temporarily enable GC logs. This is expected to be low-impact on performance but allows us to dig deeper. Relates elastic#181

clintongormley closed this as completed Jul 27, 2011

bluelu mentioned this issue Dec 11, 2014

Node is not responsive after the end of a big merge for close to 10 minutes #8905

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Occassionally, queries return results from the wrong index #198

Occassionally, queries return results from the wrong index #198

clintongormley commented May 28, 2010

kimchy commented May 28, 2010

clintongormley commented May 28, 2010

kimchy commented May 28, 2010

clintongormley commented May 29, 2010

kimchy commented May 29, 2010

clintongormley commented May 29, 2010

kimchy commented May 29, 2010

clintongormley commented Jun 11, 2010

kimchy commented Jun 11, 2010

clintongormley commented Jun 12, 2010

clintongormley commented May 20, 2011

clintongormley commented Jul 27, 2011

Occassionally, queries return results from the wrong index #198

Occassionally, queries return results from the wrong index #198

Comments

clintongormley commented May 28, 2010

kimchy commented May 28, 2010

clintongormley commented May 28, 2010

kimchy commented May 28, 2010

clintongormley commented May 29, 2010

kimchy commented May 29, 2010

clintongormley commented May 29, 2010

kimchy commented May 29, 2010

clintongormley commented Jun 11, 2010

kimchy commented Jun 11, 2010

clintongormley commented Jun 12, 2010

clintongormley commented May 20, 2011

clintongormley commented Jul 27, 2011