UpdateByQueryResponse throwing timeout #47

Praveen82 · 2015-11-04T05:44:50Z

Hi All,

I am using the "elasticsearch-action-updatebyquery" plugin.

Reference : https://github.com/yakaz/elasticsearch-action-updatebyquery4

API : The following call bulk-updates "Segment ids" on all matched documents.

Example : segmentId = 50 needs to be added to more than 20 million documents.

// Parameters referenced by name inside the update script
Map<String, Object> scriptParams = new HashMap<>();
scriptParams.put("segmentexist", segId);
scriptParams.put("pgsegmentobject", pgSegmentIds);

UpdateByQueryClient updateByQueryClient = new UpdateByQueryClientWrapper(client);

// Skip documents that already contain the segment id; otherwise append to or create pgSegmentIds
UpdateByQueryResponse response = updateByQueryClient.prepareUpdateByQuery()
        .setIndices(props.getProperty("index"))
        .setTypes(props.getProperty("type"))
        .setTimeout(TimeValue.timeValueHours(24))
        .setIncludeBulkResponses(BulkResponseOption.ALL)
        .setScript("if (ctx._source.containsKey(\"pgSegmentIds\") ) { if (ctx._source.pgSegmentIds.contains(segmentexist) ) { ctx.op = \"none\" } else { ctx._source.pgSegmentIds += pgsegmentobject} } else { ctx._source.pgSegmentIds = pgsegmentobject }")
        .setScriptParams(scriptParams)
        .setQuery(query)
        .execute()
        .actionGet();
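
To make the intent of the script easier to check, here is a minimal plain-Java sketch of the same branching, run against an in-memory map instead of a real document. It assumes that pgSegmentIds and the pgsegmentobject parameter are both lists and that `+=` concatenates them; the class SegmentScriptSketch and its apply method are illustrative names only and are not part of the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java mirror of the update script's branching (hypothetical helper, not plugin code).
class SegmentScriptSketch {

    // Returns "none" when the document is left untouched, "index" when it is rewritten.
    @SuppressWarnings("unchecked")
    static String apply(Map<String, Object> source, Object segmentExist, List<Object> pgSegmentObject) {
        if (source.containsKey("pgSegmentIds")) {
            List<Object> current = (List<Object>) source.get("pgSegmentIds");
            if (current.contains(segmentExist)) {
                return "none"; // corresponds to ctx.op = "none"
            }
            // corresponds to ctx._source.pgSegmentIds += pgsegmentobject (assumed list concatenation)
            List<Object> merged = new ArrayList<>(current);
            merged.addAll(pgSegmentObject);
            source.put("pgSegmentIds", merged);
        } else {
            // corresponds to ctx._source.pgSegmentIds = pgsegmentobject
            source.put("pgSegmentIds", new ArrayList<>(pgSegmentObject));
        }
        return "index";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("pgSegmentIds", new ArrayList<Object>(Arrays.asList(10, 20)));
        List<Object> toAdd = Arrays.<Object>asList(50);
        // First call appends the segment id, second call is a no-op.
        System.out.println(apply(doc, 50, toAdd) + " " + doc.get("pgSegmentIds"));
        System.out.println(apply(doc, 50, toAdd) + " " + doc.get("pgSegmentIds"));
    }
}
```

Under these assumptions a document is rewritten only once per segment id, so re-running the update should skip documents that were already processed.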

It fails during the update, and I see the following exception:

2015-09-12 05:58:10 INFO transport:123 - [Moon Knight] failed to get local cluster state for [#transport#-1][ip-10-186-199-195][inet[localhost/10.31.48.47:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/10.31.48.47:9300]][cluster/state] request_id [416] timed out after [5000ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-09-12 05:58:10 INFO transport:123 - [Moon Knight] failed to get local cluster state for [PGMonetize-ES04][bqljhciDQ4-Tr2dRAcbWtw][ip-10-31-48-47][inet[/10.31.48.47:9300]]{master=true}, disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [PGMonetize-ES04][inet[/10.31.48.47:9300]][cluster/state] request_id [423] timed out after [5001ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I have the following setup:

We have 5 nodes with 5 shards.
script.disable_dynamic: false
action.updatebyquery.bulk_size: 2500

I still get the above exception. Please help.

How can I solve this issue, and how can I improve performance (for example, updating 20+ million records in under 10 minutes)?
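
To put that target in numbers, here is a rough sketch of the throughput it implies. It assumes exactly 20 million matching documents, a 10 minute window, and the bulk size of 2500 from the settings above; the class ThroughputSketch and its method names are illustrative only.

```java
// Back-of-the-envelope throughput for the stated target (hypothetical helper).
class ThroughputSketch {

    // Average documents per second needed to finish totalDocs within the window.
    static double docsPerSecond(long totalDocs, long windowSeconds) {
        return (double) totalDocs / windowSeconds;
    }

    // Number of bulk requests needed, rounding up for a partial last batch.
    static long bulkRequests(long totalDocs, int bulkSize) {
        return (totalDocs + bulkSize - 1) / bulkSize;
    }

    public static void main(String[] args) {
        long totalDocs = 20_000_000L;   // assumed: exactly 20 million matches
        long windowSeconds = 10 * 60;   // 10 minute target
        int bulkSize = 2500;            // action.updatebyquery.bulk_size

        long bulks = bulkRequests(totalDocs, bulkSize);
        System.out.printf("docs/s: %.0f%n", docsPerSecond(totalDocs, windowSeconds));
        System.out.printf("bulk requests: %d%n", bulks);
        System.out.printf("bulk requests/s: %.1f%n", (double) bulks / windowSeconds);
    }
}
```

That is roughly 33,000 scripted updates per second, or about 13 bulk requests of 2500 documents per second, sustained across the cluster.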

Praveen82 · 2015-11-10T17:13:53Z

Any update on this?

Praveen82 · 2015-11-10T17:15:57Z

Is there any way to update 20+ million documents in under 10 minutes?

Praveen82 mentioned this issue Nov 13, 2015

Not working on ES 2.0 #46

Open
