
ES shard fails to recover due to differing number of docs #12661


Closed
kenny-ye opened this issue Aug 5, 2015 · 13 comments
Labels
:Core/Infra/Core, discuss, feedback_needed

Comments

@kenny-ye

kenny-ye commented Aug 5, 2015

We use two ES data nodes with -Xmx30g -Xms30g. The two ES servers each have 128G of physical memory and 24 CPU cores.
After bulk inserting for about 2 weeks, we have indexed about 10 million documents in total (5 shards, 1 replica). The index size is about 4.7TB on disk. The ES servers then started running into frequent, long old-generation GCs, and even after an old GC the heap occupation stays above 20G.
I tried to free heap by stopping the bulk inserts and closing the index. When I opened the index again, the cluster went yellow: one replica shard failed to recover because its document count does not match the primary shard. How can I recover the failed shard?

The server log:

[2015-08-05 03:30:26,461][WARN ][indices.cluster ] [ipattern-es01.iad1] [[ipattern-2015-07][3]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [ipattern-2015-07][3]: Recovery failed from [ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]] into [ipattern-es01.iad1][gqF0BDR_S9SAlSIfGZVx_g][ipattern-es01.iad1][inet[ipattern-es01.iad1/10.40.146.46:9300]]
at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:280)
at org.elasticsearch.indices.recovery.RecoveryTarget.access$700(RecoveryTarget.java:70)
at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:561)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.transport.RemoteTransportException: [ipattern-es02.iad1][inet[/10.40.146.47:9300]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [ipattern-2015-07][3] Phase[1] Execution failed
at org.elasticsearch.index.engine.InternalEngine.recover(InternalEngine.java:898)
at org.elasticsearch.index.shard.IndexShard.recover(IndexShard.java:780)
at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:125)
at org.elasticsearch.indices.recovery.RecoverySource.access$200(RecoverySource.java:49)
at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:146)
at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:132)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [ipattern-2015-07][3] Failed to transfer [0] files with total size of [0b]
at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:431)
at org.elasticsearch.index.engine.InternalEngine.recover(InternalEngine.java:893)
... 10 more
Caused by: java.lang.IllegalStateException: try to recover [ipattern-2015-07][3] from primary shard with sync id but number of docs differ: 2435163 (ipattern-es02.iad1, primary) vs 2435160(ipattern-es01.iad1)
at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:177)
... 11 more

@dakrone
Member

dakrone commented Aug 5, 2015

@brwe any idea how the sync id can have differing numbers of documents?

@s1monw
Contributor

s1monw commented Aug 5, 2015

Are you using delete-by-query by any chance? Can you tell me more about your usage pattern, the ES version, and the events that happened before you closed your index?

@clintongormley added the discuss, feedback_needed, and :Core/Infra/Core labels on Aug 5, 2015
@kenny-ye
Author

kenny-ye commented Aug 6, 2015

Before we closed the index, the two nodes lost their connection due to a ping timeout; the slave node could not connect to the primary. The log is below.

[2015-08-04 04:02:09,424][WARN ][transport ] [ipattern-es01.iad1] Received response for a request that has timed out, sent [55643ms] ago, timed out [25642ms] ago, action [internal:discovery/zen/fd/master_ping], node [[ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]]], id [15173706]
[2015-08-04 04:27:30,530][INFO ][monitor.jvm ] [ipattern-es01.iad1] [gc][young][1253732][767265] duration [806ms], collections [1]/[1.6s], total [806ms]/[1.4d], memory [24.6gb]->[25.1gb]/[29.8gb], all_pools {[young] [133.7mb]->[8.1mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [24.4gb]->[24.9gb]/[28.5gb]}
[2015-08-04 05:57:01,795][INFO ][monitor.jvm ] [ipattern-es01.iad1] [gc][young][1258935][773567] duration [776ms], collections [1]/[1.3s], total [776ms]/[1.4d], memory [23.7gb]->[24.1gb]/[29.8gb], all_pools {[young] [1.4mb]->[1.6mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [23.6gb]->[24gb]/[28.5gb]}
[2015-08-04 06:20:45,932][ERROR][marvel.agent.exporter ] [ipattern-es01.iad1] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.08.04/_bulk]: SocketTimeoutException[Read timed out]
[2015-08-04 06:21:12,690][INFO ][discovery.zen ] [ipattern-es01.iad1] master_left [[ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]]], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2015-08-04 06:21:12,690][WARN ][discovery.zen ] [ipattern-es01.iad1] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: {[ipattern-es01.iad1][gqF0BDR_S9SAlSIfGZVx_g][ipattern-es01.iad1][inet[ipattern-es01.iad1/10.40.146.46:9300]],}
[2015-08-04 06:21:12,690][INFO ][cluster.service ] [ipattern-es01.iad1] removed {[ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]],}, reason: zen-disco-master_failed ([ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]])
[2015-08-04 06:21:13,213][WARN ][action.bulk ] [ipattern-es01.iad1] failed to perform indices:data/write/bulk[s] on remote replica [ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]][ipattern-2015-07][3]
org.elasticsearch.transport.NodeDisconnectedException: [ipattern-es02.iad1][inet[/10.40.146.47:9300]][indices:data/write/bulk[s][r]] disconnected
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.4m]
[2015-08-04 06:21:13,213][WARN ][cluster.action.shard ] [ipattern-es01.iad1] can't send shard failed for [ipattern-2015-07][3], node[CB3p3ZnZR_qUHGc05ZKK7Q], [R], s[STARTED], no master known.
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.4m]
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.4m]
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.6m]
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.6m]
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.4m]
[2015-08-04 06:21:13,213][DEBUG][action.bulk ] [ipattern-es01.iad1] observer timed out. notifying listener. timeout setting [1m], time since start [1.4m]
[2015-08-04 06:21:13,217][WARN ][action.bulk ] [ipattern-es01.iad1] failed to perform indices:data/write/bulk[s] on remote replica [ipattern-es02.iad1][CB3p3ZnZR_qUHGc05ZKK7Q][ipattern-es02.iad1][inet[/10.40.146.47:9300]][ipattern-2015-07][3]

@kenny-ye
Author

kenny-ye commented Aug 6, 2015

And one more question: after indexing 12 million documents, the Java heap usage is about 20G even after an old GC. Is that normal? Unlike common log files, our documents are a little complicated, with an average size of about 500K. Monitoring with Marvel, I found that the value under "Index Statistics -> memory -> LUCENE MEMORY" keeps increasing; it is now about 30G. Is this related to the Java heap occupation? What is LUCENE MEMORY for, and how can we mitigate the memory issue?

We use Elasticsearch 1.6.0, with two data nodes, 5 shards, and 1 replica. The total index size on each node is now about 5TB.
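(For reference, the "LUCENE MEMORY" value in Marvel corresponds roughly to the Lucene segment memory reported by the stats APIs, which in 1.x is held on the Java heap. A minimal sketch for inspecting it with curl, assuming a node reachable on localhost:9200 and the 1.x-era endpoints; adjust the host for your setup:

# Per-node Lucene segment memory held on the heap (terms, norms, etc.)
curl -s 'localhost:9200/_nodes/stats/indices/segments?human&pretty'

# Per-segment listing for every shard, including memory per segment
curl -s 'localhost:9200/_cat/segments?v'
)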

@markwalkom
Contributor

What does your cluster look like: how many nodes, and are they in the same datacenter?

Also you're better off putting that second question on https://discuss.elastic.co :)

@kenny-ye
Author

kenny-ye commented Aug 6, 2015

Hi markwalkom,

Thanks for your suggestion. I have already submitted it to https://discuss.elastic.co/t/large-index-size-cause-high-java-heap-occupation/26936

@brwe
Contributor

brwe commented Aug 6, 2015

@kenny-ye Thanks a lot for the logs! This line:

[2015-08-04 06:21:13,213][WARN ][cluster.action.shard ] [ipattern-es01.iad1] can't send shard failed for [ipattern-2015-07][3], node[CB3p3ZnZR_qUHGc05ZKK7Q], [R], s[STARTED], no master known.

might indicate that you were running into #7572. I assume elasticsearch issued a synced flush at some point after the primary and replica got out of sync. Synced flush does not check the number of documents in primary and replica before executing; it only makes sure that no indexing is going on while flushing. Such an inconsistency between primary and replica therefore remains undetected until recovery starts.

To allocate the replica, can you try the following:

POST ipattern-es01.iad1/_flush?force

This will remove the sync id from the primary and make sure elasticsearch does not try to reuse the files of the replica copy that is missing the documents.
Then can you run:

POST /_cluster/reroute

This should then cause the replica to be recovered from the primary, which has the documents.
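For later readers, a minimal sketch of that sequence as curl commands, assuming the flush is aimed at the affected index (ipattern-2015-07 in the logs above) and a node reachable on localhost:9200; this just restates the suggestion above, it is not an official procedure:

# Forced flush drops the sync id, so recovery can no longer
# short-circuit on matching sync ids with mismatched doc counts
curl -XPOST 'localhost:9200/ipattern-2015-07/_flush?force=true'

# Trigger a reroute so the master retries allocation; the replica
# is then rebuilt from the primary's files
curl -XPOST 'localhost:9200/_cluster/reroute'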

@clintongormley
Contributor

No further info. Closing

@anhlqn

anhlqn commented Feb 17, 2016

Ran into the same error with replica shards on one of my indices. My fix was to change the index setting to "index.number_of_replicas": 0, wait for the index to heal, then increase the number of replicas again.
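A minimal sketch of that workaround as curl commands, with a placeholder index name my-index and a node on localhost:9200; dropping the replica count deletes the out-of-sync copies, and raising it again rebuilds them from the primaries:

# Drop all replicas of the affected index
curl -XPUT 'localhost:9200/my-index/_settings' -d '{"index.number_of_replicas": 0}'

# Wait for the cluster to go green, then restore the replica count
curl -XPUT 'localhost:9200/my-index/_settings' -d '{"index.number_of_replicas": 1}'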

@abraxxa

abraxxa commented Feb 22, 2016

Just had the same problem after restarting one of my three nodes.
Thanks for the fix @anhlqn!

@nomoa
Contributor

nomoa commented Jan 7, 2017

Happened to us as well on a 24-node cluster running 2.3.5; this cluster receives mostly bulk index requests. It looks like the problem occurred after many nodes dropped out of the cluster because of network hiccups.
We use index.auto_expand_replicas; as suggested by @anhlqn, deleting the corrupted replicas by temporarily setting it to 0-0 seemed to fix the issue.
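For the auto_expand_replicas variant, a similar sketch with a placeholder index name my-index and a node on localhost:9200; collapsing the range to 0-0 removes the replicas, and restoring the original range (for example 0-1) lets them be rebuilt from the primaries:

# Temporarily collapse the replica range so the bad copies are deleted
curl -XPUT 'localhost:9200/my-index/_settings' -d '{"index.auto_expand_replicas": "0-0"}'

# Restore the original range once the cluster is green again
curl -XPUT 'localhost:9200/my-index/_settings' -d '{"index.auto_expand_replicas": "0-1"}'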

@tomsommer

tomsommer commented Jan 15, 2018

The replica fix is ridiculous though; there should be a better way to fix it than toggling replicas for all broken indices. That a disk-full cluster can even lead to broken shards is odd in itself.

@bleskes
Contributor

bleskes commented Jan 15, 2018

@tomsommer this is a 2-year-old issue. I'm not sure what your issue is or which version it relates to, but chances are high it's a different one. Can you please either open a new issue with a good reproduction and description, or post on the discuss forums so we can figure out what's going on?
