NullPointerExceptions when flushing an index #25

Closed

clintongormley opened this issue Feb 18, 2010 · 6 comments

@clintongormley
Contributor

Hiya

I'm running a test suite against a local server, started as:

./bin/elasticsearch -f

I'm getting NullPointerExceptions if I create an index, flush the indices, and then delete the index without sleeping after the flush. Shouldn't flush/refresh/optimize etc. block until the action is complete?

Or is there some way of asking the cluster: are you ready now?

ta

clint
> ./bin/elasticsearch -f
[15:04:00,872][INFO ][server ] [Doe, John] {ElasticSearch/0.5.0/2010-02-18T13:42:47/dev}: Initializing ...
[15:04:02,663][INFO ][server ] [Doe, John] {ElasticSearch/0.5.0/2010-02-18T13:42:47/dev}: Initialized
[15:04:02,663][INFO ][server ] [Doe, John] {ElasticSearch/0.5.0/2010-02-18T13:42:47/dev}: Starting ...
[15:04:02,755][INFO ][transport ] [Doe, John] boundAddress [inet[/0.0.0.0:9300]], publishAddress [inet[/127.0.0.2:9300]]
[15:04:02,771][WARN ][jgroups.UDP ] send buffer of socket java.net.DatagramSocket@1faac07d was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)
[15:04:02,771][WARN ][jgroups.UDP ] receive buffer of socket java.net.DatagramSocket@1faac07d was set to 20MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)
[15:04:02,771][WARN ][jgroups.UDP ] send buffer of socket java.net.MulticastSocket@2259a735 was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)
[15:04:02,771][WARN ][jgroups.UDP ] receive buffer of socket java.net.MulticastSocket@2259a735 was set to 25MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)
[15:04:04,814][INFO ][cluster ] [Doe, John] New Master [Doe, John][getafix-2590][data][inet[/127.0.0.2:9300]]
[15:04:04,882][INFO ][discovery ] [Doe, John] elasticsearch/getafix-2590
[15:04:04,901][INFO ][http ] [Doe, John] boundAddress [inet[/0.0.0.0:9200]], publishAddress [inet[/127.0.0.2:9200]]
[15:04:05,140][INFO ][jmx ] [Doe, John] boundAddress [service:jmx:rmi:///jndi/rmi://:9400/jmxrmi], publishAddress [service:jmx:rmi:///jndi/rmi://127.0.0.2:9400/jmxrmi]
[15:04:05,140][INFO ][server ] [Doe, John] {ElasticSearch/0.5.0/2010-02-18T13:42:47/dev}: Started
[15:04:06,516][INFO ][cluster.metadata ] [Doe, John] Creating Index [es_test_2], shards [3]/[1]
[15:04:06,992][INFO ][cluster.metadata ] [Doe, John] Deleting index [es_test_2]
Exception in thread "elasticsearch[Doe, John]clusterService#updateTask-pool-6-thread-1" java.lang.NullPointerException
at org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:129)
at org.elasticsearch.cluster.DefaultClusterService$2.run(DefaultClusterService.java:161)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Exception in thread "elasticsearch[Doe, John]clusterService#updateTask-pool-6-thread-2" java.lang.NullPointerException
at org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:129)
at org.elasticsearch.cluster.DefaultClusterService$2.run(DefaultClusterService.java:161)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Exception in thread "elasticsearch[Doe, John]clusterService#updateTask-pool-6-thread-3" java.lang.NullPointerException
at org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:129)
at org.elasticsearch.cluster.DefaultClusterService$2.run(DefaultClusterService.java:161)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

Test script:

curl -XGET http://127.0.0.1:9200/_cluster/nodes
# {
#    "clusterName" : "elasticsearch",
#    "nodes" : {
#       "getafix-2590" : {
#          "httpAddress" : "inet[/127.0.0.2:9200]",
#          "dataNode" : true,
#          "transportAddress" : "inet[getafix.traveljury.com/127.0.
# >         0.2:9300]",
#          "name" : "Doe, John"
#       }
#    }
# }


curl -XPUT http://127.0.0.2:9200/es_test_2/ -d '{
   "index" : {
      "numberOfReplicas" : 1,
      "numberOfShards" : 3
   }
}
'
# {
#    "ok" : true
# }


curl -XPOST http://127.0.0.2:9200/_flush
# {
#    "ok" : true,
#    "_shards" : {
#       "failed" : 6,
#       "successful" : 0,
#       "total" : 6
#    }
# }


curl -XDELETE http://127.0.0.2:9200/es_test_2/
# {
#    "ok" : true
# }


curl -XGET http://127.0.0.2:9200/es_test_2/_status
# {
#    "debug" : {
#       "at" : {
#          "className" : "java.lang.Thread",
#          "methodName" : "run",
#          "fileName" : "Thread.java",
#          "lineNumber" : 619
#       },
#       "message" : "Index[es_test_2] missing"
#    },
#    "error" : "Index[es_test_2] missing"
# }
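
Until flush blocks (or a "wait"-style API exists), one client-side workaround is to retry the flush and only continue once no shards report failures. A minimal sketch in shell, based on the response shape shown above; the retry count and sleep interval are arbitrary choices, not an official mechanism:

for attempt in $(seq 1 10); do
  # Re-run the flush; the response carries a "_shards" block with a "failed" count.
  response=$(curl -s -XPOST http://127.0.0.2:9200/_flush)
  # Stop as soon as no shards report a failure (crude string check; a JSON parser would be more robust).
  echo "$response" | grep -qE '"failed"[[:space:]]*:[[:space:]]*0' && break
  sleep 1
done
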
@kimchy
Member

kimchy commented Feb 18, 2010

The exception on the server side is a bug (well, actually, nothing bad happened, but it should not be printed out).

I agree regarding the ability to "wait" for an index to get instantiated on the nodes where it can be. It might be a flag on the create index call, or another API (localhost:9200/test/_wait or something).

One last question: I added the debug element to APIs that fail. I know I still have a lot of work to do to get exceptions right, but for now, do you think the response should include the debug element by default, or just the message (nine times out of ten, the message is enough)?
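
For comparison, if the debug element were off by default, the failing _status call from the test script above would presumably return only the message - a sketch of the proposed default, not actual output:

curl -XGET http://127.0.0.2:9200/es_test_2/_status
# {
#    "error" : "Index[es_test_2] missing"
# }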

@clintongormley
Contributor Author

"nothing bad happened"... hmm, my cluster would hang for ages - maybe it recovers in the end? i didn't wait.

re debug element vs message - perhaps have a server startup flag? the debug messages are only really helpful to you, so to have them enabled by default would just give more crud to the user. but you want to be able to ask them for the debug output, so it should be easy to switch that on.

@kimchy
Member

kimchy commented Feb 19, 2010

Really, it got stuck? Strange - I will have a look at why that might happen. Have you tested it with the new version, and is it fixed there?

I will add a flag called traceError that will print it; it will be off by default.
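
Assuming the flag ends up being exposed as a per-request query parameter (purely a guess at the shape - the thread does not say how it will be set), asking for the trace might look like:

# Hypothetical: request the full debug/trace output for a failing call.
curl -XGET 'http://127.0.0.2:9200/es_test_2/_status?traceError=true'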

@clintongormley
Contributor Author

Here's an example of how it hangs (latest version):

curl -XGET 'http://127.0.0.1:9200/_cluster/nodes' 
# {
#    "clusterName" : "elasticsearch",
#    "nodes" : {
#       "getafix-55022" : {
#          "httpAddress" : "inet[/127.0.0.2:9201]",
#          "dataNode" : true,
#          "transportAddress" : "inet[getafix.traveljury.com/127.0.
# >          0.2:9301]",
#          "name" : "Talbot, Glenn"
#       },
#       "getafix-15896" : {
#          "httpAddress" : "inet[/127.0.0.2:9200]",
#          "dataNode" : true,
#          "transportAddress" : "inet[getafix.traveljury.com/127.0.
# >          0.2:9300]",
#          "name" : "Monroe, Trip"
#       },
#       "getafix-61392" : {
#          "httpAddress" : "inet[/127.0.0.2:9202]",
#          "dataNode" : true,
#          "transportAddress" : "inet[getafix.traveljury.com/127.0.
# >          0.2:9302]",
#          "name" : "Puck"
#       }
#    }
# }


curl -XPUT 'http://127.0.0.2:9201/es_test/'  -d '
{}
'
# {
#    "ok" : true
# }


curl -XPUT 'http://127.0.0.2:9201/es_test/'  -d '
{}
'
# {
#    "error" : "Index[es_test] Already exists"
# }


curl -XPUT 'http://127.0.0.2:9201/es_test_2/'  -d '
{
   "index" : {
      "numberOfReplicas" : 1,
      "numberOfShards" : 3
   }
}
'
# {
#    "ok" : true
# }


curl -XPOST 'http://127.0.0.2:9201/_flush?refresh=true' 
# {
#    "ok" : true,
#    "_shards" : {
#       "failed" : 15,
#       "successful" : 1,
#       "total" : 16
#    }
# }


curl -XDELETE 'http://127.0.0.2:9201/es_test_2/' 
# {
#    "ok" : true
# }


curl -XPUT 'http://127.0.0.2:9201/es_test_2/'  -d '
{}
'
# 500 Server closed connection without sending any data back

Then, when I shut down the servers, I get:

[14:31:52,707][INFO ][server                   ] [Talbot, Glenn] {ElasticSearch/0.5.0/2010-02-19T12:32:15/dev}: Closing ...
[14:32:02,714][INFO ][server                   ] [Talbot, Glenn] {ElasticSearch/0.5.0/2010-02-19T12:32:15/dev}: Closed
[14:32:02,736][WARN ][indices.cluster          ] [Talbot, Glenn] Failed to start shard for index [es_test_2] and shard id [0]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test_2][0]: Recovery failed from [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]] into [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more
[14:32:02,741][WARN ][cluster.action.shard     ] [Talbot, Glenn] Sending failed shard for [es_test_2][0], Node[getafix-55022], [B], S[INITIALIZING]
[14:32:02,742][WARN ][indices.cluster          ] [Talbot, Glenn] Failed to mark shard as failed after a failed start for index [es_test_2] and shard id [0]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test_2][0]: Recovery failed from [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]] into [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more

And on the other node:

[14:31:45,972][INFO ][server                   ] [Monroe, Trip] {ElasticSearch/0.5.0/2010-02-19T12:32:15/dev}: Closing ...
[14:31:55,981][INFO ][server                   ] [Monroe, Trip] {ElasticSearch/0.5.0/2010-02-19T12:32:15/dev}: Closed
[14:31:55,985][WARN ][indices.cluster          ] [Monroe, Trip] Failed to start shard for index [es_test] and shard id [4]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test][4]: Recovery failed from [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]] into [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more
[14:31:55,988][WARN ][cluster.action.shard     ] [Monroe, Trip] Sending failed shard for [es_test][4], Node[getafix-15896], [B], S[INITIALIZING]
[14:31:55,989][WARN ][indices.cluster          ] [Monroe, Trip] Failed to mark shard as failed after a failed start for index [es_test] and shard id [4]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test][4]: Recovery failed from [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]] into [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more
[14:31:55,985][WARN ][indices.cluster          ] [Monroe, Trip] Failed to start shard for index [es_test] and shard id [1]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test][1]: Recovery failed from [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]] into [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more
[14:31:55,991][WARN ][cluster.action.shard     ] [Monroe, Trip] Sending failed shard for [es_test][1], Node[getafix-15896], [B], S[INITIALIZING]
[14:31:55,991][WARN ][indices.cluster          ] [Monroe, Trip] Failed to mark shard as failed after a failed start for index [es_test] and shard id [1]
org.elasticsearch.index.shard.recovery.RecoveryFailedException: Index Shard [es_test][1]: Recovery failed from [Talbot, Glenn][getafix-55022][data][inet[getafix.traveljury.com/127.0.0.2:9301]] into [Monroe, Trip][getafix-15896][data][inet[getafix.traveljury.com/127.0.0.2:9300]]
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:154)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$3.run(IndicesClusterStateService.java:325)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.ElasticSearchInterruptedException
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:97)
    at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:34)
    at org.elasticsearch.index.shard.recovery.RecoveryAction.startRecovery(RecoveryAction.java:124)
    ... 4 more
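
As an aside for the test suite: giving curl a hard timeout keeps a hung request (like the "500 Server closed connection" above) from blocking the run indefinitely. A sketch only; the 10-second cap is an arbitrary choice:

# --max-time caps the total time curl may spend on one request.
curl --max-time 10 -XPUT 'http://127.0.0.2:9201/es_test_2/' -d '
{}
'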

@kimchy
Member

kimchy commented Feb 28, 2010

Can you give this a go again? I fixed some things related to this, and I think it will be fixed as well.

@clintongormley
Contributor Author

Fixed ++

dadoonet added a commit that referenced this issue Jun 5, 2015
Closes #25.
(cherry picked from commit 17d43f5)
dadoonet added a commit that referenced this issue Jun 5, 2015
Closes #25.
(cherry picked from commit 8836775)
dadoonet added a commit that referenced this issue Jun 5, 2015
Closes #25.

(cherry picked from commit c299ce4)
dadoonet added a commit that referenced this issue Jun 5, 2015
Latest Azure Java SDK is 0.4.6 when looking at http://search.maven.org/#browse%7C-589510877

Closes #25.

(cherry picked from commit 89be7b4)
dadoonet pushed a commit that referenced this issue Jun 5, 2015
johtani added a commit that referenced this issue Jun 5, 2015
dadoonet added a commit that referenced this issue Jun 9, 2015
ywelsch pushed a commit to ywelsch/elasticsearch that referenced this issue Apr 24, 2018
Today, we throw an overly-general IllegalArgumentException to reject
unacceptable messages in the consensus layer. We expect messages to be rejected
from time to time, if delayed or some other failure occurs, and it is not
useful to log the full stack trace leading to these exceptions.

This change introduces ConsensusMessageRejectedException for these cases, and
suppresses overly verbose logging when receiving such a message. It does not
include any of the extra machinery needed to pass these exceptions over the
wire.
ClaudioMFreitas pushed a commit to ClaudioMFreitas/elasticsearch-1 that referenced this issue Nov 12, 2019
Removed dots at end of task descriptions to have a common scheme
henningandersen pushed a commit to henningandersen/elasticsearch that referenced this issue Jun 4, 2020
* Updated following tests

* Corrected config and updated following review
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Oct 2, 2023
This issue was closed.