Healthcheck error - i/o timeout #4210

richstokes · 2019-06-18T22:10:15Z

E0618 22:06:17.957750       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: read unix @->/tmp/nginx-status-server.sock: i/o timeout
E0618 22:06:18.933766       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: read unix @->/tmp/nginx-status-server.sock: i/o timeout
E0618 22:06:19.165781       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: i/o timeout
E0618 22:06:26.601754       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: read unix @->/tmp/nginx-status-server.sock: i/o timeout
I0618 22:06:28.135501       8 main.go:167] Received SIGTERM, shutting down
I0618 22:06:28.135547       8 nginx.go:358] Shutting down controller queues

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.23.0
  Build:      git-be1329b22
  Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------

Image:         quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.23.0

kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-07T09:55:27Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.9", GitCommit:"16236ce91790d4c75b79f6ce96841db1c843e7d2", GitTreeState:"clean", BuildDate:"2019-03-25T06:30:48Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

This is a kops-provisioned cluster running on AWS. The nginx controllers seem OK until they receive some traffic, and then they crash with the above errors. Sometimes the controllers stay up for a while before failing. It seems fairly random. Host resources are OK.

I have also occasionally seen this in the log: E0618 21:57:07.096186 8 checker.go:57] healthcheck error: 500

The strange thing is that this configuration has been working great for months, and we have not changed anything. Is there anything else I can check on our side?

args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --default-ssl-certificate=default/le-cert
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx
            - --annotations-prefix=nginx.ingress.kubernetes.io

Tisona · 2019-07-03T20:45:36Z

Have you managed to fix this? I have the same issue with 0.24.1.

w3irdrobot · 2019-07-03T21:07:42Z

We saw this issue recently. It appeared that our worker nodes were overloaded. We think Kubernetes was trying to keep everything alive but ended up killing other services in the process. We added two nodes and haven't had the problem since.

richstokes · 2019-07-03T21:13:52Z

As a workaround, I increased the health check timeout values. This seemed to give the controller a chance to settle down, and I have had no issues since.

wmedlar · 2019-07-08T23:39:10Z

@richstokes what timeouts are working for you?

richstokes · 2019-07-08T23:45:12Z

I just doubled whatever they were set to.
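
For readers who want to try the same workaround, here is a minimal sketch of what doubled probe timeouts might look like on the controller container. The thread never states the original or final values, so every number below is an assumption chosen only for illustration; the /healthz path and port 10254 are the controller's usual health endpoint, but check your own manifest.

```yaml
# Hypothetical values: the thread only says the timeouts were doubled.
# Adjust to double whatever your own Deployment currently sets.
livenessProbe:
  httpGet:
    path: /healthz
    port: 10254
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 20      # assumed original: 10
  timeoutSeconds: 20     # assumed original: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /healthz
    port: 10254
    scheme: HTTP
  periodSeconds: 20      # assumed original: 10
  timeoutSeconds: 20     # assumed original: 10
  failureThreshold: 3
```

Longer timeouts only give a slow health endpoint more time to answer before the kubelet restarts the pod; they do not address why the endpoint is slow.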

joshbranham · 2019-08-14T23:40:56Z

We tried guaranteeing a whole CPU and 1GB of memory per pod, and that changed nothing. We are also running ModSecurity. I thought the load came from that, but it may be unrelated.
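
A reservation like the one described above could be expressed roughly as follows. This is a sketch under assumptions: the comment does not say whether limits were set alongside requests, and "1GB" is written here as 1Gi.

```yaml
# Assumed shape of the reservation; not taken from the poster's manifest.
resources:
  requests:
    cpu: "1"
    memory: 1Gi
  limits:          # assumption: limits equal to requests
    cpu: "1"
    memory: 1Gi
```

Setting requests equal to limits on every container in the pod places it in the Guaranteed QoS class, which is the strongest reservation Kubernetes offers. As reported above, this did not resolve the timeouts.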

aledbf · 2019-09-02T14:15:48Z

Closing. This is fixed in master by #4487.
If you want to test the fix, you can use the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

aledbf closed this as completed Sep 2, 2019

