
Pods not routable after updating deployment #1135

Closed
@tmansson

Description


Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

After updating a deployment (any deployment), the pods in that deployment are no longer routable through their service endpoint.

The service I'm trying to connect to reports Cluster-IP 10.0.192.108, and the pod itself reports IP 10.244.1.7.
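To narrow down whether the Service is actually tracking the new pod, it may help to compare the Endpoints object against the running pod IPs. This is only a sketch: the service name `zeptio-web` and namespace `zeptio` are guesses based on the upstream name `zeptio-zeptio-web-80` in the access log, so adjust them to match the real resources.

```shell
# List the addresses the Service currently routes to
kubectl get endpoints zeptio-web -n zeptio -o wide

# Compare with the IPs of the pods that are actually running
kubectl get pods -n zeptio -o wide
```

If the Endpoints object still lists the old pod IP, endpoint syncing itself is broken; if it already shows the new IP, then it's the ingress controller (or the node routing) that is holding stale state.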

Before updating the deployment, this is what is logged in nginx-ingress-controller when performing a successful request.

2017-08-15T12:56:53.801501493Z 10.244.1.1 - [10.244.1.1] - - [15/Aug/2017:12:56:53 +0000] "GET /css/style.css HTTP/2.0" 200 243 "https://zept.io/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 53 0.005 [zeptio-zeptio-web-80] 10.244.1.7:3000 361 0.005 200

However, looking at the nginx-ingress-controller log after updating, it still points at the old pod IP:

2017-08-15T13:08:43.621796454Z 2017/08/15 13:08:43 [error] 401#401: *231611 upstream timed out (110: Connection timed out) while connecting to upstream, client: 127.0.0.1, server: zept.io, request: "GET / HTTP/2.0", upstream: "http://10.244.1.7:3000/", host: "zept.io"

Even though the new pod has the IP 10.244.1.30, the controller doesn't route the traffic correctly.

What you expected to happen:

I'm fairly new to Kubernetes, but to me it seems the endpoints behind the service are not being updated.

I'd say the ingress should try to connect using the new IP assigned to the newly created pod.

How to reproduce it (as minimally and precisely as possible):

I'll describe my ingress setup, since that's probably where this goes wrong. It's set up using Helm and has been working fine for about a year. This issue appeared a few months ago and persisted after upgrading the cluster from 1.5.1 to 1.6.6.

kube-lego-0.1.10
nginx-ingress-0.6.0

Anything else we need to know?:

Rebooting the agent node in ACS renews the routes and brings all the pods back online. I've used this "solution" until I updated to v1.6.6. But since that didn't solve it, I'm now trying to find what's wrong.
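If the stale state turns out to be in the ingress controller rather than in Azure's route table, a less disruptive workaround than rebooting the node might be to restart just the controller pod, which re-reads all endpoints on startup. The label selector here is a guess at the Helm chart's default; check what `kubectl get pods` actually shows first.

```shell
# The Deployment recreates the pod, which rebuilds its config from current endpoints
kubectl delete pod -n kube-system -l app=nginx-ingress
```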

Environment:

  • Kubernetes version (use kubectl version): 1.6.6
  • Cloud provider or hardware configuration: Azure Container Service
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.2 LTS
  • Kernel (e.g. uname -a): Linux k8s-master-XXX-0 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: az acs create
  • Others:
