
[Error] Getsockopt: connection refused - Kubernetes apiserver #2249

Closed
Tedezed opened this issue Jan 4, 2017 · 19 comments

Comments

@Tedezed

Tedezed commented Jan 4, 2017

Environment

  • Vagrant node master, Ubuntu 14 LTS - 192.168.33.10
  • Vagrant node minion, Ubuntu 14 LTS - 192.168.33.11
  • Deployed with Ansible

Hi friends, I need help solving this error.

My error: Reason: Get https://10.254.0.1:443/version: dial tcp 10.254.0.1:443: getsockopt: connection refused

Pods

NAME                                          READY     STATUS             RESTARTS   AGE
elasticsearch-logging-v1-4721j                1/1       Running            0          1h
elasticsearch-logging-v1-wyukc                1/1       Running            0          1h
fluentd-es-v1.20-5qpw7                        1/1       Running            0          1h
heapster-v1.0.2-2780992708-y0qbz              4/4       Running            0          1h
kibana-logging-v1-z9ffo                       0/1       CrashLoopBackOff   24         1h
kube-dns-v11-4x32n                            3/4       CrashLoopBackOff   32         1h
kubedash-vffq8                                1/1       Running            0          1h
kubernetes-dashboard-v1.1.0-ffgck             0/1       CrashLoopBackOff   26         1h
monitoring-influxdb-grafana-v3-cz6hm          2/2       Running            0          1h
traefik-ingress-controller-2249976834-v2aco   1/1       Running            0          1h

Kubernetes API service

kubernetes   10.254.0.1   <none>        443/TCP   1h

Kube-system Errors

DNS

Readiness probe failed: Get http://172.16.55.6:8081/readiness: dial tcp 172.16.55.6:8081: getsockopt: connection refused
Liveness probe failed: HTTP probe failed with statuscode: 503
{kubelet 192.168.33.11}	spec.containers
(events with common reason combined)
Back-off restarting failed docker container
Error syncing pod, skipping: failed to "StartContainer" for "kube2sky" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube2sky pod=kube-dns-v11-4x32n_kube-system(dfe0d4b5-d27a-11e6-b4d9-0800277e8445)"

Dashboard

Starting HTTP server on port 9090
Creating API server client for https://10.254.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.254.0.1:443/version: dial tcp 10.254.0.1:443: getsockopt: connection refused

Traefik-ingress-controller

msg="Error watching kubernetes events: failed to create watch: failed to do version request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : failed to create request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : [Get https://10.254.0.1:443/apis/extensions/v1beta1/ingresses: dial tcp 10.254.0.1:443: getsockopt: connection refused]"

My tests

 kubectl exec test-701078429-3dzbj -- curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H  "Authorization: Bearer $TOKEN_VALUE" https://10.254.0.1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 10.254.0.1 port 443: Connection refused
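(For reference, $TOKEN_VALUE above is the pod's mounted service-account token; a sketch of how I set it, assuming the same test pod:)

TOKEN_VALUE=$(kubectl exec test-701078429-3dzbj -- cat /var/run/secrets/kubernetes.io/serviceaccount/token)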
curl --cacert /etc/kubernetes/certs/ca.crt https://192.168.33.10
Unauthorized
curl http://127.0.0.1:8080/version
{
  "major": "1",
  "minor": "4",
  "gitVersion": "v1.4.5",
  "gitCommit": "5a0a696437ad35c133c0c8493f7e9d22b0f9b81b",
  "gitTreeState": "clean",
  "buildDate": "2016-10-29T01:32:42Z",
  "goVersion": "go1.6.3",
  "compiler": "gc",
  "platform": "linux/amd64"
}
  • Could the error be in the service account, admission-control, flannel, or the Docker network?
@ingvagabund
Contributor

Hey @Tedezed,

try changing 443 to 6443 (under roles/kubernetes/default/main.yml). Port 443 is a privileged port, and if the proper capabilities aren't set, the connection is refused.
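If you want to keep 443 instead, a sketch for checking the binding capability on the apiserver binary (the binary path here is an assumption, adjust it for your install):

# Check whether kube-apiserver is allowed to bind privileged ports (<1024)
getcap /usr/bin/kube-apiserver
# Grant CAP_NET_BIND_SERVICE if it is missing
setcap cap_net_bind_service=+ep /usr/bin/kube-apiserver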

@ingvagabund
Contributor

Before you do that, check whether you can list the cluster nodes from the master node. If you can, the problem may be in iptables.
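For example (a quick check, assuming kube-proxy runs in the default iptables mode):

# On the master: if this works, the apiserver itself is up and reachable
kubectl get nodes
# Then inspect the NAT rules that should translate the 10.254.0.1:443 service VIP
iptables -t nat -L -n | grep 10.254.0.1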

@Tedezed
Author

Tedezed commented Jan 5, 2017

Hi @ingvagabund, thanks for your reply.

Yes, the master can list nodes.

I modified roles/kubernetes/default/main.yml:

kube_master_api_port: 6443
cluster_port: "{{ master_cluster_port | default('443') }}"

to

kube_master_api_port: 6443
cluster_port: "{{ master_cluster_port | default('6443') }}"

Or should I change only kube_master_api_port?

@ingvagabund
Contributor

ingvagabund commented Jan 6, 2017

In that case, the port can stay as it is. I'm not much into networking, but I have usually resolved a very similar issue by introducing an additional iptables rule on the master node:

iptables -t nat -A PREROUTING -d 10.254.0.1 -p tcp --dport 443 -j DNAT --to-destination 127.0.0.1
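Note that on newer kernels, DNAT to 127.0.0.1 from another host is dropped as a martian packet by default, so you may also need (an assumption, depending on your kernel):

sysctl -w net.ipv4.conf.all.route_localnet=1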

@Tedezed
Author

Tedezed commented Jan 9, 2017

It's a good idea, but the error persists: connection refused

My test:

1 - Iptables:

echo "1" > /proc/sys/net/ipv4/ip_forward

sysctl net.ipv4.ip_forward=1

nano /etc/sysctl.conf

net.ipv4.ip_forward = 1

service networking restart

iptables -t nat -A PREROUTING -d 10.254.0.1 -p tcp --dport 443 -j DNAT --to-destination 127.0.0.1

OR

iptables -t nat -A PREROUTING -d 10.254.0.1 -p tcp --dport 443 -j DNAT --to-destination 127.0.0.1:443

iptables -t nat -L -n

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DNAT       tcp  --  0.0.0.0/0            10.254.0.1           tcp dpt:443 to:127.0.0.1

Log traefik-ingress-controller

time="2017-01-09T11:47:43Z" level=error msg="Kubernetes connection error failed to create watch: failed to do version request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : failed to create request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : [Get https://10.254.0.1:443/apis/extensions/v1beta1/ingresses: dial tcp 10.254.0.1:443: getsockopt: connection refused], retrying in 6.577022913s"

2 - Deploy with port 6443

Edit roles/kubernetes/default/main.yml and /inventory/group_vars/all.yml

vagrant@master:~$ kubectl get nodes
NAME            STATUS    AGE
192.168.33.11   Ready     6m

Shouldn't the port have changed?

vagrant@master:~$ kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.254.0.1   <none>        443/TCP   8m

Log traefik-ingress-controller

time="2017-01-09T12:24:54Z" level=error msg="Kubernetes connection error failed to create watch: failed to do version request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : failed to create request: GET \"https://10.254.0.1:443/apis/extensions/v1beta1/ingresses\" : [Get https://10.254.0.1:443/apis/extensions/v1beta1/ingresses: dial tcp 10.254.0.1:443: getsockopt: connection refused], retrying in 10.001000939s"

I will try this:

coreos/coreos-kubernetes#215

@ravishivt

ravishivt commented Jan 12, 2017

I had the same issue. Can you check your iptables FORWARD policy with iptables -t filter -L FORWARD? On my Ubuntu 16.04 instances it was defaulting to DROP, which was why the dashboard couldn't access the API server. If it is, try changing it to ACCEPT on all of your nodes:

iptables -P FORWARD ACCEPT

You'll also need to edit kube-proxy and add --masquerade-all option. See kubernetes/kubernetes#24224.

I opened kubernetes/kubernetes#39823 for k8s to better handle this.

@jpiper

jpiper commented Jan 13, 2017

One thing I've found that causes this: if you change the TLS certificates on your apiserver(s), you'll need to refresh the service accounts, as they'll now contain invalid tokens.

$ kubectl delete serviceaccount default
$ kubectl delete serviceaccount --namespace=kube-system default
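The service account controller recreates both accounts with fresh tokens. Existing pods keep the old token mounted until they are recreated, so something like this (the label selector is an assumption, adjust per addon):

# Recreate the pods so they mount the regenerated token, e.g. for kube-dns:
kubectl delete pod --namespace=kube-system -l k8s-app=kube-dns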

@Tedezed
Author

Tedezed commented Jan 16, 2017

@ravishivt

$ sudo iptables -t filter -L FORWARD
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Should I add --masquerade-all in /etc/kubernetes/proxy or /etc/kubernetes/proxy.kubeconfig?

nano /etc/kubernetes/proxy
KUBE_PROXY_ARGS="--kubeconfig=/etc/kubernetes/proxy.kubeconfig --masquerade-all=true"
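And then restart the proxy so the flag takes effect? (A sketch; the service name is an assumption:)

service kube-proxy restart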

@jpiper I guess I have to re-create the serviceaccount for kube-system?

$ kubectl get pod -n kube-system
NAME                                          READY     STATUS    RESTARTS   AGE
elasticsearch-logging-v1-610yr                0/1       Error     3          7d
elasticsearch-logging-v1-bg1kz                0/1       Error     3          7d
fluentd-es-v1.20-mdl6z                        0/1       Error     3          7d
heapster-v1.0.2-869312971-mfn4d               0/4       Error     12         7d
kibana-logging-v1-le7sx                       0/1       Error     46         7d
kube-dns-v11-b9ugd                            0/4       Error     63         7d
monitoring-influxdb-grafana-v3-mwexr          0/2       Error     6          7d
traefik-ingress-controller-2249976834-gvnk4   1/1       Running   0          3m

@bjornl

bjornl commented Jan 17, 2017

@Tedezed You may also want to check that the kernel parameters bridge-nf-call-iptables and bridge-nf-call-arptables are set to 1:

cat /proc/sys/net/bridge/bridge-nf-call-iptables
cat /proc/sys/net/bridge/bridge-nf-call-arptables

If the value is 0, you can set it with:
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
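To make the setting persist across reboots, a sketch (assuming /etc/sysctl.d is read at boot):

cat <<'EOF' > /etc/sysctl.d/99-bridge-nf.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
EOF
sysctl --system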

@Tedezed
Author

Tedezed commented Jan 19, 2017

@bjornl Thanks!

# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
# cat /proc/sys/net/bridge/bridge-nf-call-arptables
1

I tried it on CentOS 7 and I have the same problem.

@Tedezed
Author

Tedezed commented Feb 14, 2017

I managed to run Kubernetes on OpenStack without problems using the following configuration, although it may just be a Vagrant problem.

File: all.yml

source_type: packageManager
cluster_name: cluster.local
master_cluster_hostname: kube_master1
ansible_ssh_user: user
ansible_ssh_pass: pass
ansible_sudo_pass: pass
kube_service_addresses: 10.254.0.0/16
networking: flannel
# Net local
opencontrail_public_subnet: 10.80.2.0/24
# Net OpenStack
opencontrail_private_subnet: 192.168.90.0/24
# Interface net OpenStack
opencontrail_interface: eth0
flannel_subnet: 172.16.0.0
flannel_prefix: 12
flannel_host_prefix: 24
cluster_logging: false
cluster_monitoring: false
kube_ui: false
kube_dash: true
node_problem_detector: false
dns_setup: true
dns_replicas: 1
apiserver_extra_args: "--runtime-config=extensions/v1beta1/deployments=true"

@zouhuigang

How did you solve it? I hit the same problem with Traefik. @Tedezed

Failed to list *v1beta1.Ingress: Get https://10.254.0.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout

@Tedezed
Author

Tedezed commented Dec 20, 2017

@zouhuigang I never really solved it in Vagrant. I recommend you use KVM for your machines.

@zouhuigang

@Tedezed Thanks, I've solved it:

 systemctl stop flannel && systemctl stop docker  &&  ip link delete docker0
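After that, start them again in the same order so Docker picks up the flannel network settings (a sketch of what I assume follows):

systemctl start flannel && systemctl start docker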

@Tedezed
Author

Tedezed commented Dec 26, 2017

@zouhuigang Great, thanks! I will consider your solution for future tests.

@pmb311

pmb311 commented Feb 12, 2018

@zouhuigang I had a similar problem that appeared to be caused by the API server silently dying from resource shortages. Once I gave the VM 2 CPUs and 2 GB of memory, it worked.

@saurabh999

I used service networking restart, and it works.

Thanks!

@funky81

funky81 commented Oct 1, 2018

I'm currently still having these problems with CentOS 7.5, k8s 1.10.8, kubespray 2.6, and the --masquerade-all option off.

Are there any solutions without the --masquerade-all option? Without it, my pods can't access the apiserver.

@bdhobare

Hey @pmb311, how did you resize the apiserver's resources? I am using GKE.
