
Cluster validation fails even though cluster has been successfully created #678


Closed
girikuncoro opened this issue Jan 14, 2019 · 9 comments
Labels
area/clusterctl: Issues or PRs related to clusterctl
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@girikuncoro
Contributor

/kind bug

What steps did you take and what happened:
Running cluster validation after the cluster has been created:

$ kubectl --kubeconfig kubeconfig get node
NAME               STATUS   ROLES    AGE     VERSION
gce-master-5kbc8   Ready    master   11m     v1.12.0
gce-node-gm7mx     Ready    <none>   4m50s   v1.12.0
$ kubectl --kubeconfig kubeconfig get machines
NAME               AGE
gce-master-5kbc8   11m
gce-node-gm7mx     11m
$ clusterctl validate cluster --kubeconfig kubeconfig
I0115 06:09:18.508437   44513 machineactuator.go:813] Using the default GCP client
I0115 06:09:18.508793   44513 plugins.go:39] Registered cluster provisioner "google"
Validating Cluster API objects in namespace ""
Checking cluster object "test1-chl6h"... PASS
Checking machine object "gce-master-5kbc8"... FAIL
	The corresponding node is missing.
Checking machine object "gce-node-gm7mx"... FAIL
	The corresponding node is missing.
ERROR: Machine objects failed the validation.

What did you expect to happen:
All checks should have passed, since all Machine objects exist and the corresponding nodes are Ready.

Environment:

  • Cluster-api version:
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T19:44:10Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T16:55:41Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):
Ubuntu 16.04.5 LTS (Xenial Xerus)
k8s-ci-robot added the kind/bug label on Jan 14, 2019
@vincepri
Member

I think this is caused by NodeRef not being populated in the GCP provider (the same goes for the AWS provider). This code https://github.com/kubernetes-sigs/cluster-api/blob/master/cmd/clusterctl/validation/validate_cluster_api_objects.go#L97-L134 uses NodeRef (which is also optional) to get the node name, which seems unnecessary (unless I'm missing something) given that we can get the name from the Machine object.
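A quick way to confirm whether NodeRef is populated is to list the Machine objects with a custom column (a sketch assuming the v1alpha1 API, where the reference lives at .status.nodeRef; machines without one show <none>):

$ kubectl --kubeconfig kubeconfig get machines -o custom-columns=MACHINE:.metadata.name,NODEREF:.status.nodeRef.name

Any machine showing <none> here is one the validator will flag with "The corresponding node is missing."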

@girikuncoro
Contributor Author

Thanks for the suggestion, I'll try to fix this.

roberthbailey added this to the v1alpha1 milestone on Jan 23, 2019
timothysc added the priority/important-soon label on Jan 28, 2019
@roberthbailey

/assign @girikuncoro

@ncdc
Contributor

ncdc commented Feb 28, 2019

This sounds like a provider-specific issue; however, I do think it's related to #520. Does that sound right, @davidewatson?

@ncdc
Contributor

ncdc commented Mar 6, 2019

Moving to Next, since we've deferred #520. Please let me know if you think this should stay in v1alpha1.

/milestone Next

k8s-ci-robot modified the milestones: v1alpha1, Next on Mar 6, 2019
@lzang

lzang commented Mar 14, 2019

Hit a similar issue on my setup, then figured out that for the validation to pass, NodeRef needs to be non-nil, which means the node needs to have the correct annotation. You can describe your nodes and see if they have the "cluster.k8s.io/machine" annotation.
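For example, to check every node at once (the backslashes escape the dots inside the annotation key; nodes missing the annotation show <none>):

$ kubectl --kubeconfig kubeconfig get nodes -o custom-columns='NODE:.metadata.name,MACHINE:.metadata.annotations.cluster\.k8s\.io/machine'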

@vincepri
Member

/area ux

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Sep 8, 2019
timothysc added the area/clusterctl label on Sep 26, 2019
@timothysc
Member

Closing, we are going to take a different road.

jayunit100 pushed a commit to jayunit100/cluster-api that referenced this issue on Jan 31, 2020:
…t-docs
Document the API endpoint discovery process