
Cluster validation fails even though cluster has been successfully created #678


Closed
girikuncoro opened this issue Jan 14, 2019 · 9 comments
Labels
area/clusterctl: Issues or PRs related to clusterctl
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@girikuncoro
Contributor

/kind bug

What steps did you take and what happened:
Running cluster validation after the cluster has been created:

$ kubectl --kubeconfig kubeconfig get node
NAME               STATUS   ROLES    AGE     VERSION
gce-master-5kbc8   Ready    master   11m     v1.12.0
gce-node-gm7mx     Ready    <none>   4m50s   v1.12.0
$ kubectl --kubeconfig kubeconfig get machines
NAME               AGE
gce-master-5kbc8   11m
gce-node-gm7mx     11m
$ clusterctl validate cluster --kubeconfig kubeconfig
I0115 06:09:18.508437   44513 machineactuator.go:813] Using the default GCP client
I0115 06:09:18.508793   44513 plugins.go:39] Registered cluster provisioner "google"
Validating Cluster API objects in namespace ""
Checking cluster object "test1-chl6h"... PASS
Checking machine object "gce-master-5kbc8"... FAIL
	The corresponding node is missing.
Checking machine object "gce-node-gm7mx"... FAIL
	The corresponding node is missing.
ERROR: Machine objects failed the validation.

What did you expect to happen:
All checks should have passed, since all Machine objects exist and the corresponding nodes are Ready.

Environment:

  • Cluster-api version:
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T19:44:10Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T16:55:41Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):
Ubuntu 16.04.5 LTS (Xenial Xerus)
k8s-ci-robot added the kind/bug label on Jan 14, 2019
@vincepri
Member

I think this is caused by NodeRef not being populated in the GCP provider (the same goes for the AWS provider). This code https://github.com/kubernetes-sigs/cluster-api/blob/master/cmd/clusterctl/validation/validate_cluster_api_objects.go#L97-L134 uses NodeRef (which is also optional) to get the node name, which seems unnecessary (unless I'm missing something) given that we can get the name from the Machine object.
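A quick way to confirm whether NodeRef is populated is to list the Machine objects with a custom column (a sketch assuming the v1alpha1 API, where the reference lives at .status.nodeRef; machines without one show <none>):

$ kubectl --kubeconfig kubeconfig get machines -o custom-columns=MACHINE:.metadata.name,NODEREF:.status.nodeRef.name

Any machine showing <none> here is one the validator will flag with "The corresponding node is missing."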

@girikuncoro
Contributor Author

Thanks for the suggestion, I'll try to fix this.

roberthbailey added this to the v1alpha1 milestone on Jan 23, 2019
timothysc added the priority/important-soon label on Jan 28, 2019
@roberthbailey

/assign @girikuncoro

@ncdc
Contributor

ncdc commented Feb 28, 2019

This sounds like a provider-specific issue; however, I do think it's related to #520. Does that sound right, @davidewatson?

@ncdc
Contributor

ncdc commented Mar 6, 2019

Moving to Next, since we've deferred #520. Please let me know if you think this should stay in v1alpha1.

/milestone Next

k8s-ci-robot modified the milestones: v1alpha1, Next on Mar 6, 2019
@lzang

lzang commented Mar 14, 2019

Hit a similar issue on my setup, then figured out that for the validation to pass, NodeRef needs to be non-nil, which means the node needs to have the correct annotation. You can describe your nodes and see if they have the "cluster.k8s.io/machine" annotation.
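For example, to check every node at once (the backslashes escape the dots inside the annotation key; nodes missing the annotation show <none>):

$ kubectl --kubeconfig kubeconfig get nodes -o custom-columns='NODE:.metadata.name,MACHINE:.metadata.annotations.cluster\.k8s\.io/machine'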

@vincepri
Member

/area ux

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Sep 8, 2019
timothysc added the area/clusterctl label on Sep 26, 2019
@timothysc
Member

Closing, we are going to take a different road.

jayunit100 pushed a commit to jayunit100/cluster-api that referenced this issue on Jan 31, 2020:
…t-docs
Document the API endpoint discovery process