GCP: installer-based install with 4.17 fails #2127
Unanswered
andrebiegel asked this question in Q&A
Hi all,
I'm trying an installer-based OKD installation on GCP and it fails.
Can anybody give me a hint how to fix this, or how to start an installation routine that works?
Sorry, I'm new to this topic, so please be patient.
Log:
time="2025-04-02T10:34:29Z" level=info msg="Failed to gather bootstrap logs: failed to create SSH client: failed to use pre-existing agent, make sure the appropriate keys exist in the agent for authentication: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingSubsets::OAuthServerServiceEndpointAccessibleController_SyncError::OAuthServerServiceEndpointsEndpointAccessibleController_SyncError: IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server\nOAuthServerServiceEndpointAccessibleControllerDegraded: Get \"https://172.30.250.44:443/healthz\": dial tcp 172.30.250.44:443: connect: connection refused\nOAuthServerServiceEndpointsEndpointAccessibleControllerDegraded: oauth service endpoints are not ready"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator authentication Available is False with APIServices_PreconditionNotReady::OAuthServerServiceEndpointAccessibleController_EndpointUnavailable::OAuthServerServiceEndpointsEndpointAccessibleController_ResourceNotFound: APIServicesAvailable: PreconditionNotReady\nOAuthServerServiceEndpointAccessibleControllerAvailable: Get \"https://172.30.250.44:443/healthz\": dial tcp 172.30.250.44:443: connect: connection refused\nOAuthServerServiceEndpointsEndpointAccessibleControllerAvailable: endpoints \"oauth-openshift\" not found"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator authentication EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator baremetal Disabled is False with : "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator cloud-controller-manager TrustedCABundleControllerControllerAvailable is True with AsExpected: Trusted CA Bundle Controller works as expected"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator cloud-controller-manager TrustedCABundleControllerControllerDegraded is False with AsExpected: Trusted CA Bundle Controller works as expected"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator cloud-controller-manager CloudConfigControllerAvailable is True with AsExpected: Cloud Config Controller works as expected"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator cloud-controller-manager CloudConfigControllerDegraded is False with AsExpected: Cloud Config Controller works as expected"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator config-operator EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator csi-snapshot-controller EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator etcd Degraded is True with GuardController_SyncError::MissingStaticPodController_SyncError: GuardControllerDegraded: [Missing operand on node dc-okd-cspwl-master-1.europe-west1-c.c.dc-okd.internal, Missing operand on node dc-okd-cspwl-master-2.europe-west1-d.c.dc-okd.internal]\nMissingStaticPodControllerDegraded: static pod lifecycle failure - static pod: \"etcd\" in namespace: \"openshift-etcd\" for revision: 3 on node: \"dc-okd-cspwl-master-1.europe-west1-c.c.dc-okd.internal\" didn't show up, waited: 3m0s"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator etcd Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 0; 1 node is at revision 1; 0 nodes have achieved new revision 3"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator etcd EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator ingress Available is False with IngressUnavailable: The \"default\" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator ingress Progressing is True with Reconciling: ingresscontroller \"default\" is progressing: IngressControllerProgressing: One or more status conditions indicate progressing: DeploymentRollingOut=True (DeploymentRollingOut: Waiting for router deployment rollout to finish: 0 of 2 updated replica(s) are available...\n).\nNot all ingress controllers are available."
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator ingress Degraded is True with IngressDegraded: The \"default\" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.), DeploymentReplicasMinAvailable=False (DeploymentMinimumReplicasNotMet: 0/2 of replicas are available, max unavailable is 1)"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator ingress EvaluationConditionsDetected is False with AsExpected: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator insights ClusterTransferAvailable is False with NoClusterTransfer: no available cluster transfer"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator insights Disabled is False with AsExpected: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator insights RemoteConfigurationAvailable is True with AsExpected: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator insights RemoteConfigurationValid is True with AsExpected: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator insights SCAAvailable is True with Updated: SCA certs successfully updated in the etc-pki-entitlement secret"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator kube-apiserver Degraded is True with GuardController_SyncError::NodeInstaller_InstallerPodFailed: GuardControllerDegraded: [Missing operand on node dc-okd-cspwl-master-0.europe-west1-b.c.dc-okd.internal, Missing operand on node dc-okd-cspwl-master-1.europe-west1-c.c.dc-okd.internal, Missing operand on node dc-okd-cspwl-master-2.europe-west1-d.c.dc-okd.internal]\nNodeInstallerDegraded: 1 nodes are failing on revision 6:\nNodeInstallerDegraded: installer: The container could not be located when the pod was terminated"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 6"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator kube-apiserver Available is False with StaticPods_ZeroNodesActive: StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 6"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-apiserver EvaluationConditionsDetected is False with AsExpected: All is well"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator kube-controller-manager Degraded is True with GuardController_SyncError::MissingStaticPodController_SyncError: GuardControllerDegraded: [Missing operand on node dc-okd-cspwl-master-1.europe-west1-c.c.dc-okd.internal, Missing operand on node dc-okd-cspwl-master-2.europe-west1-d.c.dc-okd.internal]\nMissingStaticPodControllerDegraded: static pod lifecycle failure - static pod: \"kube-controller-manager\" in namespace: \"openshift-kube-controller-manager\" for revision: 4 on node: \"dc-okd-cspwl-master-1.europe-west1-c.c.dc-okd.internal\" didn't show up, waited: 3m0s"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-controller-manager Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 0; 1 node is at revision 3; 0 nodes have achieved new revision 4"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-controller-manager EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator kube-scheduler Degraded is True with GuardController_SyncError::MissingStaticPodController_SyncError: GuardControllerDegraded: Missing operand on node dc-okd-cspwl-master-0.europe-west1-b.c.dc-okd.internal\nMissingStaticPodControllerDegraded: static pod lifecycle failure - static pod: \"openshift-kube-scheduler\" in namespace: \"openshift-kube-scheduler\" for revision: 7 on node: \"dc-okd-cspwl-master-0.europe-west1-b.c.dc-okd.internal\" didn't show up, waited: 3m0s"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-scheduler Progressing is True with NodeInstaller: NodeInstallerProgressing: 1 node is at revision 0; 2 nodes are at revision 6; 0 nodes have achieved new revision 7"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-scheduler EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator kube-storage-version-migrator EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator machine-config EvaluationConditionsDetected is False with AsExpected: "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator monitoring Available is False with PlatformTasksFailed: UpdatingAlertmanager: reconciling Alertmanager Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingThanosQuerier: reconciling Thanos Querier Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingConsolePluginComponents: reconciling Console Plugin failed: waiting for ConsolePlugin failed: context deadline exceeded: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io), UpdatingPrometheus: reconciling Prometheus API Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingPrometheus: Prometheus \"openshift-monitoring/k8s\": failed to get: prometheuses.monitoring.coreos.com \"k8s\" not found"
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator monitoring Degraded is True with PlatformTasksFailed: UpdatingAlertmanager: reconciling Alertmanager Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingThanosQuerier: reconciling Thanos Querier Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingConsolePluginComponents: reconciling Console Plugin failed: waiting for ConsolePlugin failed: context deadline exceeded: creating ConsolePlugin object failed: the server could not find the requested resource (post consoleplugins.console.openshift.io), UpdatingPrometheus: reconciling Prometheus API Route failed: creating Route object failed: the server could not find the requested resource (post routes.route.openshift.io), UpdatingPrometheus: Prometheus \"openshift-monitoring/k8s\": failed to get: prometheuses.monitoring.coreos.com \"k8s\" not found"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack."
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator network ManagementStateDegraded is False with : "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator openshift-apiserver Available is False with APIServices_PreconditionNotReady: APIServicesAvailable: PreconditionNotReady"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator openshift-apiserver EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator openshift-controller-manager EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=error msg="Cluster operator operator-lifecycle-manager-packageserver Available is False with ClusterServiceVersionNotSucceeded: ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver observed in phase Failed with reason: InstallCheckFailed, message: install timeout"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator operator-lifecycle-manager-packageserver Progressing is True with : Working toward 0.0.1-snapshot"
time="2025-04-02T10:34:29Z" level=info msg="Cluster operator storage EvaluationConditionsDetected is Unknown with NoData: "
time="2025-04-02T10:34:29Z" level=error msg="Bootstrap failed to complete: timed out waiting for the condition"
time="2025-04-02T10:34:29Z" level=error msg="Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane."
time="2025-04-02T10:34:29Z" level=error msg="Invalid log bundle or the bootstrap machine could not be reached and bootstrap logs were not collected"
time="2025-04-02T10:34:29Z" level=info msg="Bootstrap gather logs captured here \"oc-install/log-bundle-20250402103424.tar.gz\""
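Two notes on the log above. The first line is about log collection only, not the install itself: the installer could not SSH to the bootstrap host because no matching key was loaded in the SSH agent. A minimal sketch of fixing that, assuming the key lives at ~/.ssh/id_ed25519 (use whichever private key matches the sshKey in install-config.yaml):

```sh
# Start an agent if one is not already running, then load the key the
# installer expects to find (the path is an assumption; adjust to your key).
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
ssh-add -l   # verify the key is listed
```

The actual failure is that the etcd and kube-apiserver static pods never came up on the masters, and the reason is usually in the bundle the installer already captured; a sketch of where to look:

```sh
# Unpack the captured bundle; the bootstrap/ and control-plane/ directories
# inside hold the journals and pod logs from the affected hosts.
tar -xzf oc-install/log-bundle-20250402103424.tar.gz

# Once SSH works, a fresh bundle can be gathered at any time:
openshift-install gather bootstrap --dir oc-install
```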
-
Replies: 1 comment

The installation is pretty default ... the only thing I changed in install-config.yaml was the number of worker replicas (set to 0), as described in the documentation for creating a 3-node cluster.
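For comparison, a 3-node install-config.yaml usually differs from the default only in the compute replicas field. A minimal sketch, with placeholders for everything not visible in the log (baseDomain, projectID, pullSecret, sshKey):

```yaml
apiVersion: v1
baseDomain: example.com          # placeholder
metadata:
  name: dc-okd                   # cluster name, as seen in the node names above
compute:
- name: worker
  replicas: 0                    # no dedicated workers: a 3-node cluster
controlPlane:
  name: master
  replicas: 3
platform:
  gcp:
    projectID: my-gcp-project    # placeholder
    region: europe-west1         # region, as seen in the node names above
pullSecret: '...'                # placeholder
sshKey: 'ssh-ed25519 AAAA...'    # public half of the key loaded into ssh-agent
```

With compute replicas at 0 the installer marks the control-plane nodes schedulable, so the router pods the log reports as unavailable are expected to run on the masters once the control plane itself is healthy.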