OCPBUGS-42150: (fix) registry pods do not come up again after node failure (#3366) #872
@anik120: This pull request references Jira Issue OCPBUGS-42150, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug.
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: anik120. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Force-pushed: 435bfa5 to 6bb7fa7
…ilure (#3366)

[PR 3201](operator-framework/operator-lifecycle-manager#3201) attempted to solve the issue by deleting pods stuck in `Terminating` because their node is unreachable. However, that logic lived in `EnsureRegistryServer`, which is only executed if polling is requested by the user.

This PR moves the dead-pod check out of `EnsureRegistryServer` and into `CheckRegistryServer`. If any dead pods are detected during `CheckRegistryServer`, `healthy` is returned as `false`, which in turn triggers `EnsureRegistryServer`.

Upstream-repository: operator-lifecycle-manager
Upstream-commit: f2431893193e7112f78298ad7682ff3e1b179d8c
Force-pushed: 6bb7fa7 to a1c914c
/test e2e-gcp-olm-flaky
Test pass, details: https://issues.redhat.com/browse/OCPBUGS-42150
@anik120: This pull request references Jira Issue OCPBUGS-42150, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.
/retest
/label backport-risk-assessed
/retest
@anik120: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Merged commit 01eee9c into openshift:release-4.14
@anik120: Jira Issue OCPBUGS-42150: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-42150 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER] Distgit: operator-lifecycle-manager
[ART PR BUILD NOTIFIER] Distgit: operator-registry
PR 3201 attempted to solve the issue by deleting pods stuck in `Terminating` because their node is unreachable. However, that logic lived in `EnsureRegistryServer`, which is only executed if polling is requested by the user.

This PR moves the dead-pod check out of `EnsureRegistryServer` and into `CheckRegistryServer`. If any dead pods are detected during `CheckRegistryServer`, `healthy` is returned as `false`, which in turn triggers `EnsureRegistryServer`.

Upstream-repository: operator-lifecycle-manager
Upstream-commit: f2431893193e7112f78298ad7682ff3e1b179d8c
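
The control flow described in this PR can be sketched roughly as follows. This is a simplified, self-contained illustration, not the actual operator-lifecycle-manager code: the `pod` struct, `isDead`, `reconcile`, and the lowercase `checkRegistryServer` / `ensureRegistryServer` functions are hypothetical stand-ins, and their signatures are assumptions that do not match the real `CheckRegistryServer` and `EnsureRegistryServer`. The point is only that the dead-pod check happens in the health check, so an unhealthy result leads to the repair step whether or not polling is configured.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for a Kubernetes Pod. Only the two facts
// relevant to this sketch are modeled.
type pod struct {
	name          string
	terminating   bool // deletion was requested but the pod is still present
	nodeReachable bool // whether the node hosting the pod still responds
}

// isDead reports whether a pod is stuck in Terminating on an unreachable node.
// (Assumed definition of "dead" for this sketch.)
func isDead(p pod) bool {
	return p.terminating && !p.nodeReachable
}

// checkRegistryServer plays the role of the health check: it returns
// healthy == false if there are no pods or if any pod is dead.
func checkRegistryServer(pods []pod) (healthy bool) {
	if len(pods) == 0 {
		return false
	}
	for _, p := range pods {
		if isDead(p) {
			return false
		}
	}
	return true
}

// ensureRegistryServer plays the role of the repair step: it drops dead pods
// and adds a replacement if no live pod remains.
func ensureRegistryServer(pods []pod) []pod {
	live := make([]pod, 0, len(pods))
	for _, p := range pods {
		if !isDead(p) {
			live = append(live, p)
		}
	}
	if len(live) == 0 {
		live = append(live, pod{name: "registry-replacement", nodeReachable: true})
	}
	return live
}

// reconcile runs the check first and only calls the repair step when the
// check reports unhealthy. The second return value says whether a repair ran.
func reconcile(pods []pod) ([]pod, bool) {
	if checkRegistryServer(pods) {
		return pods, false
	}
	return ensureRegistryServer(pods), true
}

func main() {
	// A single registry pod stuck in Terminating after its node failed.
	pods := []pod{{name: "registry-abc", terminating: true, nodeReachable: false}}
	pods, repaired := reconcile(pods)
	fmt.Println("repaired:", repaired, "pods:", len(pods), "first:", pods[0].name)
}
```

In this sketch, a healthy pod set passes through `reconcile` unchanged, while a set containing a dead pod is reported unhealthy and handed to the repair step.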