Skip to content

BugFix: Switchover (during a Node drain) fails randomly in synchronous mode #1984

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 19, 2022

Conversation

JBWatenbergScality
Copy link
Contributor

Change1: MigrateMasterPod now use getSwitchOverCandidate to select a new masterCandidate
Change2: Removed unused masterCandidate (replaced by getSwitchOverCandidate)
Fixes: #1983

@JBWatenbergScality JBWatenbergScality changed the title Synchronous switchover error BugFix: Switchover (during a Node drain) fails randomly in synchronous mode Aug 2, 2022
@JBWatenbergScality JBWatenbergScality force-pushed the synchronous-switchover-error branch 2 times, most recently from e55aec0 to 0700909 Compare August 2, 2022 20:54
@JBWatenbergScality JBWatenbergScality marked this pull request as ready for review August 2, 2022 20:59
@FxKu FxKu added this to the 1.9 milestone Aug 8, 2022
if err != nil {
return fmt.Errorf("could not get master candidate pod: %v", err)
}

Copy link
Member

@FxKu FxKu Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSwitchoverCandidate only picks a replica when state is running. I don't think we need another API call here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed but then I will need to change getSwitchoverCandidate to make it return a Pod instead of a NamespacedName as below methods require a Pod. Another alternative would be to create a new getSwitchoverCandidatePod and refactor getSwitchoverCandidate to use that one if we want to keep the old signature. Which approach would you prefer ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see. I think in this case, it's fine then. We would have to do the call anyway - whether it's here or in getSwitchoverCandidate. And the call does not happen each sync, so it's fine.

@JBWatenbergScality JBWatenbergScality force-pushed the synchronous-switchover-error branch from 0700909 to ffe6fe6 Compare August 12, 2022 15:32
@JBWatenbergScality
Copy link
Contributor Author

Hi @FxKu , I just rebased my branch as I saw the End-2-End tests were failing for reasons that doesn't seems to be related to my change. Not sure it will help through. Should I investigate those or assume that it is a flaky/ignore them ?

@Jan-M
Copy link
Member

Jan-M commented Aug 18, 2022

👍

1 similar comment
@FxKu
Copy link
Member

FxKu commented Aug 19, 2022

👍

@FxKu FxKu merged commit b91b69c into zalando:master Aug 19, 2022
@FxKu
Copy link
Member

FxKu commented Aug 19, 2022

Thanks @JBWatenbergScality for this nice contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switchover (during a Node drain) fails randomly in synchronous mode
3 participants