handle follower dying while being promoted to leader #15134

Merged
merged 2 commits into from
Mar 3, 2025

Conversation

vporyadke
Collaborator

Changelog entry

fix a bug where a tablet would remain inactive after failing during promotion from follower to leader #14961

Changelog category

  • Bugfix

Description for reviewers

...

github-actions bot commented Feb 27, 2025

2025-02-27 12:29:09 UTC Pre-commit check linux-x86_64-relwithdebinfo for d6a877c has started.
2025-02-27 12:29:13 UTC Artifacts will be uploaded here
2025-02-27 12:32:09 UTC ya make is running...
🟡 2025-02-27 13:23:13 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
19192 17831 0 10 1215 136

2025-02-27 13:24:54 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-02-27 13:38:56 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
212 (only retried tests) 82 0 1 0 129

2025-02-27 13:39:06 UTC ya make is running... (failed tests rerun, try 3)
🔴 2025-02-27 13:51:31 UTC Some tests failed, follow the links below.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
193 (only retried tests) 65 0 1 0 127

🟢 2025-02-27 13:51:43 UTC Build successful.
🟢 2025-02-27 13:52:04 UTC ydbd size 2.1 GiB changed* by +1.6 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 3875124 merge: d6a877c diff diff %
ydbd size 2 286 685 128 Bytes 2 286 686 776 Bytes +1.6 KiB +0.000%
ydbd stripped size 479 417 824 Bytes 479 418 272 Bytes +448 Bytes +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

🟢 2025-02-27 12:29:26 UTC The validation of the Pull Request description is successful.

github-actions bot commented Feb 27, 2025

2025-02-27 12:31:59 UTC Pre-commit check linux-x86_64-release-asan for d6a877c has started.
2025-02-27 12:32:13 UTC Artifacts will be uploaded here
2025-02-27 12:35:09 UTC ya make is running...
🟡 2025-02-27 13:41:44 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
11722 11560 0 109 24 29

2025-02-27 13:42:46 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-02-27 14:01:11 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
226 (only retried tests) 183 0 16 1 26

2025-02-27 14:01:20 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-02-27 14:12:49 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
76 (only retried tests) 50 0 0 0 26

🟢 2025-02-27 14:12:56 UTC Build successful.
🟢 2025-02-27 14:13:28 UTC ydbd size 3.7 GiB changed* by +3.3 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 3875124 merge: d6a877c diff diff %
ydbd size 3 985 410 960 Bytes 3 985 414 336 Bytes +3.3 KiB +0.000%
ydbd stripped size 1 387 765 512 Bytes 1 387 767 048 Bytes +1.5 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

vporyadke added the rebase-and-check label (Rebase PR with the current base branch and check) Feb 27, 2025
github-actions bot removed the rebase-and-check label (Rebase PR with the current base branch and check) Feb 27, 2025

github-actions bot commented Feb 27, 2025

2025-02-27 15:23:59 UTC Pre-commit check linux-x86_64-release-asan for 047d5b9 has started.
2025-02-27 15:24:26 UTC Artifacts will be uploaded here
2025-02-27 15:27:09 UTC ya make is running...
🟡 2025-02-27 16:34:05 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
11722 11604 0 78 7 33

2025-02-27 16:36:23 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-02-27 16:49:05 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
167 (only retried tests) 135 0 2 1 29

2025-02-27 16:49:17 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-02-27 16:59:58 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
60 (only retried tests) 31 0 0 0 29

🟢 2025-02-27 17:00:05 UTC Build successful.
🟢 2025-02-27 17:00:36 UTC ydbd size 3.7 GiB changed* by +3.3 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c5b2980 merge: 047d5b9 diff diff %
ydbd size 3 985 411 216 Bytes 3 985 414 560 Bytes +3.3 KiB +0.000%
ydbd stripped size 1 387 765 576 Bytes 1 387 767 112 Bytes +1.5 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Feb 27, 2025

2025-02-27 15:26:43 UTC Pre-commit check linux-x86_64-relwithdebinfo for 047d5b9 has started.
2025-02-27 15:26:47 UTC Artifacts will be uploaded here
2025-02-27 15:29:41 UTC ya make is running...
🟡 2025-02-27 16:15:24 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
19192 17843 0 1 1215 133

2025-02-27 16:17:44 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-02-27 16:32:52 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
197 (only retried tests) 72 0 0 0 125

🟢 2025-02-27 16:33:26 UTC Build successful.
🟢 2025-02-27 16:34:10 UTC ydbd size 2.1 GiB changed* by +1.5 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c5b2980 merge: 047d5b9 diff diff %
ydbd size 2 286 685 160 Bytes 2 286 686 744 Bytes +1.5 KiB +0.000%
ydbd stripped size 479 417 824 Bytes 479 418 272 Bytes +448 Bytes +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

vporyadke requested a review from CyberROFL February 27, 2025 23:31
vporyadke linked an issue Feb 27, 2025 that may be closed by this pull request
ctx.Send(entry.Tablet, new TEvTablet::TEvPromoteToLeader(suggestedGen, info));
MarkDeadTablet(it->first, 0, TEvLocal::TEvTabletStatus::StatusSupersededByLeader, TEvTablet::TEvTabletDead::ReasonError, ctx);
OnlineTablets.erase(it);
it->second.IsPromoting = true;
Member

Why set it twice?

Collaborator Author

I use the same flag on both sides: for the follower, which is in OnlineTablets, and for the leader, which is in InbootTablets.

Member

entry.IsPromoting = true;

where entry is:

entry = it->second;

Then why is this line needed?

it->second.IsPromoting = true;
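
The question above depends on whether `entry` aliases `it->second` or is a copy of it. The sketch below illustrates the difference only. It is not the code from this PR: `TEntry`, the `std::unordered_map` container, and both function names are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a tablet entry; not the real YDB type.
struct TEntry {
    bool IsPromoting = false;
};

// entry is a reference, so the write is visible through the iterator
// and a second assignment via it->second would be redundant.
bool WriteThroughReference() {
    std::unordered_map<uint64_t, TEntry> tablets{{1, TEntry{}}};
    auto it = tablets.find(1);
    TEntry& entry = it->second;
    entry.IsPromoting = true;
    return it->second.IsPromoting;
}

// entry is a copy, so the stored element is left unchanged
// and it->second would have to be updated separately.
bool WriteThroughCopy() {
    std::unordered_map<uint64_t, TEntry> tablets{{1, TEntry{}}};
    auto it = tablets.find(1);
    TEntry entry = it->second;
    entry.IsPromoting = true;
    return it->second.IsPromoting;
}
```

If `entry` is a reference in the real code, the second assignment repeats the first; if the two names refer to entries in different containers, as the author's reply suggests, both writes are needed.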

Comment on lines 730 to 731
inbootIt->second.IsPromoting = false;
inbootIt->second.PromotingFromFollower = 0;
Member

I suggest adding StartPromotion/FinishPromotion functions and setting these fields in them.
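
A minimal sketch of what such helpers could look like, assuming they live on the entry type. Only the field names `IsPromoting` and `PromotingFromFollower` and the helper names come from this review thread. The struct name `TInbootEntry`, the field types, and the `followerId` parameter are assumptions made for illustration and may not match the merged code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an in-boot tablet entry; not the real YDB type.
struct TInbootEntry {
    bool IsPromoting = false;
    uint32_t PromotingFromFollower = 0;  // assumed to hold a follower id

    // Marks the entry as being promoted from the given follower.
    void StartPromotion(uint32_t followerId) {
        IsPromoting = true;
        PromotingFromFollower = followerId;
    }

    // Resets both fields, mirroring the two assignments quoted above.
    void FinishPromotion() {
        IsPromoting = false;
        PromotingFromFollower = 0;
    }
};
```

Keeping both assignments in one place means a caller cannot reset one field and forget the other.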

github-actions bot commented Mar 3, 2025

2025-03-03 09:09:31 UTC Pre-commit check linux-x86_64-relwithdebinfo for 2391fa7 has started.
2025-03-03 09:09:45 UTC Artifacts will be uploaded here
2025-03-03 09:12:29 UTC ya make is running...
🟢 2025-03-03 09:58:21 UTC Tests successful.

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
19162 17801 0 0 1227 134

🟢 2025-03-03 09:59:53 UTC Build successful.
🟢 2025-03-03 10:00:09 UTC ydbd size 2.1 GiB changed* by +1.6 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: b0bd8f9 merge: 2391fa7 diff diff %
ydbd size 2 288 336 672 Bytes 2 288 338 352 Bytes +1.6 KiB +0.000%
ydbd stripped size 479 642 688 Bytes 479 643 072 Bytes +384 Bytes +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Mar 3, 2025

2025-03-03 09:09:50 UTC Pre-commit check linux-x86_64-release-asan for 2391fa7 has started.
2025-03-03 09:10:04 UTC Artifacts will be uploaded here
2025-03-03 09:12:49 UTC ya make is running...
🟡 2025-03-03 10:16:54 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
11683 11543 0 81 23 36

2025-03-03 10:18:00 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-03-03 10:31:18 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
195 (only retried tests) 164 0 1 0 30

2025-03-03 10:31:27 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-03-03 10:49:24 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
56 (only retried tests) 29 0 0 0 27

🟢 2025-03-03 10:49:30 UTC Build successful.
🟢 2025-03-03 10:49:57 UTC ydbd size 3.7 GiB changed* by +3.1 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: b0bd8f9 merge: 2391fa7 diff diff %
ydbd size 3 988 298 944 Bytes 3 988 302 128 Bytes +3.1 KiB +0.000%
ydbd stripped size 1 388 482 184 Bytes 1 388 483 464 Bytes +1.2 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

vporyadke merged commit 1f60fc8 into ydb-platform:main Mar 3, 2025
12 checks passed
lberserq pushed a commit to lberserq/ydb that referenced this pull request Mar 3, 2025
vporyadke added a commit to vporyadke/ydb that referenced this pull request Mar 6, 2025
vporyadke added a commit to vporyadke/ydb that referenced this pull request Mar 6, 2025
vporyadke added a commit to vporyadke/ydb that referenced this pull request Mar 12, 2025
Successfully merging this pull request may close these issues.

Local does not notice failure to promote follower
2 participants