sql: deflake TestDropDatabaseDeleteData #140960

Merged 1 commit on Feb 11, 2025

iskettaneh
Contributor

This commit deflakes the test by blocking the full reconciliation from starting before we drop the database. This avoids a race condition: if the full reconciliation starts after we drop the database, it ignores the dropped database. That leaves a race between the SQLWatcher and the GC job, where the SQLWatcher might write the zone config of table1 before that of table2, while the GC queue might not yet know that the range needs to be split because the tables have different span configs.

Release note: None

Fixes: #138185

@iskettaneh iskettaneh requested review from a team as code owners February 10, 2025 02:42
@iskettaneh
Contributor Author

@arulajmani @rafiss for context, this PR fixes the second issue in the test where table2 is GCed with table1, even though we only set the TTL=0 for table1.

This seems to happen if:

  1. We drop the database
  2. The full reconciler runs
  3. We set table1 zone config TTL=0
  4. SQLWatcher streams that table1 has TTL=0
  5. GC Queue runs

At (5), the GC queue might not know about table2's zone config, and might delete both table1 and table2. However, if we swap (1) and (2), it won't delete table2's data with table1.

Collaborator

@arulajmani arulajmani left a comment

:lgtm:

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rafiss)

Collaborator

@rafiss rafiss left a comment

lgtm! thanks for the detailed analysis

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @iskettaneh)

@rafiss rafiss added the backport-25.1.x Flags PRs that need to be backported to 25.1 label Feb 11, 2025
@iskettaneh
Contributor Author

TFTR!

bors r+

@craig
Contributor

craig bot commented Feb 11, 2025

Build failed (retrying...):

  • unit_tests

@craig
Contributor

craig bot commented Feb 11, 2025

@craig craig bot merged commit 037f9bf into cockroachdb:master Feb 11, 2025
22 checks passed
Linked issue: sql: TestDropDatabaseDeleteData failed