sql/stats: bring back guard against non-zero NumRange in forecasts #144037

Merged: 1 commit, Apr 8, 2025

@michae2 michae2 commented Apr 8, 2025

It occurred to me tonight that the code we removed in #143955 not only generated a sentry report, but also returned an error instead of producing a faulty histogram. I too hope that #93892 is now fixed, but it seems wise to at least keep some code that guards against faulty histograms, even if we don't think the sentry report is necessary any more.

Informs: #93892

Epic: None

Release note: None

@michae2 michae2 requested a review from yuzefovich April 8, 2025 05:27
@michae2 michae2 requested a review from a team as a code owner April 8, 2025 05:27
blathers-crl bot commented Apr 8, 2025

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@michae2 michae2 requested a review from a team April 8, 2025 05:28

@yuzefovich yuzefovich left a comment

Thanks for catching this! I agree that more guardrails never hurt - I didn't realize that the check was useful as a guardrail.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2)


pkg/sql/stats/forecast.go line 335 at r1 (raw file):

		forecast.setHistogramBuckets(hist)

		// Verify that the first two buckets (the initial NULL bucket and the first

nit: this verification is stricter than the one we do in props/histogram.go. There we return the "the first bucket should have NumRange=0" assertion error in two spots, but in both we look only at the 0th bucket and only at its NumRange value (multiplied by selectivity). Why do we deviate here and verify more?

@michae2 michae2 left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/stats/forecast.go line 335 at r1 (raw file):

In props/histogram.go, in both filter and maxDistinctValuesCount, we don't know whether we're working with a portion of a histogram (one that has already been filtered) or an entire histogram. So we only check the first bucket.

Here in forecasting we know we're working with the entire histogram, including the synthesized NULL bucket if it exists, so we might as well check that there's nothing between the synthesized NULL bucket and the first non-NULL value.
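
To make the difference in scope concrete, here is a minimal sketch of the two checks. It is not the code from this PR: the `bucket` type, its `IsNull` field, and both function names are hypothetical stand-ins, and only `NumRange` is modeled. The narrow check can only look at bucket 0 because the input may be a filtered slice of a histogram; the stricter check assumes it is given the entire histogram and walks from the optional NULL bucket through the first non-NULL bucket.

```go
package main

import "fmt"

// bucket is a hypothetical, simplified histogram bucket. NumRange is the field
// discussed in this thread; IsNull is an assumption standing in for "this
// bucket's upper bound is NULL".
type bucket struct {
	IsNull   bool
	NumEq    float64 // rows equal to the bucket's upper bound
	NumRange float64 // rows strictly below the upper bound, above the previous one
}

// checkFirstBucket is the narrow check: the slice may be a filtered portion of
// a larger histogram, so only bucket 0 can be verified.
func checkFirstBucket(h []bucket) error {
	if len(h) > 0 && h[0].NumRange != 0 {
		return fmt.Errorf("the first bucket should have NumRange=0, got %v", h[0].NumRange)
	}
	return nil
}

// checkLeadingBuckets is the stricter check for a complete histogram: every
// bucket up to and including the first non-NULL bucket must have NumRange=0,
// because nothing can sit between a NULL bucket and the first non-NULL value.
func checkLeadingBuckets(h []bucket) error {
	for i, b := range h {
		if b.NumRange != 0 {
			return fmt.Errorf("bucket %d should have NumRange=0, got %v", i, b.NumRange)
		}
		if !b.IsNull {
			break // stop after the first non-NULL bucket
		}
	}
	return nil
}
```

With a histogram such as `[]bucket{{IsNull: true}, {NumRange: 5}}`, the narrow check passes while the stricter one returns an error, which is the case a forecast-side guard is meant to catch.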

michae2 commented Apr 8, 2025

TFTR!

bors r=yuzefovich

craig bot commented Apr 8, 2025

Build failed (retrying...):

craig bot commented Apr 8, 2025

This PR was included in a batch that successfully built, but then failed to merge into master (it was a non-fast-forward update). It will be automatically retried.

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2)


pkg/sql/stats/forecast.go line 335 at r1 (raw file):

I see, makes sense about the bucket, thanks.

The other part of my comment was about also verifying DistinctRange: it doesn't look like we assert anything about that in props/histogram. Do we add that check here because NumRange = 0 with DistinctRange > 0 doesn't make sense in general, even if we don't assert that later?

@craig craig bot merged commit c285f81 into cockroachdb:master Apr 8, 2025
24 checks passed
@michae2 michae2 deleted the unrevert-guard branch April 8, 2025 21:34

@michae2 michae2 left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/stats/forecast.go line 335 at r1 (raw file):

yes, that's right
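
For illustration only, the combined condition agreed on here (an empty range should have neither rows nor distinct values) could be sketched as below. The `rangeCounts` type and `checkEmptyRange` function are hypothetical names, not identifiers from forecast.go.

```go
package main

import "fmt"

// rangeCounts is a hypothetical pair of per-bucket estimates for the values
// strictly inside a bucket's range.
type rangeCounts struct {
	NumRange      float64 // estimated rows in the range
	DistinctRange float64 // estimated distinct values in the range
}

// checkEmptyRange returns an error unless both counts are zero. Checking
// DistinctRange as well catches the inconsistent case NumRange=0 with
// DistinctRange>0, which a NumRange-only assertion would let through.
func checkEmptyRange(c rangeCounts) error {
	if c.NumRange != 0 || c.DistinctRange != 0 {
		return fmt.Errorf(
			"expected empty range, got NumRange=%v DistinctRange=%v",
			c.NumRange, c.DistinctRange,
		)
	}
	return nil
}
```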
