feat(grouping): Add hashing_metadata field to GroupHashMetadata table #80531

Merged (2 commits) on Nov 15, 2024

@lobsterkatie lobsterkatie commented Nov 11, 2024

This adds a new field, hashing_metadata, to the GroupHashMetadata table, to serve as a complement to the hash_basis field added in #79835.

Whereas that field stores the overall grouping method (stacktrace, message, custom fingerprint, etc.), this new field will store more detailed, grouping-method specific data. For example, when grouping on message, it will store whether or not the message was parameterized; when grouping on fingerprint, it will store the source of the fingerprint; when grouping on stacktrace, it will note whether the stacktrace was found in an exception or in a thread. (The full scope of the data stored for each grouping method can be found in the XYZHashingMetadata types added in this PR.)
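
As a rough illustration of the kind of per-method data described above, here is a minimal Python sketch. The type names and keys (`MessageHashingMetadata`, `message_parameterized`, `fingerprint_source`, `stacktrace_location`) and the `build_record` helper are hypothetical stand-ins chosen for this example; they are not the actual `XYZHashingMetadata` definitions from this PR.

```python
from typing import Any, Dict, Literal, TypedDict, Union


# Hypothetical shapes, one per grouping method. The real definitions live in
# the PR's XYZHashingMetadata types and may use different keys.
class MessageHashingMetadata(TypedDict):
    message_parameterized: bool


class FingerprintHashingMetadata(TypedDict):
    fingerprint_source: str


class StacktraceHashingMetadata(TypedDict):
    stacktrace_location: Literal["exception", "thread"]


HashingMetadata = Union[
    MessageHashingMetadata,
    FingerprintHashingMetadata,
    StacktraceHashingMetadata,
]


def build_record(hash_basis: str, hashing_metadata: HashingMetadata) -> Dict[str, Any]:
    """Pair the coarse grouping method with its method-specific details."""
    return {"hash_basis": hash_basis, "hashing_metadata": hashing_metadata}


message_record = build_record("message", {"message_parameterized": True})
stacktrace_record = build_record("stacktrace", {"stacktrace_location": "thread"})
```

The point of the split is that `hash_basis` stays a small, fixed vocabulary, while the detail dictionary can vary in shape from one grouping method to the next.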

Code to add data to this field is included in #80534.

@github-actions github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Nov 11, 2024

@lobsterkatie lobsterkatie marked this pull request as ready for review November 11, 2024 18:02
@lobsterkatie lobsterkatie requested a review from a team as a code owner November 11, 2024 18:02
@lobsterkatie lobsterkatie requested a review from a team November 11, 2024 18:21
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subtypes branch from 025d660 to ba4b790 Compare November 11, 2024 19:00
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 346966d to 3088a38 Compare November 11, 2024 19:00
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subtypes branch from ba4b790 to f4d76ec Compare November 11, 2024 19:09
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 3088a38 to 17feb79 Compare November 11, 2024 19:09
codecov bot commented Nov 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #80531      +/-   ##
==========================================
- Coverage   78.41%   78.41%   -0.01%     
==========================================
  Files        7209     7210       +1     
  Lines      319651   319691      +40     
  Branches    44003    44003              
==========================================
+ Hits       250669   250700      +31     
- Misses      62596    62605       +9     
  Partials     6386     6386              

@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subtypes branch from f4d76ec to a2827ab Compare November 11, 2024 20:17
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 17feb79 to 9851620 Compare November 11, 2024 20:17
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 9851620 to 9dc1640 Compare November 12, 2024 06:36
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subtypes branch from a2827ab to 9722292 Compare November 12, 2024 07:06
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 9dc1640 to 1484bfd Compare November 12, 2024 07:06
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 1484bfd to 15764da Compare November 14, 2024 01:19
@lobsterkatie lobsterkatie requested review from a team as code owners November 14, 2024 01:19
@lobsterkatie
Member Author

> if we use GZipDictField, these parameters won't be queryable in redash/bigquery, which means that overall our idea to use postgres as a datastore won't work.

Right! That's why I originally had it as a JSON field. I couldn't remember why I'd chosen that over the gzipdict, and so I swapped it in order to get the defaults-to-an-empty-dict-rather-than-null behavior. But past-past-me was smarter than recent-past me - you're absolutely right, it should be JSON.

Thank you for catching that!
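
To make the queryability concern concrete, here is a small sketch using only the Python standard library. It assumes, for illustration only, that a gzipped-dict style field ends up storing a compressed, base64-encoded blob; the exact on-disk format of `GZipDictField` is not shown in this thread. The table, column names, and sample value are invented, and SQLite stands in for Postgres.

```python
import base64
import json
import sqlite3
import zlib

# Sample value (hypothetical key and value).
metadata = {"fingerprint_source": "client"}

# JSON text is stored as-is; the "gzipped" variant is compressed and encoded.
as_json = json.dumps(metadata)
as_blob = base64.b64encode(zlib.compress(as_json.encode("utf-8"))).decode("ascii")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (json_col TEXT NULL, blob_col TEXT NULL)")
conn.execute("INSERT INTO demo VALUES (?, ?)", (as_json, as_blob))

# A plain SQL text match can see inside the JSON column...
pattern = '%"fingerprint_source": "client"%'
json_hits = conn.execute(
    "SELECT COUNT(*) FROM demo WHERE json_col LIKE ?", (pattern,)
).fetchone()[0]

# ...but not inside the compressed blob, which is opaque to SQL.
blob_hits = conn.execute(
    "SELECT COUNT(*) FROM demo WHERE blob_col LIKE ?", (pattern,)
).fetchone()[0]

# The blob is only readable after decoding it in application code.
decoded = json.loads(zlib.decompress(base64.b64decode(as_blob)))
```

Here `json_hits` is 1 and `blob_hits` is 0: the compressed column round-trips fine in application code, but a SQL-only tool has nothing to match against.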

@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subclasses branch from 7347bfa to b9eb3d7 Compare November 14, 2024 22:22
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 8150b8a to 9fa314a Compare November 14, 2024 22:22

This PR has a migration; here is the generated SQL for src/sentry/migrations/0791_add_hashing_metadata_to_grouphash_metadata.py:

--
-- Add field hashing_metadata to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "hashing_metadata" text NULL;

# Metadata about the inputs to the hashing process and the hashing process itself (what
# fingerprinting rules were matched? did we parameterize the message? etc.). For the specific
# data stored, see the class definitions of the `HashingMetadata` subtypes.
hashing_metadata: models.Field[HashingMetadata | None, HashingMetadata | None] = JSONField(

why does this need to be repeated (`HashingMetadata | None, HashingMetadata | None`)?

@lobsterkatie lobsterkatie Nov 15, 2024

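
For context on the question above: the reply text is not preserved here, but in the django-stubs typing convention `Field` is generic over two parameters, the type accepted on assignment and the type returned on read, which is the usual reason both appear. The sketch below imitates that idea with a toy descriptor; the `Field`, `IntField`, and `Row` classes are illustrative only and are not Django's or Sentry's implementation.

```python
from typing import Any, Generic, Optional, TypeVar, Union

_ST = TypeVar("_ST")  # "set type": what may be assigned to the attribute
_GT = TypeVar("_GT")  # "get type": what reading the attribute returns


class Field(Generic[_ST, _GT]):
    """Toy descriptor with separate set and get types (illustrative only)."""

    def __set_name__(self, owner: type, name: str) -> None:
        self._name = name

    def __set__(self, instance: Any, value: _ST) -> None:
        instance.__dict__[self._name] = value

    def __get__(self, instance: Any, owner: Optional[type] = None) -> _GT:
        if instance is None:
            return self  # type: ignore[return-value]
        return instance.__dict__.get(self._name)


class IntField(Field[Union[int, str], int]):
    """Accepts an int or a numeric string on assignment, always returns an int."""

    def __set__(self, instance: Any, value: Union[int, str]) -> None:
        super().__set__(instance, int(value))


class Row:
    # Set and get types differ here.
    count = IntField()
    # Set and get types are the same, so the type is written twice.
    metadata: Field[Optional[dict], Optional[dict]] = Field()


row = Row()
row.count = "3"
row.metadata = {"fingerprint_source": "client"}
```

When a field accepts and returns the same type, as `hashing_metadata` does, the two parameters are simply identical.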

@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subclasses branch from b9eb3d7 to a35e40f Compare November 14, 2024 23:29
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 9fa314a to d8bd096 Compare November 14, 2024 23:29
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subclasses branch from a35e40f to ad48885 Compare November 15, 2024 00:47
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from d8bd096 to 17b2bb9 Compare November 15, 2024 00:47
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subclasses branch from ad48885 to 391c240 Compare November 15, 2024 15:22
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 17b2bb9 to bb4227a Compare November 15, 2024 15:23
@lobsterkatie lobsterkatie force-pushed the kmclb-add-grouping-component-subclasses branch from 391c240 to d91f58e Compare November 15, 2024 18:03
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from bb4227a to 6ff5709 Compare November 15, 2024 18:03
Base automatically changed from kmclb-add-grouping-component-subclasses to master November 15, 2024 18:40
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from 6ff5709 to f777dc8 Compare November 15, 2024 19:19
@lobsterkatie lobsterkatie force-pushed the kmclb-add-hashing-metadata-types-and-field branch from f777dc8 to b9e323d Compare November 15, 2024 21:08
@lobsterkatie lobsterkatie merged commit 46db7b3 into master Nov 15, 2024
50 checks passed
@lobsterkatie lobsterkatie deleted the kmclb-add-hashing-metadata-types-and-field branch November 15, 2024 22:01
lobsterkatie added a commit that referenced this pull request Nov 18, 2024
…Metadata` (#80534)

This is a follow-up to #80531 (which added a new `hashing_metadata` field to the `GroupHashMetadata` table), adding code to actually store data in the new field.
@github-actions github-actions bot locked and limited conversation to collaborators Dec 1, 2024

4 participants