feat(related_issues): Trace connected errors #69237
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #69237 +/- ##
==========================================
- Coverage 79.65% 79.48% -0.17%
==========================================
Files 6476 6477 +1
Lines 287490 287527 +37
Branches 49547 49554 +7
==========================================
- Hits 228990 228549 -441
- Misses 58132 58610 +478
Partials 368 368
Given an issue, if we can find an event with a trace, we may be able to find other issues in its trace. This lets the customer act on them together.
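The idea above can be sketched on plain in-memory data. This is only an illustration: `ErrorEvent` and `trace_connected_group_ids` are hypothetical names, and the PR itself runs the lookup as a Snuba query rather than over a Python list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorEvent:
    # Hypothetical, simplified stand-in for a stored error event.
    event_id: str
    group_id: int
    trace_id: Optional[str]


def trace_connected_group_ids(group_id: int, events: list[ErrorEvent]) -> list[int]:
    """Return the ids of other groups that share a trace with `group_id`."""
    # Pick the first event of the group that carries a trace id.
    trace_id = next(
        (e.trace_id for e in events if e.group_id == group_id and e.trace_id),
        None,
    )
    if trace_id is None:
        return []
    # Collect every other group seen in that trace, without duplicates.
    return sorted(
        {e.group_id for e in events if e.trace_id == trace_id and e.group_id != group_id}
    )
```

The given group is excluded from its own result, and a group whose events carry no trace id yields an empty list.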
auto_fields=False,
),
)
error_query = find_errors_for_trace_id(params, trace_id, limit)
I'm going to call find_errors_for_trace_id from my endpoint, so I'm extracting it into its own function.
I'd prefer you re-implement this function rather than re-use it here, so that I can still make changes here without potentially breaking your endpoint.
Especially because the query pattern here is very dependent on feeding the later augment_transactions_with_spans.
wfm
@@ -11,7 +11,10 @@ def same_root_cause_analysis(group: Group) -> list[int]:
"""Analyze and create a group set if the group was caused by the same root cause."""
# Querying the data field (which is a GzippedDictField) cannot be done via
# Django's ORM, thus, we do so via compare_groups
project_groups = RangeQuerySetWrapper(Group.objects.filter(project=group.project_id), limit=100)
project_groups = RangeQuerySetWrapper(
Group.objects.filter(project=group.project_id).exclude(id=group.id),
endpoint = "sentry-api-0-issues-related-issues"
FEATURES: list[str] = []
This is needed in one of the test cases. Ideally, I would fix it up there.
def setUp(self) -> None:
super().setUp()
self.login_as(user=self.user)
self.organization = self.create_organization(owner=self.user)
self.error_type = "ApiTimeoutError"
self.error_value = "Timed out attempting to reach host: api.github.com"
Unrelated change. I could move it into its own PR.
query = QueryBuilder(
Dataset.Events,
{"start": start, "end": end, "organization_id": org_id, "project_id": project_ids},
query=f"trace:{event.trace_id}",
selected_columns=["id", "issue.id"],
# Don't add timestamp to this orderby as snuba will have to split the time range up and make multiple queries
orderby=["id"],
limit=100,
config=QueryBuilderConfig(auto_fields=False),
)
This seems like a potentially rather heavy query; maybe add a read-through cache?
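A read-through cache for this lookup could look roughly like the sketch below. Everything here is an assumption made for illustration: the in-process dict, the `trace-connected:` key format, the five-minute TTL, and the `fetch` callable standing in for the real query. In Sentry a shared cache backend would be the more likely choice.

```python
import time
from typing import Callable

# Hypothetical in-process store: key -> (expiry on the monotonic clock, value).
_CACHE: dict[str, tuple[float, list[int]]] = {}
TTL_SECONDS = 300.0  # assumed value; tune to how fresh the tab needs to be


def cached_trace_groups(
    trace_id: str,
    fetch: Callable[[str], list[int]],
    now: Callable[[], float] = time.monotonic,
) -> list[int]:
    """Return group ids for a trace, calling `fetch` only on a miss or expiry."""
    key = f"trace-connected:{trace_id}"
    current = now()
    hit = _CACHE.get(key)
    if hit is not None and hit[0] > current:
        return hit[1]
    # Miss or expired entry: read through to the expensive query and store it.
    value = fetch(trace_id)
    _CACHE[key] = (current + TTL_SECONDS, value)
    return value
```

Keying on the trace id rather than the group id means several groups in the same trace would share one cached result.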
org_id = group.project.organization_id
# XXX: Test without a list and validate the data type
project_ids = list(Project.objects.filter(organization_id=org_id).values_list("id", flat=True))
start, end = default_start_end_dates() # Today to 90 days back
+1 to Bartek's comment here. This is a giant query if it's spanning multiple projects and ninety days. I'm not familiar with what you're building, but do you really need to scan 90d of data? Traces rarely span more than a few minutes, and even more rarely span multiple days.
That is to say, you have an event; do you know that event's timestamp? Can you use that?
I will follow up.
We can just use the timestamp of the event as the start date.
Agree that a day in either direction seems like it would cover 99% of useful cases.
You'd want to see the trace-connected issues for downstream issues as well (e.g. the second issue in a trace). I think you could safely limit this to several hours in either direction of the occurrence.
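The narrower window suggested in this thread could be computed along these lines. This is a minimal sketch: `trace_query_window` is a hypothetical helper, and the six-hour default is an assumed reading of "several hours", not a value anyone in the thread settled on.

```python
from datetime import datetime, timedelta, timezone


def trace_query_window(
    event_timestamp: datetime, hours: int = 6
) -> tuple[datetime, datetime]:
    """Return (start, end) centred on the event instead of a 90-day range."""
    delta = timedelta(hours=hours)
    # Look both backwards and forwards so that upstream and downstream
    # errors in the same trace are covered.
    return event_timestamp - delta, event_timestamp + delta


# Example: an event at noon UTC gives a 06:00 to 18:00 window.
start, end = trace_query_window(datetime(2024, 4, 18, 12, 0, tzinfo=timezone.utc))
```

The resulting `start` and `end` would take the place of the values returned by `default_start_end_dates()` in the query parameters.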
Given a group, we look for the recommended event and search for any other errors in the trace. These trace-connected groups will be shown under the Related Issues tab. See the current look (more UI changes will be needed). <img width="933" alt="image" src="https://github.com/getsentry/sentry/assets/44410/b3f774d7-dc71-40b9-b144-65f143b6e981">
**NOTE:** This is not the final UI but a step towards it. This shows the recently added trace-connected errors. This is a follow-up to the backend work: #69237 
Given a group, we look for the recommended event and search for any other errors in the trace.
These trace-connected groups will be shown under the Related Issues tab. See the current look (more UI changes will be needed).