-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(derived_code_mappings): Support Java #86280
Conversation
For now, it will run as dry run until we're ready to enable it for a few organizations.
… section When doing local development I noticed some funkiness and it was due to this.
@@ -15,17 +15,24 @@ | |||
# We only care about extensions of files which would show up in stacktraces after symbolication | |||
SUPPORTED_EXTENSIONS = [ | |||
"clj", | |||
"cljc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AbhiPrasad @lcian does it look good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
I've found that for Groovy there's also a few more possible extensions: .gvy
, .gy
, .gsh
even though this is not very important considering the usage of groovy etc.
Sorce: https://blog.mrhaki.com/2011/10/groovy-goodness-default-groovy-script.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should split up this constant to categorize by language, makes it a bit easier to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intend to refactor this.
@@ -214,8 +238,12 @@ def _stacktrace_buckets( | |||
buckets[frame_filename.stack_root].append(frame_filename) | |||
except UnsupportedFrameInfo: | |||
logger.warning("Frame's filepath not supported: %s", frame.get("filename")) | |||
except MissingModuleOrAbsPath: | |||
logger.warning("Do not panic. I'm collecting this data.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once I review the data I will remove reporting it to Sentry.
@@ -507,8 +535,10 @@ def find_roots(frame_filename: FrameInfo, source_path: str) -> tuple[str, str]: | |||
return (stack_root, "") | |||
elif source_path.endswith(stack_path): # "Packaged" logic | |||
source_prefix = source_path.rpartition(stack_path)[0] | |||
package_dir = stack_path.split("/")[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python, it has always been a single word representing the package name (e.g. sentry
) while in Java it is multiple words (e.g. io.sentry.some_app_name
), thus, we need to change it.
Codecov ReportAttention: Patch coverage is ✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## master #86280 +/- ##
========================================
Coverage 87.88% 87.89%
========================================
Files 9721 9723 +2
Lines 551119 551254 +135
Branches 21478 21478
========================================
+ Hits 484372 484503 +131
- Misses 66394 66398 +4
Partials 353 353 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments about differences to the source context implementation (https://github.com/getsentry/symbolicator/blob/450f1d6a8c346405454505ed9ca87e08a6ff34b7/crates/symbolicator-proguard/src/symbolication.rs#L450-L485).
Doesn't mean they have to align 100% but some of them do make sense IMO.
Could we simply err on the side of trying too many strings that could work and worst case not find anything in the repo vs. not trying for something that could be found?
extension = parts[1] | ||
|
||
parts = module.split(".") | ||
if len(parts) <= 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For source context afair we simply don't use module
if it does not contain a .
temp_path = None | ||
# Find the first uppercase letter after a period to identify class name | ||
for i, part in enumerate(parts): | ||
if part and part[0].isupper(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For source context we simply use everything before the last .
instead of looking for upper case letters. You can also use lowercase letters for classes - not sure how common that is tho. You can also use upper case letters for package names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ on the newly added extensions on my side!
@@ -15,17 +15,24 @@ | |||
# We only care about extensions of files which would show up in stacktraces after symbolication | |||
SUPPORTED_EXTENSIONS = [ | |||
"clj", | |||
"cljc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should split up this constant to categorize by language, makes it a bit easier to review.
f"{stack_root}{frame_filename.stack_root}/", | ||
f"{source_prefix}{frame_filename.stack_root}/", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any change to find_roots
makes me nervous since this function effectively calculates the code mapping and was very sensitive to small changes, at least in my experience.
If it passes the tests, I think we should be fine, but I would keep a very close eye on related metrics after this PR lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been making sure the tests are sound. It has not required any changes to the Python packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Apologies if I'm being a little overly paranoid 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM.
I'm uncertain how you're handling file extensions and the lack thereof.
If abs_path
is used we trust the extension that is there.
But if only module
is used, I would assume there's no file extension. Not sure how well that works with the rest of the code here. Just leaving this for you to decide.
raise DoesNotFollowJavaPackageNamingConvention | ||
|
||
# If module has a dot, take everything before the last dot | ||
# com.example.foo.Bar$InnerClass -> com/example/foo/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the code below doesn't have a trailing /
. Maybe we should update the comment to avoid confusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call; thanks!
# If module has a dot, take everything before the last dot | ||
# com.example.foo.Bar$InnerClass -> com/example/foo/ | ||
stack_root = module.rsplit(".", 1)[0].replace(".", "/") | ||
file_path = f"{stack_root}/{abs_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In symbolicator we use only the part of abs_path
before the last .
. But there we always append .jvm
as fake file extension. I'm not entirely sure what works best here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would not work for us either.
I made up these two cases which are not going to help in deriving code mappings. However, we have many other cases that would help, thus, it's okay if these don't work.
pytest.param(
{"module": "foo.bar.Baz", "abs_path": "no_extension"},
"foo/bar",
"foo/bar/Baz", # The path does not use the abs_path
id="invalid_abs_path_no_extension",
),
pytest.param(
{"module": "foo.bar.Baz", "abs_path": "foo$bar"},
"foo/bar",
"foo/bar/Baz", # The path does not use the abs_path
id="invalid_abs_path_dollar_sign",
),
It will run in dry run mode until it's ready.
It will run in dry run mode until it's ready.