feat(derived_code_mappings): Support Java #86280

Merged
armenzg merged 22 commits into master from feat/java-support/auto_source/armenzg on Mar 5, 2025

@armenzg armenzg commented Mar 4, 2025

It will run in dry run mode until it's ready.

@armenzg armenzg self-assigned this Mar 4, 2025
@github-actions github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Mar 4, 2025
@@ -15,17 +15,24 @@
# We only care about extensions of files which would show up in stacktraces after symbolication
SUPPORTED_EXTENSIONS = [
    "clj",
    "cljc",
Member Author

@AbhiPrasad @lcian does it look good to you?

Member

Yep.
I've found that Groovy also has a few more possible extensions: .gvy, .gy, and .gsh, though this is not very important considering Groovy's usage.
Source: https://blog.mrhaki.com/2011/10/groovy-goodness-default-groovy-script.html

Member

We should split this constant up by language; that would make it a bit easier to review.

Member Author

I intend to refactor this.
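
One possible shape for the refactor discussed above, as a rough sketch only: group the extensions by language and derive the flat list from that grouping. The name `EXTENSIONS_BY_LANGUAGE`, the helper `is_supported`, and the grouping itself are hypothetical and are not taken from the PR; only the extensions mentioned in this thread are listed.

```python
# Hypothetical grouping for illustration; not the actual Sentry constant.
EXTENSIONS_BY_LANGUAGE: dict[str, list[str]] = {
    "clojure": ["clj", "cljc"],
    "groovy": ["groovy", "gvy", "gy", "gsh"],
}

# The flat list is derived, so reviewers only need to read the grouped dict.
SUPPORTED_EXTENSIONS: list[str] = sorted(
    ext for exts in EXTENSIONS_BY_LANGUAGE.values() for ext in exts
)


def is_supported(filename: str) -> bool:
    """Return True if the filename ends in one of the supported extensions."""
    _, dot, ext = filename.rpartition(".")
    # `dot` is empty when the filename has no "." at all.
    return bool(dot) and ext in SUPPORTED_EXTENSIONS
```

With this layout, adding a language is a one-line change to the dict, and the diff shows which language each extension belongs to.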

@@ -214,8 +238,12 @@ def _stacktrace_buckets(
    buckets[frame_filename.stack_root].append(frame_filename)
except UnsupportedFrameInfo:
    logger.warning("Frame's filepath not supported: %s", frame.get("filename"))
except MissingModuleOrAbsPath:
    logger.warning("Do not panic. I'm collecting this data.")
Member Author

Once I review the data I will remove reporting it to Sentry.
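
For readers without the full file, here is a minimal, self-contained sketch of the bucketing pattern the hunk above touches. The two exception names come from the diff; `stack_root_of` and `bucket_frames` are hypothetical stand-ins, and the rule used to pick a stack root is an assumption, not the PR's actual logic.

```python
import logging
from collections import defaultdict

logger = logging.getLogger(__name__)


class UnsupportedFrameInfo(Exception):
    """Raised when a frame cannot be turned into a usable path."""


class MissingModuleOrAbsPath(Exception):
    """Raised when a Java-style frame lacks `module` or `abs_path`."""


def stack_root_of(frame: dict) -> str:
    # Hypothetical stand-in for the real frame parsing.
    module, abs_path = frame.get("module"), frame.get("abs_path")
    if not module or not abs_path:
        raise MissingModuleOrAbsPath
    if "." not in module:
        raise UnsupportedFrameInfo
    return module.rsplit(".", 1)[0].replace(".", "/")


def bucket_frames(frames: list[dict]) -> dict[str, list[dict]]:
    """Group frames by stack root, logging and skipping unusable frames."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for frame in frames:
        try:
            buckets[stack_root_of(frame)].append(frame)
        except UnsupportedFrameInfo:
            logger.warning("Frame's filepath not supported: %s", frame.get("filename"))
        except MissingModuleOrAbsPath:
            logger.warning("Frame is missing module or abs_path")
    return dict(buckets)
```

The point of the pattern is that a single bad frame is logged and skipped rather than aborting the whole stack trace.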

@@ -507,8 +535,10 @@ def find_roots(frame_filename: FrameInfo, source_path: str) -> tuple[str, str]:
    return (stack_root, "")
elif source_path.endswith(stack_path):  # "Packaged" logic
    source_prefix = source_path.rpartition(stack_path)[0]
    package_dir = stack_path.split("/")[0]
Member Author

In Python, the package name has always been a single word (e.g. sentry), while in Java it spans multiple words (e.g. io.sentry.some_app_name), so we need to change this.
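
A small sketch of why the first path segment is not enough for Java. The helper names and the example paths are hypothetical; the second helper mirrors the `rsplit`/`replace` expression that appears later in this PR.

```python
def first_segment(stack_path: str) -> str:
    """Old assumption: the package is the first path segment."""
    return stack_path.split("/")[0]


def package_dir_from_module(module: str) -> str:
    """Java-style: every dotted component except the final class name."""
    return module.rsplit(".", 1)[0].replace(".", "/")


# Python: one segment identifies the package.
print(first_segment("sentry/models/release.py"))  # sentry

# Java: the first segment alone ("io") is too coarse.
print(first_segment("io/sentry/some_app_name/Foo.kt"))  # io
print(package_dir_from_module("io.sentry.some_app_name.Foo"))  # io/sentry/some_app_name
```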

@armenzg armenzg marked this pull request as ready for review March 4, 2025 14:03
@armenzg armenzg requested review from a team as code owners March 4, 2025 14:03

codecov bot commented Mar 4, 2025

Codecov Report

Attention: Patch coverage is 94.52055% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
...try/issues/auto_source_code_config/code_mapping.py | 86.66% | 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #86280    +/-   ##
========================================
  Coverage   87.88%   87.89%            
========================================
  Files        9721     9723     +2     
  Lines      551119   551254   +135     
  Branches    21478    21478            
========================================
+ Hits       484372   484503   +131     
- Misses      66394    66398     +4     
  Partials      353      353            

Member

@adinauer adinauer left a comment

Left some comments about differences to the source context implementation (https://github.com/getsentry/symbolicator/blob/450f1d6a8c346405454505ed9ca87e08a6ff34b7/crates/symbolicator-proguard/src/symbolication.rs#L450-L485).

Doesn't mean they have to align 100% but some of them do make sense IMO.

Could we simply err on the side of trying too many strings that could work and worst case not find anything in the repo vs. not trying for something that could be found?

extension = parts[1]

parts = module.split(".")
if len(parts) <= 1:
Member

For source context, AFAIR, we simply don't use module if it does not contain a ".".

temp_path = None
# Find the first uppercase letter after a period to identify class name
for i, part in enumerate(parts):
    if part and part[0].isupper():
Member

For source context we simply use everything before the last "." instead of looking for uppercase letters. Class names can also start with lowercase letters (not sure how common that is, though), and package names can contain uppercase letters.
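
To make the difference between the two heuristics concrete, here is a hedged side-by-side sketch. Both function names are hypothetical; the first follows the uppercase loop quoted in the diff, and the second follows the "everything before the last dot" rule described in this comment. Neither is claimed to match the merged code exactly.

```python
from typing import Optional


def package_by_uppercase(module: str) -> Optional[str]:
    """Treat the first component starting with an uppercase letter as the class."""
    parts = module.split(".")
    if len(parts) <= 1:
        return None
    for i, part in enumerate(parts):
        if part and part[0].isupper():
            return "/".join(parts[:i])
    return None  # no uppercase component found


def package_by_last_dot(module: str) -> Optional[str]:
    """Treat everything before the last dot as the package."""
    if "." not in module:
        return None
    return module.rsplit(".", 1)[0].replace(".", "/")


for module in ("com.example.foo.Bar", "com.example.foo.bar", "com.Example.foo.Bar"):
    print(module, package_by_uppercase(module), package_by_last_dot(module))
```

The two agree on conventionally named modules and diverge on a lowercase class name (the uppercase heuristic finds nothing) and on an uppercase package component (the uppercase heuristic stops too early).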

Member

@AbhiPrasad AbhiPrasad left a comment

✅ on the newly added extensions on my side!

@armenzg armenzg requested a review from MichaelSun48 March 4, 2025 18:38
Comment on lines +538 to +540
f"{stack_root}{frame_filename.stack_root}/",
f"{source_prefix}{frame_filename.stack_root}/",
)
Member

@MichaelSun48 MichaelSun48 Mar 4, 2025

Any change to find_roots makes me nervous since this function effectively calculates the code mapping and was very sensitive to small changes, at least in my experience.

If it passes the tests, I think we should be fine, but I would keep a very close eye on related metrics after this PR lands.

Member Author

I have been making sure the tests are sound. It has not required any changes to the Python packages.

Member

Great! Apologies if I'm being a little overly paranoid 😅

Member

@adinauer adinauer left a comment

Mostly LGTM.

I'm uncertain how you're handling file extensions and the lack thereof.

If abs_path is used we trust the extension that is there.

But if only module is used, I would assume there's no file extension. Not sure how well that works with the rest of the code here. Just leaving this for you to decide.

raise DoesNotFollowJavaPackageNamingConvention

# If module has a dot, take everything before the last dot
# com.example.foo.Bar$InnerClass -> com/example/foo/
Member

Looks like the code below doesn't have a trailing /. Maybe we should update the comment to avoid confusion?

Member Author

Good call; thanks!

# If module has a dot, take everything before the last dot
# com.example.foo.Bar$InnerClass -> com/example/foo/
stack_root = module.rsplit(".", 1)[0].replace(".", "/")
file_path = f"{stack_root}/{abs_path}"
Member

In symbolicator we use only the part of abs_path before the last ".". But there we always append .jvm as a fake file extension. I'm not entirely sure what works best here.

Member Author

It would not work for us either.

I made up these two cases, which are not going to help in deriving code mappings. However, we have many other cases that would help, so it's okay if these don't work.

            pytest.param(
                {"module": "foo.bar.Baz", "abs_path": "no_extension"},
                "foo/bar",
                "foo/bar/Baz",  # The path does not use the abs_path
                id="invalid_abs_path_no_extension",
            ),
            pytest.param(
                {"module": "foo.bar.Baz", "abs_path": "foo$bar"},
                "foo/bar",
                "foo/bar/Baz",  # The path does not use the abs_path
                id="invalid_abs_path_dollar_sign",
            ),
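
The behaviour these two cases describe can be sketched roughly as follows. This is an assumption-laden illustration, not the merged implementation: the function name `derive_java_path` is hypothetical, a plain `ValueError` stands in for the PR's `DoesNotFollowJavaPackageNamingConvention`, and the rule for deciding that an abs_path is unusable (no "." or a "$" present) is inferred only from the two test ids above.

```python
def derive_java_path(module: str, abs_path: str) -> tuple[str, str]:
    """Return (stack_root, file_path) for a Java-style frame."""
    if "." not in module:
        # Stand-in for DoesNotFollowJavaPackageNamingConvention.
        raise ValueError(f"module has no package: {module!r}")

    # com.example.foo.Bar$InnerClass -> com/example/foo
    package, class_name = module.rsplit(".", 1)
    stack_root = package.replace(".", "/")

    if "." in abs_path and "$" not in abs_path:
        # abs_path looks like a file name, so trust its extension.
        file_path = f"{stack_root}/{abs_path}"
    else:
        # Fall back to the outer class name taken from the module.
        outer_class = class_name.split("$")[0]
        file_path = f"{stack_root}/{outer_class}"
    return stack_root, file_path


print(derive_java_path("foo.bar.Baz", "no_extension"))  # ('foo/bar', 'foo/bar/Baz')
print(derive_java_path("foo.bar.Baz", "foo$bar"))  # ('foo/bar', 'foo/bar/Baz')
```

In both fallback cases the resulting path has no file extension, which is why they are unlikely to match a file in the repository.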

@armenzg armenzg merged commit b00b520 into master Mar 5, 2025
49 checks passed
@armenzg armenzg deleted the feat/java-support/auto_source/armenzg branch March 5, 2025 12:32
philipphofmann pushed a commit that referenced this pull request Mar 6, 2025
It will run in dry run mode until it's ready.
@github-actions github-actions bot locked and limited conversation to collaborators Mar 22, 2025