-
-
Notifications
You must be signed in to change notification settings - Fork 452
Enable symbolication of native stack frames in ANR events #4061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable symbolication of native stack frames in ANR events #4061
Conversation
eebfcdf
to
b1f58bf
Compare
@lealanko-rt thanks for opening this up, that's amazing! We'll have a closer look at this asap. |
sentry-android-core/src/main/java/io/sentry/android/core/AnrV2Integration.java
Show resolved
Hide resolved
Hi, thanks for working on this. What is the workflow for PRs here? My own preference is to absorb all fixes into appropriate commits to produce a clean history, but I'm not sure if that's feasible with multiple people contributing to the branch... |
Sorry for the late reply. For |
Add a button to trigger ANR by holding a lock too long in native code. This can be used to test native stack frames in ANR events.
Handle offsets and deleted files, recognize "???" as a marker for unknown functions. Use named capturing groups for better readability and editability.
Use the "native" attribute of stack frames to indicate JNI invocation frames, like SentryStackTraceFactory does.
The images are parsed from the build ids and filenames in the thread dump's stack frames.
The instruction addresses of native stack frames in thread dumps are relative to the image file from which the code is loaded, and there are no absolute mapping addresses of images available. So explicitly inform the Sentry server about the correct images by using a relative "addr_mode" attribute. Also add the attribute to the SentryStackFrame class since it was not yet supported by it. The field documentation is converted from event.schema.json in the sentry server repo.
e1ada90
to
cb6f40e
Compare
...-android-core/src/main/java/io/sentry/android/core/internal/threaddump/ThreadDumpParser.java
Outdated
Show resolved
Hide resolved
…nal/threaddump/ThreadDumpParser.java
...-android-core/src/main/java/io/sentry/android/core/internal/threaddump/ThreadDumpParser.java
Outdated
Show resolved
Hide resolved
…nal/threaddump/ThreadDumpParser.java
final String debugId = buildId == null ? null : buildIdToDebugId(buildId); | ||
if (debugId != null) { | ||
if (!debugImages.containsKey(debugId)) { | ||
final DebugImage debugImage = new DebugImage(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markushi I'm reading our dev docs which say that image_addr
and image_size
are required, but apparently we're not setting them here and symbolicator is still able to symbolicate the frames. Shall we amend the docs? Or is there a possibility of wrong symbolication given we don't have all the sufficient information here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I think that's fine because
1.) the native dumps doesn't seem to even provide this information (which makes sense, as the state is not in memory anymore)
2.) we set the address mode to relative for every frame here as well:
frame.setAddrMode("rel:" + debugId);
(link)
Maybe @loewenheim could chime in here as an expert? If all frame addresses are relative, is it fine to provide no image_addr
/ image_size
in debug meta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Image size is always optional (we just conservatively assume it extends to the beginning of the next image if it's not set), and the base address is only required for the absolute address mode. As you say, it's fine to leave it out with the relative address mode.
I'm a bit confused about this, though:
we set the address mode to relative for every frame here as well:
frame.setAddrMode("rel:" + debugId); (link)
The argument of rel
in Symbolicator is not a debug ID, it's the index of the module in the debug_meta/images
list in the event.
https://github.com/getsentry/symbolicator/blob/2e229b8470fabc5051677ef07d1e77217f921bd2/crates/symbolicator-native/src/interface.rs#L548-L556
Is this fixed up at some later point where the debug ID is replaced by an index? Otherwise this won't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I tell a lie, this is taken care of in sentry
: https://github.com/getsentry/sentry/blob/f242899987f9645bd629b80ef2c86361e53819d4/src/sentry/lang/native/processing.py#L371-L389
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright @loewenheim thanks for clearing this out! Do you think we should update the dev docs to specify those are not required in the case of relative mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would say so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lealanko-rt LGTM, thanks for this PR. While it looks like not much has been changed, it has massive impact and we appreciate your contribution! This is dope.
@lealanko-rt I've merged the PR, it'll be available with the next version (likely 8.5.0). Thanks for your contribution again! |
📜 Description
Add instruction address information and debug image information to AnrV2 events. This enables symbolication of native stack frames in stripped libraries.
Before
After
💡 Motivation and Context
Earlier, it was nearly impossible to debug ANRs caused by native code in stripped libraries, since no address information was passed in the events. The introduction of
io.sentry.anr.attach-thread-dumps
(#3791) at least made the raw data available for manual investigation, but that was toilsome.Symbolication of a stack frame requires an instruction address and a way of identifying a debug image. Both of these are available in the Android stack dumps that AnrV2 reads, the latter in the form of a BuildId. By sending this data and annotating the stack frames correctly, the Sentry server is able to symbolicate them.
💚 How did you test it?
Unit tests were modified to test all added features. Moreover, a button for triggering an ANR via native code was added to the sample Android application. The results can be seen in the screenshots above.
📝 Checklist
sendDefaultPII
is enabled.🔮 Next steps
Functionally this is pretty complete, I think. The rendering for managed frames in a hybrid stack (as seen in the second screenshot) is a bit ugly, and omits the module (i.e. package+class), so that could perhaps be improved a little.
A natural continuation of this work would be tackling #3295, which is fundamentally about the same thing, but needs to deserialize a protobuf tombstone instead of a textual dump, so there is not much code to be reused.