Add emsymbolizer #16095

dschuff · 2022-01-22T02:17:38Z

Emsymbolizer is a tool for symbolizing a binary, i.e. showing the file/line or symbol info for a code address.
As described in #16094, there are several ways to do this with emscripten.
The first PR is for item 1, using llvm-symbolizer with DWARF.
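
As a rough illustration of the "llvm-symbolizer with DWARF" approach, the sketch below builds an `llvm-symbolizer` command line for one address and parses a `file:line:column` location such as the `test_dwarf.c:6:3` checked in the test. This is not the code from this PR: the helper names (`symbolizer_command`, `parse_location`, `symbolize`) are hypothetical, and it assumes `llvm-symbolizer` is on `PATH` and that the address needs no adjustment before lookup.

```python
import re
import subprocess

# Matches a "file:line:column" location, e.g. "test_dwarf.c:6:3".
LOCATION_RE = re.compile(r'^(?P<file>.+):(?P<line>\d+):(?P<column>\d+)$')


def symbolizer_command(wasm_file, address, symbolizer='llvm-symbolizer'):
    """Build the argument list for symbolizing one address (hypothetical helper)."""
    return [symbolizer, '--obj=' + wasm_file, address]


def parse_location(text):
    """Split "file:line:column" into (file, line, column), or return None."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        return None
    return match['file'], int(match['line']), int(match['column'])


def symbolize(wasm_file, address):
    """Run the symbolizer and return its raw stdout (requires llvm-symbolizer)."""
    result = subprocess.run(symbolizer_command(wasm_file, address),
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout
```

Keeping command construction and output parsing separate from the subprocess call means both can be checked without a real binary or an installed toolchain.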

tools/webassembly.py

This is a WIP for #16094. The first PR will be for item 1, using llvm-symbolizer with DWARF.

dschuff · 2022-01-26T22:44:11Z

OK, PTAL. On a scale of 🤔 to 🤮, what do you think of the test?

kripken

Code lgtm. For the test, yeah, that seems at risk of being brittle... maybe we can add some slack, see comment.

kripken · 2022-01-26T22:48:14Z

tests/test_other.py

+          stdout=PIPE).stdout
+
+    # Check a location in foo(), not inlined.
+    self.assertIn('test_dwarf.c:6:3', get_addr('0x101'))


We could do a range here perhaps? 0x101 ± 10 seems safe and good enough.

The problem is that this test is the other way around... i.e. we enter the address and check the line number. Come to think of it, it actually does have some slack in the positive direction, since the line record covers 5 bytes of instructions.

The thing that made me worry a little less is that the code looks pretty minimal and not too subject to random changes in e.g. optimizers. Reordering of functions, removal of __wasm_call_ctors, changes in imports/exports, etc. could still break it.

Interesting, what's this about 5 bytes? Is the line section limited to that resolution?

What I was suggesting is something like this, can't it work?

    for i in range(-10, 10):
      if 'test_dwarf.c:6:3' in get_addr(str(0x101 + i)):
        break
    else:
      self.fail('not found')

There are just 5 bytes of instructions that are considered to be part of line 6 (the call, and the drop of the return value).
I guess that suggestion could work, although I feel it would cover small perturbations in the code generation but probably not changes like the ones I mentioned above (well, I guess it would survive removing or moving __wasm_call_ctors). Another option would be to disassemble the binary and find particular instructions, which would be precise but is potentially almost as error-prone as the code.
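
To make the two ideas in this thread concrete, here is a minimal, self-contained sketch. It models a line table in which one record covers a run of addresses (the 5 bytes attributed to line 6), then searches a window around the expected address, as in the suggested loop. The table contents other than `0x101` and `test_dwarf.c:6:3` are made up for illustration, `get_addr` here is a stand-in for the test helper rather than the real tool, and `find_nearby` is a hypothetical name.

```python
import bisect

# Hypothetical line table: (start_address, location), sorted by address.
# Each record covers addresses from its start up to the next record's start.
LINE_TABLE = [
    (0x0F0, 'test_dwarf.c:5:0'),   # made-up neighbouring record
    (0x101, 'test_dwarf.c:6:3'),   # covers 0x101..0x105, i.e. 5 bytes
    (0x106, 'test_dwarf.c:7:1'),   # made-up neighbouring record
]
STARTS = [start for start, _ in LINE_TABLE]


def get_addr(address):
    """Stand-in for the test helper: map an address string to a location."""
    addr = int(address, 0)  # accepts '0x101' as well as decimal strings
    index = bisect.bisect_right(STARTS, addr) - 1
    return LINE_TABLE[index][1] if index >= 0 else '??:0:0'


def find_nearby(expected, center, slack=10):
    """Return the first address within +/- slack of center that resolves to
    the expected location, or None if no address in the window does."""
    for offset in range(-slack, slack + 1):
        if expected in get_addr(hex(center + offset)):
            return center + offset
    return None
```

With this table, a lookup anywhere in `0x101`..`0x105` resolves to line 6 (the slack in the positive direction mentioned above), and the windowed search still finds the record if the code shifts by a few bytes, but not if it moves further than the window.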

Disassembling sounds like overkill to me. Sgtm to land this and see if it's an issue in practice.

OK, I expect I'll be thinking more about this as I write more tests for the various ways to get line/symbol info.

dschuff · 2022-01-27T00:54:03Z

Weird, all the tests passed but the status doesn't seem to have propagated to GH. I'm just going to merge manually.

sbc100 · 2022-01-27T02:07:04Z

tests/test_other.py

+
+    def get_addr(address):
+      return self.run_process(
+          [PYTHON, path_from_root('emsymbolizer.py'), 'test_dwarf.wasm', address],


Can you add this to tools/create_entry_points and then run it so that you can avoid calling via python like this?

sbc100 reviewed Jan 22, 2022

tools/webassembly.py (outdated, resolved)

dschuff added 2 commits January 25, 2022 15:34

Add emsymbolizer

60e9900

This is a WIP for #16094. The first PR will be for item 1, using llvm-symbolizer with DWARF.

address suggestion, add a test

c953a49

dschuff force-pushed the emsymbolize branch from f357d9d to c953a49 Compare January 26, 2022 02:00

dschuff added 2 commits January 25, 2022 18:03

flake8

a5d6634

clean up comments

e22164b

dschuff marked this pull request as ready for review January 26, 2022 22:43

dschuff requested review from sbc100 and kripken January 26, 2022 22:43

flake8 again

90d2b17

kripken approved these changes Jan 26, 2022

dschuff enabled auto-merge (squash) January 27, 2022 00:36

dschuff disabled auto-merge January 27, 2022 00:54

dschuff merged commit db77c76 into main Jan 27, 2022

dschuff deleted the emsymbolize branch January 27, 2022 00:54

sbc100 reviewed Jan 27, 2022
