Wrapper script for symbolizing addresses #16094

dschuff · 2022-01-22T02:12:30Z

There are lots of use cases for getting the line number and/or symbol name from a code address, and there are several existing ways it can be done:

1. Given a wasm (or object) file with a DWARF .debug_line section, llvm-symbolizer can get symbol and file/line information.
2. Given a wasm file and a source map, we can get file/line information from the source map (I don't think we currently have a sourcemap parser that goes in this direction; we'd want to add one).
3. Given a wasm file with a name section, we can get symbol information for code addresses (but not file/line information, and not for data addresses).
4. Given a wasm file and an emscripten symbol map, we can get symbol information for code addresses (but not file/line information, and not for data addresses).
5. (IIRC) Given an object file (or a wasm file with a symbol table), llvm-nm can get symbol information (but not file/line information).

Given that 3 is wasm-specific, and 2 and 4 are emscripten-specific, it might be warranted to have an emscripten-specific wrapper that can just figure out which, if any, of these sources of information are available, and print the information.
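To make the "figure out which sources are available" step concrete, here is a minimal Python sketch that scans a wasm binary for the custom sections that would back the DWARF, source map, and name-section approaches. It is not the actual emsymbolizer code: the function names and the `"dwarf"`/`"sourcemap"`/`"names"` labels are made up for illustration, and it assumes the source map is referenced through a `sourceMappingURL` custom section.

```python
# Hypothetical sketch; names and labels are illustrative, not emsymbolizer's.
WASM_MAGIC = b"\0asm"


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def custom_section_names(data):
    """Return the set of custom section names found in a wasm binary."""
    if data[:4] != WASM_MAGIC:
        raise ValueError("not a wasm binary")
    names = set()
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(data):
        section_id = data[pos]
        size, body_start = read_uleb128(data, pos + 1)
        if section_id == 0:  # custom sections start with a length-prefixed name
            name_len, name_start = read_uleb128(data, body_start)
            name = data[name_start:name_start + name_len]
            names.add(name.decode("utf-8"))
        pos = body_start + size
    return names


def available_sources(data):
    """List the symbolization sources present, richest first."""
    names = custom_section_names(data)
    sources = []
    if ".debug_line" in names:
        sources.append("dwarf")
    if "sourceMappingURL" in names:  # assumed section name for source maps
        sources.append("sourcemap")
    if "name" in names:
        sources.append("names")
    return sources
```

A wrapper could call `available_sources()` on the input file and dispatch to the first entry, falling back to the next one if that backend fails.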

Some of these (at least the LLVM-based ones) currently require section offsets rather than the file offsets printed by the engines in stack traces and the like. That's an orthogonal problem to this one (e.g. we might want to fix the llvm tools and/or internal interfaces to use file offsets instead). But either way this script can do any necessary conversions, and we can adjust it if we make changes to LLVM, and it will have utility even aside from that.
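The conversion mentioned above could look roughly like the following Python sketch. It assumes the LLVM tools want an offset relative to the start of the code section's contents (the bytes after the section id and size); that is an assumption worth checking against the LLVM version in use. `code_section_start` and `file_to_section_offset` are hypothetical helper names.

```python
# Hypothetical sketch of a file-offset to code-section-offset conversion.
CODE_SECTION_ID = 10


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def code_section_start(data):
    """Return the file offset where the code section's contents begin."""
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(data):
        section_id = data[pos]
        size, body_start = read_uleb128(data, pos + 1)
        if section_id == CODE_SECTION_ID:
            return body_start
        pos = body_start + size
    raise ValueError("no code section found")


def file_to_section_offset(data, file_offset):
    """Convert an engine-reported file offset to a code section offset."""
    return file_offset - code_section_start(data)
```

With this in place, an address taken from an engine stack trace can be passed through `file_to_section_offset()` before being handed to a tool that expects section offsets.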

This is a WIP for #16094. The first PR will be for item 1, using llvm-symbolizer with DWARF.

Emsymbolizer is a tool for symbolizing a binary, i.e. showing the file/line or symbol info for a code address. As described in #16094, there are several ways to do this with emscripten. The first PR is for item 1, using llvm-symbolizer with DWARF.

sbc100 · 2022-08-15T18:29:42Z

Can this be closed now that the tool exists?

dschuff · 2022-08-15T19:45:29Z

We currently have 1 and 2 implemented. 4 seems less important (because the browser understands name sections already), and I don't even know what an end-developer use case for 5 would be (mostly I just included it for completeness). But that does leave 3, which maybe we should have before we consider the tool initial-feature-complete?

dschuff · 2024-02-27T22:01:28Z

#21367 and llvm/llvm-project#82083 implement item 3.
Item 5 is also partially working: for object files, llvm-nm and llvm-objdump print symbol addresses for both code (as code section offsets) and data (as data section offsets). This is different from the behavior for linked files, where both are printed as file offsets. More problematically, these address spaces overlap because of wasm's Harvard architecture, so llvm-symbolizer (and therefore emsymbolizer) doesn't handle them correctly.
I'm not sure how much of a problem that really is; emsymbolizer is mostly used for binary size attribution for linked files rather than to analyze object files, so I don't have a near-term plan to fix it.

The only other possible thing to add is item 4, emscripten symbol map support. I'm not sure how necessary this is given that source maps are strictly more powerful than symbol maps, and it's probably best for most (all?) users to use them instead; I don't know of any use cases where symbol maps would be better. So I also don't currently have a plan to implement that.

dschuff · 2024-02-27T22:02:38Z

I think I'm going to go ahead and close this. If someone wants to request more features for emsymbolizer, they can reopen this, or open a new bug if it's something not listed here.

dschuff added a commit that referenced this issue Jan 22, 2022

Add emsymbolizer

f357d9d

This is a WIP for #16094. The first PR will be for item 1, using llvm-symbolizer with DWARF.

dschuff mentioned this issue Jan 22, 2022

Add emsymbolizer #16095

Merged

dschuff added a commit that referenced this issue Jan 26, 2022

Add emsymbolizer

60e9900

This is a WIP for #16094. The first PR will be for item 1, using llvm-symbolizer with DWARF.

dschuff closed this as completed Feb 27, 2024
