Source maps (Refs #1252, Reopens #1267) #1274

int3 · 2013-06-10T06:05:24Z

em++ -g tests/exceptions/typed.cpp -o typed.html should give an effective demonstration of source maps at work.

I'm not entirely sure how to fit a test for the HTML generation portion into the test suite. However, the code for that part is fairly simple, so a test may not be absolutely necessary...

kripken · 2013-06-10T19:20:11Z

Let's add an html file in the test directory. We can't do automatic testing of source maps actually working, but it would hugely help manual bisecting in the future, if all we need to do to verify things manually are to open a web page.

int3 · 2013-06-10T21:54:28Z

Test added.

kripken · 2013-06-17T20:57:52Z

Are you still working on this branch? (The last commit looks like a wip.)

int3 · 2013-06-17T21:03:28Z

Oops, the last two commits were not meant to be pushed. I've unpushed them now... could you have a look to see if this branch now works for you (with Chrome 28)? It'd be nice to have this merged in, and then we can handle optimized source maps later on.

(Should work in Nightly too as long as the path passed to emcc isn't an absolute path; that should work in a few days though, since fitzgen has already pushed a fix to devtools.)

kripken · 2013-06-18T00:23:50Z

The test doesn't pass for me,

$ ./tests/runner.py test_source_map
Running Emscripten tests...
test_source_map (__main__.default) ... (checking sanity from test runner)
INFO     root: (Emscripten: Running sanity checks)
Unable to find original file for /tmp/emscripten_test_default_GR0OER/src.cpp at /tmp/emscripten_test_default_GR0OER/tmp/emscripten_test_default_GR0OER/src.cpp

node.js:201
        throw e; // process.nextTick error, or 'error' event on first tick
              ^
TypeError: Object #<Object> has no method 'appendFileSync'
    at generateMap (/home/alon/Dev/emscripten/tools/sourcemapper.js:105:8)
    at Object.<anonymous> (/home/alon/Dev/emscripten/tools/sourcemapper.js:119:5)
    at Module._compile (module.js:441:26)
    at Object..js (module.js:459:10)
    at Module.load (module.js:348:32)
    at Function._load (module.js:308:12)
    at Array.0 (module.js:479:10)
    at EventEmitter._tickCallback (node.js:192:41)
ERROR

======================================================================
ERROR: test_source_map (__main__.default)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/runner.py", line 9541, in test_source_map
    self.build(src, dirname, src_filename, post_build=(None,post))
  File "./tests/runner.py", line 253, in build
    self.ll_to_js(filename, extra_emscripten_args, post_build)
  File "./tests/runner.py", line 196, in ll_to_js
    if post2: post2(filename + '.o.js')
  File "./tests/runner.py", line 9530, in post
    self.assertIdentical(src, data['sourcesContent'][0])
KeyError: 'sourcesContent'

----------------------------------------------------------------------
Ran 1 test in 0.783s

FAILED (errors=1)

kripken · 2013-06-18T18:24:49Z

Ok, I updated node, now I see

$ ./tests/runner.py test_source_map
Running Emscripten tests...
test_source_map (__main__.default) ... (checking sanity from test runner)
INFO     root: (Emscripten: Running sanity checks)
Unable to find original file for /tmp/emscripten_test_default_KG97MX/src.cpp at /tmp/emscripten_test_default_KG97MX/tmp/emscripten_test_default_KG97MX/src.cpp
ERROR

======================================================================
ERROR: test_source_map (__main__.default)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/runner.py", line 9541, in test_source_map
    self.build(src, dirname, src_filename, post_build=(None,post))
  File "./tests/runner.py", line 253, in build
    self.ll_to_js(filename, extra_emscripten_args, post_build)
  File "./tests/runner.py", line 196, in ll_to_js
    if post2: post2(filename + '.o.js')
  File "./tests/runner.py", line 9530, in post
    self.assertIdentical(src, data['sourcesContent'][0])
KeyError: 'sourcesContent'

----------------------------------------------------------------------
Ran 1 test in 0.727s

FAILED (errors=1)
alon@mozilla-XPS-8300:~/Dev/emscripten$

Tweak behavior of post_build; the `post2` hook now expects a function.

Chrome and FF seem to be able to handle the extra empty lines, but no harm making things more accurate. One thing less to worry about when debugging.

This makes it easier to reason about the remainder...

int3 · 2013-06-19T21:29:22Z

I've rebased the entire branch on incoming, and I've added support for source maps for optimized builds. Basically I hacked the uglify code generator to emit // @line 123 comments (the parser already carried this information.) I only did it for assignment and var declaration statements thus far; I should probably add stuff in for control flow condition checks, but that might be trickier (perhaps block comments instead of line comments need to be used). Nonetheless, the current implementation should be able to map a large fraction of lines, and will be accurate under AST transforms.

(Also, as discussed offline, the error messages in the above two comments were caused by my usage of newer Node features. We've now upgraded the minimum required version to v0.8, so that should be fine now.)

Edit: Hunting down some possible line number inaccuracies at the moment.

int3 · 2013-06-20T06:13:10Z

Update on current status of the 'inaccuracies': It turns out that we don't parse the LLVM line numbers correctly if the instruction spans more than two lines. In particular,

invoke void @__cxa_throw(i8* %exception, i8* bitcast ({ i8*, i8*, i32, i8* }* @_ZTIP4Test to i8*), i8* null) noreturn
        to label %unreachable unwind label %lpad3, !dbg !30 ; [debug line = 12:0]

__cxa_throw ends up being unmapped in the final source. In some cases this works out fine because __cxa_allocate usually appears as the previous instruction, and it usually shares the same line number in the original source, and browsers match unmapped lines with the closest previously mapped line. However code can be inlined between __cxa_allocate and __cxa_throw, and this results in rather wrong results.

Assuming that line number metadata is always added to the end of the instruction, we might be able to fix this simply by assigning every LLVM instruction a line number based on the line where it ends, rather than the line where it begins.

kripken · 2013-06-20T17:17:35Z

Interesting, but I am not too worried about slightly incorrect llvm debug info - optimizations like inlining often make debugging trickier anyhow ;)

I am more concerned with our making sure that there are no regressions on the test suite with such a change to uglify, and also no slowdowns as well, since this is a bug chunk of our compilation time. We should measure a few of the larger tests like poppler before and after this change.

kripken · 2013-06-22T00:51:29Z

(i see you just added more code, please comment in the bug here to ping me when this is ready for review)

Line numbering of exceptions now work (better) in optimized mode.

NodeWithToken is a string-like type produced by the parser during 'embed tokens' mode, which allows us to track line numbers.

Makes for an easier-to-review pull request.

* Now works on FF Nightly * Test does not run unless explicitly requested

kripken · 2013-06-25T01:40:53Z

So to be sure I follow, does the isArray change get rid of half the previous slowdown? And the rest is == vs ===, which we still suffer from, but it was countered by optimizing other uses of ==/===?

int3 · 2013-06-25T02:07:11Z

Yeah, the isArray change in calculating relevant was responsible for about half of the slowdown. That was the only single-line change that I profiled on its own. The rest of the speedups were obtained after changing == to === and using isArray to determine whether the result returned by pre was a new node.

kripken · 2013-06-25T16:49:31Z

Ok, then I am not sure what the change is that you said you prefer to not do. Can you elaborate a little more on what it would look like, and what speedup we could expect from it?

int3 · 2013-06-25T20:00:13Z

Right now in non-debug mode our tokens are of the form

ast = ['num', 1]

In debug mode they get replaced by

ast = [NodeWithToken('num', {start: {line: 10, column: 20}, end: {line: ..., column: ...}}), 1]

So we have to compare node[0] == 'num' instead of node[0] === 'num' because the former invokes toString on the operands.

If instead we write a pass that does something like

ast.start = ast[0].start
ast.end = ast[0].end
ast[0] = ast[0].toString()

Then we would be able to use strict comparison on ast[0].

All that said, I'm not entirely confident that we would see a significant increase. Yesterday I converted the other equals operators in js-optimizer.js to strict comparison where possible, and didn't see a measurable difference. I suspect my addition of isArray at those two points in traverse had the most impact on performance, because they reduced branching in the most-invoked function.

kripken · 2013-06-25T23:12:07Z

What is performance like when you revert the isArray optimization? That should tell us how much the rest is.

int3 · 2013-06-26T00:17:24Z

I did one better and replaced all the == with === for the type comparisons. (Even without the extra hacky pass I described above, this is perfectly fine as long as we're not trying to make source maps.) Compilation for sqlite went from 16.78 to 13.68 (!). That's about an 18% reduction. (I did a diff of the output to make sure we weren't accidentally missing out an entire pass or something...) Well, it does seem worth it to write that extra pass.

kripken · 2013-06-26T00:52:08Z

Thanks! Yeah, I really don't think we can afford to give up an 18% speedup. And the extra pass would only be done on debug builds?

int3 · 2013-06-26T00:55:07Z

Yup, the extra pass would only be needed if source maps are being produced.

This allows us to use strict comparisons on the node type.

int3 · 2013-06-26T08:57:58Z

Turns out it wasn't so difficult after all. I'd just overestimated the difficulty after a week debugging all those edge cases. Anyway, instead of writing an extra pass I modified the parser slightly to add the line numbers to the nodes themselves. Running the tests now; will let you know their results tomorrow morning.

Edit: asm2g passed.

This allows us to preserve more line numbers when debugging.

kripken · 2013-06-26T16:38:33Z

Great! What is performance like now?

int3 · 2013-06-26T16:48:06Z

I got about the same performance on sqlite that I mentioned previously -- ~13.68s. I haven't re-measured poppler yet.

kripken · 2013-06-26T16:49:24Z

Ok, great. I'll do some testing too.

kripken · 2013-06-26T17:52:36Z

There is a small conflict now in this branch, I fixed a js optimizer crash yesterday. trivial though

kripken · 2013-06-26T18:01:31Z

I'm running tests on this now, no need to rebase it for that conflict.

kripken · 2013-06-26T18:36:21Z

tools/js-optimizer.js

@@ -2031,6 +2035,7 @@ function eliminate(ast, memSafe) {
    // examine body and note locals
    var hasSwitch = false;
    traverse(func, function(node, type) {
+      if (debug && type) type = type.toString();


why is this needed here and nowhere else?

kripken · 2013-06-26T18:43:27Z

I am about to merge this. But as followups, I had one question above, and also: I still see two isArray checks, are they still needed/helpful?

kripken · 2013-06-26T19:23:40Z

Merged, thanks! Let's not forget the two followups though.

One other thing i noticed was not addressed was changing --debug to be consistent with the current format. Was there a reason to not do that, or just forgotten?

int3 · 2013-06-26T23:40:02Z

I profiled the isArray stuff and it looks like using the typeof checks (node !== null && typeof node === 'object') is a 3% win on sqlite -- 15.27 versus 14.84. Using isArray is more semantically accurate though, the typeof check takes a short cut and only verifies that we have an object, because all objects we receive are arrays.

The if (debug …) checks were leftover cruft.

The reason for the --debug flag was addressed in this comment. Let me know if you would still like me to change it.

Thanks for reviewing and merging!

kripken · 2013-07-08T18:01:02Z

Source mapping mentions "unable to find original file" on some things in the test suite, like ./tests/runner.py test_zlib - any idea why?

int3 · 2013-07-08T21:05:26Z

I think what is happening is that the static libraries are getting built in some directory, then the resulting bitcode files are getting copied over into a different directory before being linked. The sourcemapper script tries to look for the original source files at link time and thus cannot find them.

~~One possible solution is to always write out absolute paths in the // @line comments.~~ Nevermind, this wouldn't work if the bitcode files are the ones being copied...

int3 · 2013-07-08T22:07:43Z

Yeah, I don't think there's a great way to fix this. Ideally we'd want clang to output absolute paths in its LLVM debug info, but the only time it does that is if we pass in an absolute source path when invoking it. We could have a post-bitcode-compilation pass that rewrote the bitcode output to use absolute paths, but that seems like too much work.

It looks like we're generating source maps when testing default because -O0 sets debug_level to 4. How about making -O0 correspond to -g3 instead? This would turn off the annoying error messages and make our tests run more quickly (because they can use more than 1 core.)

kripken · 2013-07-15T17:24:02Z

Ok, got it.

Makes sense about -O0, but we do want for emcc -g src.cpp to emit full debug info. Not sure what the best compromise here is, one core is indeed not great in the test suite.

int3 added 13 commits June 19, 2013 01:22

Commit the source-map library.

86e672a

Implement basic source maps. Closes emscripten-core#1252.

747f74e

Add check for keep_llvm_debug + minor touchups.

165befa

Add test for source maps.

2f16717

Tweak behavior of post_build; the `post2` hook now expects a function.

Ensure line numbers are the same in HTML.

2c0dcd9

Chrome and FF seem to be able to handle the extra empty lines, but no harm making things more accurate. One thing less to worry about when debugging.

Fix source map line numbering for inline scripts.

af52a9a

Add test for HTML source map generation.

be5d45f

Lay the groundwork for optimized source maps.

7656f94

Delete uglify code that was only needed for compression.

0c3cb93

This makes it easier to reason about the remainder...

Map source lines for assignment statements.

0f73e28

Move source mapping files into their own folder.

54a59be

Update the source-map library.

050335f

Implement source maps for optimized builds.

d680f5e

Make sure the line numbers sync up.

486413c

int3 added 6 commits June 22, 2013 01:23

Source maps should only be activated via the --map flag.

cfe4eb5

Fix line numbering for invoke instructions.

5459b35

Line numbering of exceptions now work (better) in optimized mode.

Make test_debug actually fail when things go wrong.

99fa57e

Make optimizer handle both strings and string-like type objects.

7212198

NodeWithToken is a string-like type produced by the parser during 'embed tokens' mode, which allows us to track line numbers.

Put uglify back the way it was.

4eef4be

Makes for an easier-to-review pull request.

Improvements to test_html_source_map.

d50e7b4

* Now works on FF Nightly * Test does not run unless explicitly requested

Get rid of stale comment.

0c19e40

Move line numbers to the AST node itself.

5664470

This allows us to use strict comparisons on the node type.

int3 added 3 commits June 26, 2013 02:44

Strict compare all the things!

e3a37fc

We can just compare for strings here.

e08502d

Reuse nodes where possible.

5383aa8

This allows us to preserve more line numbers when debugging.

kripken reviewed Jun 26, 2013
View reviewed changes

kripken merged commit 5383aa8 into emscripten-core:incoming Jun 26, 2013

int3 deleted the source-maps branch July 8, 2013 21:19

gagern mentioned this pull request Jun 9, 2015

Update uglify #3523

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Source maps (Refs #1252, Reopens #1267) #1274

Source maps (Refs #1252, Reopens #1267) #1274

int3 commented Jun 10, 2013

kripken commented Jun 10, 2013

int3 commented Jun 10, 2013

kripken commented Jun 17, 2013

int3 commented Jun 17, 2013

kripken commented Jun 18, 2013

kripken commented Jun 18, 2013

int3 commented Jun 19, 2013

int3 commented Jun 20, 2013

kripken commented Jun 20, 2013

kripken commented Jun 22, 2013

kripken commented Jun 25, 2013

int3 commented Jun 25, 2013

kripken commented Jun 25, 2013

int3 commented Jun 25, 2013

kripken commented Jun 25, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken Jun 26, 2013

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jul 8, 2013

int3 commented Jul 8, 2013

int3 commented Jul 8, 2013

kripken commented Jul 15, 2013

Source maps (Refs #1252, Reopens #1267) #1274

Source maps (Refs #1252, Reopens #1267) #1274

Conversation

int3 commented Jun 10, 2013

kripken commented Jun 10, 2013

int3 commented Jun 10, 2013

kripken commented Jun 17, 2013

int3 commented Jun 17, 2013

kripken commented Jun 18, 2013

kripken commented Jun 18, 2013

int3 commented Jun 19, 2013

int3 commented Jun 20, 2013

kripken commented Jun 20, 2013

kripken commented Jun 22, 2013

kripken commented Jun 25, 2013

int3 commented Jun 25, 2013

kripken commented Jun 25, 2013

int3 commented Jun 25, 2013

kripken commented Jun 25, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

kripken Jun 26, 2013

Choose a reason for hiding this comment

kripken commented Jun 26, 2013

kripken commented Jun 26, 2013

int3 commented Jun 26, 2013

kripken commented Jul 8, 2013

int3 commented Jul 8, 2013

int3 commented Jul 8, 2013

kripken commented Jul 15, 2013