Commit ee31c97

Avoid octal escapes, use raw bytes instead
There was a bug where the hex-to-octal conversion would also match inside \\x01, i.e. an escaped backslash followed by the literal text x01. But support for octal escape sequences is optional in any case, and forbidden in strict mode, so we should avoid using them.

As per the ECMAScript 5.1 spec, any source character (which may be any Unicode code point) can be used inside a string literal, with the exception of the backslash, line terminators and the quoting character. So we do just that: dump the raw bytes into the string literal and escape only what needs to be escaped.

There is one catch, though: sources are usually encoded in UTF-8, in which case we can't plug in raw bytes directly, but have to use UTF-8 sequences for the range \x80 through \xff. This may cause problems if the source file is NOT interpreted as UTF-8.
1 parent fedf9c4 commit ee31c97
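
The bug described in the commit message can be reproduced in isolation. The sketch below reuses the regular expression and the lambda from the removed lines; the wrapper name `old_hex_to_octal` is hypothetical, and the input is written out as repr-style text so the snippet also runs on Python 3 (the original code ran on Python 2).

```python
import re

def old_hex_to_octal(s):
    # Hypothetical wrapper around the two removed lines of the diff.
    hex_to_octal = lambda x: '\\%o' % int(x.group(1), 16)
    return re.sub(r'\\x([0-1][0-9A-Fa-f])(?:(?=[^0-9])|$)', hex_to_octal, s)

# Intended case: the escape \x01 becomes the octal escape \1.
print(old_hex_to_octal(r"'\x01'"))

# Buggy case: repr-style text for a backslash followed by the characters x01.
# The regex matches the second backslash plus x01, so the payload is corrupted.
print(old_hex_to_octal(r"'\\x01'"))
```

In the second call the text `x01` after an escaped backslash is rewritten to `1`, so the resulting JavaScript literal no longer decodes to the original bytes.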

1 file changed, with 5 additions and 4 deletions

emcc

@@ -1337,10 +1337,11 @@ try:
     return '';
   membytes = ''.join(map(chr, membytes))
   if not memory_init_file:
-    s = repr(membytes)
-    hex_to_octal = lambda x: '\\%o' % int(x.group(1), 16)
-    s = re.sub(r'\\x([0-1][0-9A-Fa-f])(?:(?=[^0-9])|$)', hex_to_octal, s)
-    return 'var memoryInitializer = %s;' % s
+    s = membytes
+    s = s.replace('\\', '\\\\').replace("'", "\\'")
+    s = s.replace('\n', '\\n').replace('\r', '\\r')
+    s = s.decode('latin1').encode('utf8')
+    return "var memoryInitializer = '%s';" % s
   open(memfile, 'wb').write(membytes)
   if DEBUG:
     # Copy into temp dir as well, so can be run there too
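
For readers on Python 3, the added lines can be approximated as follows. This is a minimal sketch rather than the code in emcc: the function name `js_string_literal` is hypothetical, the input is assumed to be a `bytes` object, and the Latin-1 decode is done first so the replacements operate on text.

```python
def js_string_literal(data: bytes) -> str:
    # Latin-1 maps each byte 0x00-0xff to the code point of the same value.
    s = data.decode('latin1')
    # Escape the backslash first, then the quoting character.
    s = s.replace('\\', '\\\\').replace("'", "\\'")
    # Line terminators cannot appear raw in a string literal. U+2028 and
    # U+2029 cannot occur here because every code point is below 0x100.
    s = s.replace('\n', '\\n').replace('\r', '\\r')
    return "var memoryInitializer = '%s';" % s

line = js_string_literal(b"a\\b'c\n\x01\xff")
# When the JavaScript file is written as UTF-8, byte 0xff becomes the
# two-byte sequence 0xc3 0xbf, which is the catch the commit message names.
encoded = line.encode('utf8')
```

Only four characters are escaped; everything else, including control characters such as 0x01, is emitted as-is.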