
Safari: TextDecoder refuses to process data after 2GB causing RangeError (Bad value) and Out of bounds memory access #4471


Open
bes opened this issue Apr 6, 2025 · 4 comments · May be fixed by #4472

bes commented Apr 6, 2025

Describe the Bug

After decoding 2GB of data, Safari's TextDecoder refuses to produce any more strings. This appears to be a security mitigation that works as intended but has unintended consequences.

We are seeing thousands of crashes per week in Sentry from Safari versions 16 through the latest 18.4, reported as Out of bounds memory access and RangeError (Bad value).

There is a bug report on the WebKit bugzilla, but it is unlikely to be fixed for older Safari versions, and there seems to be no hurry to fix it (it was reported in 2024-09).

I initially started another bug report on WebKit bugzilla, but am now reasonably convinced that the TextDecoder is the culprit for our issues too.

Steps to Reproduce

  1. Open any Safari version 16 - 18.4
  2. Open the Safari console on any site
  3. Run the provided script from the bug report
const text = new TextDecoder()
const buff = new ArrayBuffer(100)
for (let i = 0; i < 21474837; i++) {
    text.decode(buff)
}
  4. See that RangeError: Bad value is produced

When this happens in a wasm-bindgen generated program (and it does happen for us, all the time) it's game over for that browser/tab instance.
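
To see roughly where a given browser stops decoding, the reproduction loop can be wrapped so that it reports how many bytes went through a single TextDecoder before the first error. This is only a sketch: `probeDecodeLimit`, `chunkSize` and `totalBytes` are hypothetical names, not part of the WebKit bug report or of wasm-bindgen.

```javascript
// Hypothetical helper (assumption, not from the bug report).
// Feeds `chunkSize`-byte buffers to one TextDecoder until `totalBytes` have
// been decoded, and reports how many bytes succeeded before the first error.
function probeDecodeLimit(chunkSize, totalBytes) {
    const decoder = new TextDecoder();
    const chunk = new Uint8Array(chunkSize);
    let decoded = 0;
    while (decoded < totalBytes) {
        try {
            decoder.decode(chunk);
        } catch (e) {
            // On an affected Safari this is expected to be the RangeError.
            return { decoded, error: e };
        }
        decoded += chunkSize;
    }
    return { decoded, error: null };
}
```

Calling `probeDecodeLimit(100, 2 ** 31 + 100)` in the console mirrors the script above. In a browser without the limitation, `error` should come back as `null`.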

Expected Behavior

I expect Safari to fix this at the source, but I think wasm-bindgen could provide a workaround for Safari specifically, creating a new TextDecoder when decoding is about to pass 2GB of data.

Actual Behavior

After processing 2GB of text, Safari's TextDecoder won't be able to decode anymore.

Additional Context

This isn't an optimal situation for anyone, but I hope that we can reach consensus on what needs to be done here.

wasm-bindgen generates the following code for me:

const lTextDecoder = typeof TextDecoder === 'undefined' ? (0, module.require)('util').TextDecoder : TextDecoder;

let cachedTextDecoder = new lTextDecoder('utf-8', { ignoreBOM: true, fatal: true });

function getStringFromWasm0(ptr, len) {
    ptr = ptr >>> 0;
    return cachedTextDecoder.decode(getUint8ArrayMemory0().subarray(ptr, ptr + len));
}

Some possible workarounds:

  • For every xx_wasm_bg.js create a function to retrieve a TextDecoder, keeping track of the amount of decoded bytes, creating new TextDecoders as needed - new decoder for Safari, keep the same one for other browsers
  • Same as above but no browser checking
  • Make any of the above optional (new command line argument)
  • Instead of providing this for everyone, make it easier to patch this code by some mechanism in wasm-bindgen for robustness (instead of monkey patching)
@bes bes added the bug label Apr 6, 2025

bes commented Apr 6, 2025

Here is a naive attempt

// Original code
const lTextDecoder = typeof TextDecoder === 'undefined' ? (0, module.require)('util').TextDecoder : TextDecoder;

// Original code
let cachedTextDecoder = new lTextDecoder('utf-8', { ignoreBOM: true, fatal: true });
// Original code
cachedTextDecoder.decode();

// New code
// Maximum number of bytes Safari can handle for one TextDecoder is 2GiB (2147483648) but empirically
// it seems to crash a bit before the end, so we remove 1MiB of margin
const MAX_SAFARI_DECODE_BYTES = 2147483648 - 1048576;

// New code
let numBytesDecoded = 0;
function decodeText(ptr, len) {
    numBytesDecoded += len;
    if (numBytesDecoded >= MAX_SAFARI_DECODE_BYTES) {
        console.log("New text decoder", numBytesDecoded);
        cachedTextDecoder = new lTextDecoder('utf-8', { ignoreBOM: true, fatal: true });
        cachedTextDecoder.decode();
        numBytesDecoded = len;
    }
    return cachedTextDecoder.decode(getUint8ArrayMemory0().subarray(ptr, ptr + len));
}

// Original code with one line modified
function getStringFromWasm0(ptr, len) {
    try {
        ptr = ptr >>> 0;
        // NOTE: This line is patched
        return decodeText(ptr, len);
    } catch (e) {
        console.error(e);
        debugger;
        throw e;
    }
}
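
The counting logic can be checked without decoding 2GiB if the limit is made a parameter. A minimal sketch, assuming hypothetical names (`createRotatingDecoder`, `maxBytes`, `rotations`) that are not part of wasm-bindgen, and using the global TextDecoder rather than the `lTextDecoder` fallback:

```javascript
// 2GiB minus a 1MiB margin, the same value as MAX_SAFARI_DECODE_BYTES above.
const DEFAULT_MAX_BYTES = 2 ** 31 - 2 ** 20; // 2146435072

// Hypothetical wrapper (assumption): replaces the underlying TextDecoder
// before the running byte count would reach `maxBytes`.
function createRotatingDecoder(maxBytes = DEFAULT_MAX_BYTES) {
    const make = () => new TextDecoder('utf-8', { ignoreBOM: true, fatal: true });
    let decoder = make();
    let bytesDecoded = 0;
    let rotations = 0;
    return {
        decode(bytes) {
            bytesDecoded += bytes.byteLength;
            if (bytesDecoded >= maxBytes) {
                // Start over with a fresh decoder; count only the current input.
                decoder = make();
                bytesDecoded = bytes.byteLength;
                rotations += 1;
            }
            return decoder.decode(bytes);
        },
        // Exposed only so the rotation behaviour can be observed.
        get rotations() { return rotations; },
    };
}
```

With a small limit such as `createRotatingDecoder(10)`, a few 5-byte inputs are enough to watch the decoder being replaced while the decoded strings stay the same.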

@stephanemagnenat

Thank you so much for working on this! I strongly suggest a work-around that needs no manual user configuration or arcane command line options, because this bug breaks applications after a certain time depending on usage, and I cannot imagine anyone wanting that behavior. Even if the work-around is a bit ugly, I think it is much better than the status quo, and we could remove it in a few years, after Safari has patched the issue and some versions have passed.


bes commented Apr 6, 2025

@stephanemagnenat Thank you for the kind words. Working on a patch.

bes added a commit to bes/wasm-bindgen that referenced this issue Apr 6, 2025
… thrown

`TextDecoder` in Safari has a limitation that causes it to throw `RangeError`
after decoding more than 2GiB of data. This causes long running wasm programs
that need to use `TextDecoder` to crash and start throwing `RuntimeError` with
the message "Out of bounds memory access".

We work around the issue by tracking how much data has been decoded by any
given `TextDecoder`, and replace it when it comes close to 2GiB, deducting
a small margin of 1MiB which has been empirically shown to reduce the
likelihood of miscounting (for unknown reasons) causing a `RangeError` to
be thrown.

This commit also adds stricter handling of the kind of declaration used for
TextDecoder and TextEncoder - TextDecoder always uses let because it needs
to be mutable, and TextEncoder always uses const because it doesn't need to
be mutable.

Fixes rustwasm#4471
bes added a commit to bes/wasm-bindgen that referenced this issue Apr 7, 2025

bes commented Apr 7, 2025

The fix for Safari proper is here: WebKit/WebKit#43753, but it's going to be a while before it is widely adopted.
