Skip to content

Run TLS destructors for wasm32-wasip1-threads make Node.js stuck #137510

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Brooooooklyn opened this issue Feb 24, 2025 · 20 comments
Open

Run TLS destructors for wasm32-wasip1-threads make Node.js stuck #137510

Brooooooklyn opened this issue Feb 24, 2025 · 20 comments
Labels
A-thread-locals Area: Thread local storage (TLS) C-bug Category: This is a bug. O-wasi Operating system: Wasi, Webassembly System Interface O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@Brooooooklyn
Copy link

Brooooooklyn commented Feb 24, 2025

I tried this code:

https://github.com/Brooooooklyn/rust-wasi-thread-local-stuck

use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
  static THREAD_LOCAL_DATA: RefCell<HashMap<u32, u32>> = RefCell::new(HashMap::new());
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn set_thread_local_data(key: u32, value: u32) {
    THREAD_LOCAL_DATA.with(|data| {
        data.borrow_mut().insert(key, value);
    });
}

I expected to see this happen: Program run and exit normally

Instead, this happened: hang forever

(edit): See toyobayashi/emnapi#136 (comment) for where and why

Meta

rustc --version --verbose:

rustc 1.85.0 (4d91de4e4 2025-02-17)
binary: rustc
commit-hash: 4d91de4e48198da2e33413efdcd9cd2cc0c46688
commit-date: 2025-02-17
host: aarch64-apple-darwin
release: 1.85.0
LLVM version: 19.1.7

Background

The NAPI-RS project enables the compilation of Rust projects to the wasm32-wasip1-threads target, allowing them to run in Node.js. However, since Node.js doesn’t natively support WASI threads, we simulate a threaded environment using emnapi. This setup worked effectively until version 1.85.0. Many projects, such as rolldown and rspack, have been successfully compiled to WebAssembly targets to run seamlessly on platforms like StackBlitz and in web browsers.

@Brooooooklyn Brooooooklyn added the C-bug Category: This is a bug. label Feb 24, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 24, 2025
@Noratrieb
Copy link
Member

Noratrieb commented Feb 24, 2025

When users compile to wasm32-wasip1-threads, they don't necessarily enable target_feature = "atomics", but the rust-std of wasm32-wasip1-threads always enables atomics, which causes some issues: #137510.

#133472 (comment)

The wasm32-wasip1-threads target enables the atomics feature by default, so it's always used (unless someone unsoundly turns it off).

options.features = "+atomics,+bulk-memory,+mutable-globals".into();

@Noratrieb Noratrieb added O-wasm Target: WASM (WebAssembly), http://webassembly.org/ A-thread-locals Area: Thread local storage (TLS) T-libs Relevant to the library team, which will review and decide on the PR/issue. O-wasi Operating system: Wasi, Webassembly System Interface and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 24, 2025
@Brooooooklyn
Copy link
Author

The wasm32-wasip1-threads target enables the atomics feature by default, so it's always used (unless someone unsoundly turns it off).

I misunderstood, there should be other reasons for this. In the Node.js context, I suspect it's because libc::pthread_key_delete calls an Atomics.wait API, causing the JavaScript code to hang on the main thread.

@surban
Copy link
Contributor

surban commented Feb 24, 2025

Could you reproduce this without node.js, i.e. minimal example running on wasmtime?

@Brooooooklyn
Copy link
Author

Could you reproduce this without node.js, i.e. minimal example running on wasmtime?

@surban it works fine on wasmtime, I think this might be an issue specific to Node.js/Browser

@surban
Copy link
Contributor

surban commented Feb 24, 2025

Could you reproduce this without node.js, i.e. minimal example running on wasmtime?

@surban it works fine on wasmtime, I think this might be an issue specific to Node.js/Browser

I don't know how exactly emnapi is emulating threads, but since this works fine with native threads on wasmtime, I suspect the issue is with emnapi and not Rust.

@Brooooooklyn
Copy link
Author

I don't know how exactly emnapi is emulating threads, but since this works fine with native threads on wasmtime, I suspect the issue is with emnapi and not Rust.

The logic stuck before going into emnapi, there is no emnapi and NAPI-RS in my repro: https://github.com/Brooooooklyn/rust-wasi-thread-local-stuck

use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
  static THREAD_LOCAL_DATA: RefCell<HashMap<u32, u32>> = RefCell::new(HashMap::new());
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn set_thread_local_data(key: u32, value: u32) {
    THREAD_LOCAL_DATA.with(|data| {
        data.borrow_mut().insert(key, value);
    });
}
import { WASI } from 'node:wasi'
import fs from 'node:fs'
const wasi = new WASI({
  version: 'preview1',
  env: process.env,
})

const wasm = await WebAssembly.compile(
  new Uint8Array(fs.readFileSync('target/wasm32-wasip1-threads/debug/wasi_thread_local.wasm')),
)

const instance = await WebAssembly.instantiate(wasm, {
  ...wasi.getImportObject(),
  env: {
    memory: new WebAssembly.Memory({ initial: 1024, maximum: 65536, shared: true }),
  },
})

wasi.initialize(instance);

console.log(instance.exports.set_thread_local_data(1, 2))

@Brooooooklyn
Copy link
Author

I've updated my repro and added browser examples, it stuck in FireFox & Chrome

@surban
Copy link
Contributor

surban commented Feb 24, 2025

You are targeting wasm32-wasip1-threads, which assumes that the runtime supports atomics and threads. However, are you sure that node.js indeed supports WASI threads and atomics? Even if no threads are spawned atomics support is essential.

@Brooooooklyn
Copy link
Author

At least all modern browsers support Atomics and Threads: https://caniuse.com/wasm-threads

And it stucks on Chrome too, my Chrome version: 133.0.6943.127 (Official Build) (arm64), you can try it in my repro

@surban
Copy link
Contributor

surban commented Feb 24, 2025

At least all modern browsers support Atomics and Threads: https://caniuse.com/wasm-threads

And it stucks on Chrome too, my Chrome version: 133.0.6943.127 (Official Build) (arm64), you can try it in my repro

No web browser currently has native support for WASI and the polyfill you are using does not seem to have support for threads (see list of implemented syscalls at the end of the page).

At the moment I am not convinced that your environment has a complaint implementation of atomics and threads. The fact that your example is working fine using wasmtime also suggests that the issue is with node.js and the WASI polyfill you are using.

@surban
Copy link
Contributor

surban commented Feb 24, 2025

Also, one more thing that struck me as odd:

You seem to be calling functions inside a WASI module. This is only supported when the module is a WASI reactor, however, by default Rust builds a WASI command. Your can change this by setting your RUSTFLAGS accordingly in .cargo/config.toml:

[target.wasm32-wasip1-threads]
rustflags = ["-Z", "wasi-exec-model=reactor"]

@Brooooooklyn
Copy link
Author

the issue is with node.js and the WASI polyfill you are using

From what I found, the program stuck at the std::sys::thread_local::guard::enable

Image

And there is no polyfill in Node.js at all.

Even if no threads are spawned atomics support is essential.

I'm sure Node.js wasm runtime supports Atomics.

Brooooooklyn added a commit to napi-rs/napi-rs that referenced this issue Feb 24, 2025
@Brooooooklyn
Copy link
Author

After breakpoint debugging in @toyobayashi, we determined that the issue was caused by not setting wasi-exec-model=reactor. In Node.js and the browser, after using nightly and adding rustflags = ["-Z", "wasi-exec-model=reactor"], the issue disappeared.

My next question is, if the wasi reactor exec model is a nightly feature, should #133472 be stabilized along with the wasi reactor exec model? Other than Node.js, other wasi runtimes don't have this problem because they don't yet support importing and running wasm as a module, right?

@alexcrichton
Copy link
Member

Do you have intuition for why wasi-exec-model=reactor is fixing the issue? I don't doubt that it changes things to where your application is working but I'd be curious to dig into the exact difference as to why. The wasm32-wasip1-threads target is relatively experimental still from an ecosystem perspective so this might be indicative of a deeper bug worth fixing. Ideally this would be fixable without needing to stabilize more flags (which is its own can of worms)

@toyobayashi
Copy link

toyobayashi commented Feb 25, 2025

Do you have intuition for why wasi-exec-model=reactor is fixing the issue?

@alexcrichton Context: toyobayashi/emnapi#136 (comment)

__wasm_init_tp is not called in _initialize https://github.com/WebAssembly/wasi-libc/blob/e9524a0980b9bb6bb92e87a41ed1055bdda5bb86/libc-bottom-half/crt/crt1-reactor.c#L16

cause infinit loop here: https://github.com/WebAssembly/wasi-libc/blob/e9524a0980b9bb6bb92e87a41ed1055bdda5bb86/libc-top-half/musl/src/thread/pthread_key_create.c#L59-L60

Other than Node.js, other wasi runtimes don't have this problem because they don't yet support importing and running wasm as a module, right?

@Brooooooklyn wasmtime also does try to get instance.exports._initialize and invoke it

https://github.com/bytecodealliance/wasmtime/blob/1f24222f89f6ae243bb5968d7ab8f70f3fe5bae5/src/commands/run.rs#L440-L446

As for why there is no problem, my guess is that the implementation of the native platform makes the code running branch different from that on Node and the browser, so there is no infinite loop problem. The specific details also need further investigation.

#137510 (comment)

@surban
Copy link
Contributor

surban commented Feb 25, 2025

My guess would be that for a WASI command the CRT performs cleanup after _start is executed and releases pthread objects etc.

The spec says:

Command instances may assume that they will be called from the environment at most once. Command instances may assume that none of their exports are accessed outside the duration of that call.

I don't see a way to legally call functions inside a WASI module without setting its type to reactor.

@toyobayashi
Copy link

toyobayashi commented Feb 26, 2025

@Brooooooklyn Update:

Using wasmtime to run your example also get stuck.

cargo build --target wasm32-wasip1-threads

wasmtime run \
         -S threads=y \
         -W threads=y \
         --invoke set_thread_local_data \
         ./target/wasm32-wasip1-threads/debug/wasi_thread_local.wasm \
         1 2

The result is the same no matter it is in Node.js or native. I think the problem is clear now (#137510 (comment))

if the wasi reactor exec model is a nightly feature, should #133472 be stabilized along with the wasi reactor exec model?

Back to this question, for WASI reactor, since wasi-libc has been using _initialize to initialize threads and call ctors for a long time, when will Rust stabilize wasi-exec-model=reactor?

@alexcrichton
Copy link
Member

Ah ok yeah that makes sense to me, some de-initializing of data in wasi-libc probably isn't expected to be run from multiple threads or then re-run on de-initialized state.

FWIW I don't think there's a path forward for stabilizing -Zwasi-exec-model=... at this time because WASIp2 and beyond don't really work with reactors or commands, there's just components and exports and such. In that sense the "best fix" here is to probably update the various arguments used on wasm32-wasip1-threads or update wasi-libc to work in more situations. In essence someone needs to sit down and map out how best the linker, toolchain, and wasi-libc all interact with each other here.

The problem is that wasm32-wasip1-threads is effectively a dead-end in the design space of threads-on-wasm. Nowadays development efforts are focused on shared-everything-threads instead which provides a way forward for threads-and-wasm. Given that it's unlikely that someone will be able to sit down and balance all the concerns I mentioned above since most folks would instead work on shared-everything-threads instead.

@surban
Copy link
Contributor

surban commented Feb 27, 2025

FWIW I don't think there's a path forward for stabilizing -Zwasi-exec-model=... at this time because WASIp2 and beyond don't really work with reactors or commands, there's just components and exports and such.

I understand that wasi-exec-model is a dead-end for WASIp2 and beyond, but I don't see how this blocks stabilization of wasi-exec-model for WASIp1, since this is a stable target and definitely has its use. A possibility would be to rename the flag to wasip1-exec-model to make it clear what it supports.

@alexcrichton
Copy link
Member

It's true that it doesn't block anything per-se, but it'll make it harder to drum up support and folks willing to champion stabilization. I personally wouldn't block such an effort of course, but I agree that renaming it would probably be best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-thread-locals Area: Thread local storage (TLS) C-bug Category: This is a bug. O-wasi Operating system: Wasi, Webassembly System Interface O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants