Skip to content

Commit 72958ac

Browse files
committed
std: Spin for a global malloc lock on wasm32
There's lots of comments in the code, but the main gist of this commit is that the acquisition of the global malloc lock on the `wasm32-unknown-unknown` target when threads are enabled will not spin on contention rather than block.
1 parent 1999a22 commit 72958ac

File tree

1 file changed

+80
-15
lines changed

1 file changed

+80
-15
lines changed

src/libstd/sys/wasm/alloc.rs

+80-15
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,6 @@ unsafe impl GlobalAlloc for System {
4949

5050
#[cfg(target_feature = "atomics")]
5151
mod lock {
52-
use crate::arch::wasm32;
5352
use crate::sync::atomic::{AtomicI32, Ordering::SeqCst};
5453

5554
static LOCKED: AtomicI32 = AtomicI32::new(0);
@@ -61,27 +60,93 @@ mod lock {
6160
if LOCKED.swap(1, SeqCst) == 0 {
6261
return DropLock
6362
}
64-
unsafe {
65-
let r = wasm32::i32_atomic_wait(
66-
&LOCKED as *const AtomicI32 as *mut i32,
67-
1, // expected value
68-
-1, // timeout
69-
);
70-
debug_assert!(r == 0 || r == 1);
71-
}
63+
// Ok so here's where things get a little depressing. At this point
64+
// in time we need to synchronously acquire a lock, but we're
65+
// contending with some other thread. Typically we'd execute some
66+
// form of `i32.atomic.wait` like so:
67+
//
68+
// unsafe {
69+
// let r = core::arch::wasm32::i32_atomic_wait(
70+
// &LOCKED as *const AtomicI32 as *mut i32,
71+
// 1, // expected value
72+
// -1, // timeout
73+
// );
74+
// debug_assert!(r == 0 || r == 1);
75+
// }
76+
//
77+
// Unfortunately though in doing so we would cause issues for the
78+
// main thread. The main thread in a web browser *cannot ever
79+
// block*, no exceptions. This means that the main thread can't
80+
// actually execute the `i32.atomic.wait` instruction.
81+
//
82+
// As a result if we want to work within the context of browsers we
83+
// need to figure out some sort of allocation scheme for the main
84+
// thread where when there's contention on the global malloc lock we
85+
// do... something.
86+
//
87+
// Possible ideas include:
88+
//
89+
// 1. Attempt to acquire the global lock. If it fails, fall back to
90+
// memory allocation via `memory.grow`. Later just ... somehow
91+
// ... inject this raw page back into the main allocator as it
92+
// gets sliced up over time. This strategy has the downside of
93+
// forcing allocation of a page to happen whenever the main
94+
// thread contents with other threads, which is unfortunate.
95+
//
96+
// 2. Maintain a form of "two level" allocator scheme where the main
97+
// thread has its own allocator. Somehow this allocator would
98+
// also be balanced with a global allocator, not only to have
99+
// allocations cross between threads but also to ensure that the
100+
// two allocators stay "balanced" in terms of free'd memory and
101+
// such. This, however, seems significantly complicated.
102+
//
103+
// Out of a lack of other ideas, the current strategy implemented
104+
// here is to simply spin. Typical spin loop algorithms have some
105+
// form of "hint" here to the CPU that it's what we're doing to
106+
// ensure that the CPU doesn't get too hot, but wasm doesn't have
107+
// such an instruction.
108+
//
109+
// To be clear, spinning here is not a great solution.
110+
// Another thread with the lock may take quite a long time to wake
111+
// up. For example it could be in `memory.grow` or it could be
112+
// evicted from the CPU for a timeslice like 10ms. For these periods
113+
// of time our thread will "helpfully" sit here and eat CPU time
114+
// until it itself is evicted or the lock holder finishes. This
115+
// means we're just burning and wasting CPU time to no one's
116+
// benefit.
117+
//
118+
// Spinning does have the nice properties, though, of being
119+
// semantically correct, being fair to all threads for memory
120+
// allocation, and being simple enough to implement.
121+
//
122+
// This will surely (hopefully) be replaced in the future with a
123+
// real memory allocator that can handle the restriction of the main
124+
// thread.
125+
//
126+
//
127+
// FIXME: We can also possibly add an optimization here to detect
128+
// when a thread is the main thread or not and block on all
129+
// non-main-thread threads. Currently, however, we have no way
130+
// of knowing which wasm thread is on the browser main thread, but
131+
// if we could figure out we could at least somewhat mitigate the
132+
// cost of this spinning.
72133
}
73134
}
74135

75136
impl Drop for DropLock {
76137
fn drop(&mut self) {
77138
let r = LOCKED.swap(0, SeqCst);
78139
debug_assert_eq!(r, 1);
79-
unsafe {
80-
wasm32::atomic_notify(
81-
&LOCKED as *const AtomicI32 as *mut i32,
82-
1, // only one thread
83-
);
84-
}
140+
141+
// Note that due to the above logic we don't actually need to wake
142+
// anyone up, but if we did it'd likely look something like this:
143+
//
144+
// unsafe {
145+
// core::arch::wasm32::atomic_notify(
146+
// &LOCKED as *const AtomicI32 as *mut i32,
147+
// 1, // only one thread
148+
// );
149+
// }
85150
}
86151
}
87152
}

0 commit comments

Comments
 (0)