Questions Related to the "Abnormal" Memory Usage of Binaryen #6239
Comments
Atm Binaryen is optimized for short-running processes, like calling Perhaps we could add an option to not intern strings. That would just need a replacement file for
We similarly intern types and never delete them, but we do have a function called
Hey kripken, I think it may not be necessary to implement a new option that supports not interning strings. Instead, as @tlively said in the comment, I wonder if it's acceptable to export a C API to clear these interned types or strings, which could make things easier.
Yes, functions like
I'm not opposed to such an API.
Yeah, it's not safe for any thread to be doing anything with types while another thread is calling this function. But more generally, any types from before the call cannot be used after the call, which is a lifetime issue that code using types normally wouldn't need to think about.
How about using an arena or bump allocator for the interner? I guess generated names could be quite long and grow continuously in "converge" mode, which may cause OOM, especially for wasm32 Binaryen builds. It would be nice to be able to reset the interner state.
You're right, if there are multiple threads then things could get messy. I do not use multiple threads in my fuzzing framework, so a cleanup function is perfectly suitable for me. If you're OK with having this kind of C API, I will try adding one and open a pull request ;).
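To illustrate the single-threaded cleanup pattern and the lifetime caveat discussed above, here is a minimal sketch. `ResettableInterner` and `processModule` are hypothetical names made up for this example; they are not Binaryen APIs, and the real interner is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical toy interner with an explicit reset; NOT Binaryen's API.
class ResettableInterner {
public:
  // Returns a view of the single stored copy of `s`.
  std::string_view intern(std::string_view s) {
    return *table_.emplace(s).first;
  }

  // Frees every stored name. Any view returned by intern() before this
  // call dangles afterwards, so callers must drop them first.
  void clear() { table_.clear(); }

  std::size_t size() const { return table_.size(); }

private:
  // Node-based container: stored strings do not move when the table grows.
  std::unordered_set<std::string> table_;
};

// Hypothetical per-module step: intern fresh label names, report how many
// were stored, then reset so the next module starts from an empty table.
std::size_t processModule(ResettableInterner& interner, int moduleId,
                          int labels) {
  for (int i = 0; i < labels; ++i) {
    interner.intern("block$" + std::to_string(moduleId) + "$" +
                    std::to_string(i));
  }
  std::size_t stored = interner.size();
  interner.clear();  // no interned view may be used past this point
  return stored;
}
```

The reset is only safe because nothing keeps a view across `clear()`, which is the same constraint described above for types: anything obtained before the call cannot be used after it.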
Hey MaxGraey, I think an arena allocator is a pretty nice choice for this case, but I'm not an expert in designing such memory allocators, so it may depend on the developers.
@mobsceneZ, sounds good! I look forward to the PR.
A bump allocator is separately a good idea here. We can use the existing MixedArena which gives exactly what we want. A separate PR with that would also be welcome here.
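For readers unfamiliar with the idea, a bump allocator for interned string storage could look roughly like the sketch below. It deliberately does not use MixedArena; `BumpArena` and its members are hypothetical names for illustration, and the chunk size is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <vector>

// Hypothetical bump allocator for string bytes; NOT Binaryen's MixedArena.
class BumpArena {
public:
  explicit BumpArena(std::size_t chunkSize = 4096) : chunkSize_(chunkSize) {}

  // Copies `s` into arena storage and returns a view of the copy.
  std::string_view store(std::string_view s) {
    if (chunks_.empty() || used_ + s.size() > capacity_) {
      // Start a new chunk, large enough for oversized strings.
      capacity_ = std::max(chunkSize_, s.size());
      chunks_.push_back(std::make_unique<char[]>(capacity_));
      used_ = 0;
    }
    char* dst = chunks_.back().get() + used_;
    if (!s.empty()) {
      std::memcpy(dst, s.data(), s.size());
    }
    used_ += s.size();  // "bump" the offset; nothing is freed individually
    return std::string_view(dst, s.size());
  }

  // Releases all chunks at once; every view returned earlier is invalid.
  void reset() {
    chunks_.clear();
    used_ = 0;
    capacity_ = 0;
  }

  std::size_t chunkCount() const { return chunks_.size(); }

private:
  std::size_t chunkSize_;
  std::size_t capacity_ = 0;  // size of the current chunk
  std::size_t used_ = 0;      // bytes used in the current chunk
  std::vector<std::unique_ptr<char[]>> chunks_;
};
```

Allocation is a pointer bump and deallocation is all-or-nothing, which fits names that are created in bulk and dropped together once a module has been processed.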
Closing because the questions were answered and it looks like there was a good solution found, mentioned in #6298.
Hi developers, I am incorporating Binaryen as a third-party library into my fuzzing framework. In short, Binaryen is used to parse an input Wasm module, do some mutation and generation work, and emit a new testcase.
Everything went fine until I deployed it on a cloud server. The memory usage of my fuzzing framework kept growing, and eventually the kernel's OOM killer killed my fuzzing process.
It really took me a lot of effort to figure out the reason: during the mutation or generation stage, our work generates block/loop names for the BinaryenBlock()/BinaryenLoop() C API, which in turn calls IString::interned to update some static variables (binaryen/src/support/istring.h, line 36 in 6453fd5).
The problem is that fuzzing may generate different block/loop names for different input Wasm modules. These names become useless once we have finished the mutation/generation stage for a specific module, but they are never erased from these static variables! Eventually, maintaining these meaningless names consumes an excessive amount of memory.
So, I wonder if there is any way to erase elements inside these static/global variables on a per-module basis, so that the overall memory usage is affordable.
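The growth pattern can be reproduced with a simplified model. The sketch below is not Binaryen's actual code: `ToyInterner` and `simulateFuzzing` are hypothetical names, and the only assumption carried over from the description above is that interned names live in static storage and are never erased.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical process-wide interner; a simplified model, NOT Binaryen's code.
class ToyInterner {
public:
  // Returns a view of the single stored copy of `s`, inserting it if new.
  static std::string_view intern(std::string_view s) {
    return *table().emplace(s).first;
  }

  static std::size_t size() { return table().size(); }

private:
  // Static table that lives until process exit; entries are never removed.
  static std::unordered_set<std::string>& table() {
    static std::unordered_set<std::string> t;
    return t;
  }
};

// Simulates fuzzing `modules` inputs, each generating `labelsPerModule`
// label names that are unique to that module. Returns the table size.
std::size_t simulateFuzzing(int modules, int labelsPerModule) {
  for (int m = 0; m < modules; ++m) {
    for (int i = 0; i < labelsPerModule; ++i) {
      ToyInterner::intern("block$" + std::to_string(m) + "$" +
                          std::to_string(i));
    }
    // The module is finished here, but its names stay in the table.
  }
  return ToyInterner::size();
}
```

Because every module contributes names that are never looked up again, the table size grows linearly with the number of modules processed, which matches the steadily increasing memory usage observed in a long-running fuzzer.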