crash when dlopen/dlclose rust-made *.so multi-times on android/ohos aarch64 #135815

Open
Rust401 opened this issue Jan 21, 2025 · 8 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-discussion Category: Discussion or questions that doesn't represent real issues. O-android Operating system: Android

Comments

@Rust401

Rust401 commented Jan 21, 2025

crash when dlopen/dlclose rust-made *.so multi-times on android/ohos aarch64

Scenario:

  1. Compile a Rust source file into a shared object (.so) file.
  2. A C++ file uses dlopen to load the .so file and retrieve the target symbol.
  3. Call the Rust function within the loaded .so.
  4. Use dlclose to unload the .so file.
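
For illustration, the Rust side of this scenario might look like the minimal sketch below. The linked reproduction code is not shown here, so the function name `rust_entry` and its return value are assumptions made for this example, not the reporter's actual code.

```rust
// Built as a shared object, e.g. with `crate-type = ["cdylib"]` in Cargo.toml.
// `rust_entry` is a hypothetical symbol name chosen for this sketch.
// On toolchains older than Rust 1.82, write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
pub extern "C" fn rust_entry() -> i32 {
    // Per the report, printing through std is enough to create
    // thread-local state inside the loaded library.
    println!("hello from Rust");
    0
}
```

The C++ host would then look up `rust_entry` with dlsym after dlopen, call it, and finally call dlclose on the handle.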

Related issue: #134820

reproduction code

Known Information:

  1. There is no issue on regular arm64 Linux, but disassembly reveals differences involving the use of TPIDR_EL0.
  2. The main difference between the OHOS/Android and arm64 Linux target options is has_thread_local=true and tls-model=emulated, but these two compilation options do not seem to take effect when compiling the shared object (we suspect that the standard library needs to be recompiled with these options). The implementation of println is part of the Rust standard library.
  3. Reviewing the code shows that the tls-model=emulated option changes rustc's code generation through the LLVM backend: it selects either TPIDR_EL0 or emulated TLS.

This seems to be a bug in the rustc implementation. In the combination of Android and OHOS target platform options, during the println process, a thread-local variable is implicitly created, and the thread-local resources initialized through lazy_init are not properly released when dlclose is called.
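
As a rough illustration of the mechanism described above, the sketch below uses only the public `thread_local!` macro. It is not std's internal implementation, and the names `Guard`, `DROPS` and `drops_after_thread_exit` are made up for this example. It shows that a thread-local is created lazily on first access and that its destructor is tied to thread exit, not to library unload.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts how many thread-local values have been destroyed.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

thread_local! {
    // Initialized lazily, on first access from each thread.
    static GUARD: Guard = Guard;
}

/// Touches the thread-local on a fresh thread and returns how many
/// destructors ran between spawning that thread and joining it.
pub fn drops_after_thread_exit() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    thread::spawn(|| {
        GUARD.with(|_| {}); // first access triggers the lazy initialization
    })
    .join()
    .unwrap();
    DROPS.load(Ordering::SeqCst) - before
}
```

Because the cleanup is registered with the thread runtime, a thread that is still alive when dlclose is called (such as the host's main thread) still has that registration pending.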

@Rust401 Rust401 added the C-bug Category: This is a bug. label Jan 21, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 21, 2025
@bjorn3
Member

bjorn3 commented Jan 21, 2025

What is the difference between the problem in this issue and the one in #134820?

@Rust401
Author

Rust401 commented Jan 24, 2025

> What is the difference between the problem in this issue and the one in #134820?

#134820 compiles the Rust source to a .a, which is then linked into a C++ .so.
#135815 compiles the Rust source directly to a .so.

Additionally, we have provided some new information that we think is very important.
The main difference between cargo build --target aarch64-linux-android and cargo build --target aarch64-unknown-linux-gnu is the has_thread_local and tls-model options.

For Android/OHOS, has_thread_local=false and tls-model=emulated, which differs from aarch64 Linux.

tls-model=emulated affects the LLVM backend.
has_thread_local affects the thread-local behavior in std.

But we have still not figured out the whole path.

@Rust401
Author

Rust401 commented Jan 26, 2025

has_thread_local=true plays a key role.
Set export RUSTFLAGS="-Z has_thread_local=true" and use -Z build-std for cargo build.

With that, the issue is fixed and the annoying pthread_key_create is no longer called repeatedly.

@bjorn3
Member

bjorn3 commented Jan 26, 2025

Most libc implementations I know of silently refuse to unload shared libraries when they make use of native TLS support as -Zhas-thread-local would use. Fair chance bionic is one of those libc implementations too.

@lolbinarycat lolbinarycat added O-android Operating system: Android A-linkage Area: linking into static, shared libraries and binaries labels Feb 3, 2025
@Rust401
Author

Rust401 commented Feb 10, 2025

> Most libc implementations I know of silently refuse to unload shared libraries when they make use of native TLS support as -Zhas-thread-local would use. Fair chance bionic is one of those libc implementations too.

However, in the Rust code compiled with the Android and OHOS build options, simply calling println! implicitly generates a thread-local key. This seems to suggest that calling Rust code via dlopen/dlclose is not supported on Android/OHOS, which clearly seems unreasonable.

@bjorn3
Member

bjorn3 commented Feb 10, 2025

dlopen is supported, dlclose is not. Maybe we should just set the ELF flag to prevent shared libraries from getting unloaded on dlclose.
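
The comment does not name the flag. Assuming it refers to the DF_1_NODELETE dynamic flag, which GNU-style linkers set with `-z nodelete`, a crate could opt in today from a build script. The sketch below is a hypothetical `build.rs`; the `-Wl,` prefix assumes the linker is invoked through a C compiler driver such as the NDK's clang.

```rust
// build.rs (hypothetical): mark the cdylib so that dlclose does not unload it.
fn main() {
    // Assumption: the "ELF flag" is DF_1_NODELETE, requested via `-z nodelete`.
    // The directive only applies to cdylib outputs of this package.
    println!("cargo:rustc-cdylib-link-arg=-Wl,-z,nodelete");
}
```

With the flag set, dlclose still drops the reference count but the library stays mapped, so thread-local cleanup registered by the library keeps pointing at valid code.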

@Rust401
Author

Rust401 commented Feb 12, 2025

Is there a way to support dlclose for .so files containing Rust code on Android/OHOS platforms? This is important because in terminal scenarios there is a strict requirement for persistent memory, and many .so usage patterns involve dlclose and dlopen.

@bjorn3
Member

bjorn3 commented Feb 12, 2025

A dlopen of the same shared library without dlclose in between will not increase memory usage. It will just increase the reference count of the already loaded copy of the shared library.

dlclose is unfortunately simply not supported when thread-local storage is used. And even without thread-local storage, there are various restrictions you need to follow to safely support dlclose. For example, you may not spawn a background thread that outlives the dlclose, you may not register an atexit handler or similar from your shared library, you may not keep any function pointers alive that could be called after dlclose, and more.
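
To make the background-thread restriction concrete, here is a minimal sketch of a library that exposes explicit start and shutdown entry points, so the host can join the worker before it calls dlclose. The names `plugin_start` and `plugin_shutdown` are hypothetical. This addresses only the background-thread restriction and does not make dlclose safe when thread-local storage is involved.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;
use std::thread::{self, JoinHandle};
use std::time::Duration;

// Signals the worker to stop; the handle lets shutdown wait for it.
static STOP: AtomicBool = AtomicBool::new(false);
static WORKER: Mutex<Option<JoinHandle<()>>> = Mutex::new(None);

/// Starts the background worker if it is not already running.
/// On toolchains older than Rust 1.82, write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
pub extern "C" fn plugin_start() {
    let mut slot = WORKER.lock().unwrap();
    if slot.is_none() {
        STOP.store(false, Ordering::SeqCst);
        *slot = Some(thread::spawn(|| {
            // Placeholder work loop: poll the stop flag until told to exit.
            while !STOP.load(Ordering::SeqCst) {
                thread::sleep(Duration::from_millis(1));
            }
        }));
    }
}

/// Stops and joins the worker. Returns 1 if a worker was running and has
/// been joined, 0 if there was nothing to stop.
#[unsafe(no_mangle)]
pub extern "C" fn plugin_shutdown() -> i32 {
    STOP.store(true, Ordering::SeqCst);
    let handle = WORKER.lock().unwrap().take();
    match handle {
        Some(h) => {
            h.join().unwrap();
            1
        }
        None => 0,
    }
}
```

The host would call `plugin_shutdown` and only then dlclose, so that no code from the library is still executing on another thread at unload time.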

Running your code in a separate process which gets terminated at the point where you currently try to use dlclose is likely much easier to implement.

@jieyouxu jieyouxu added C-discussion Category: Discussion or questions that doesn't represent real issues. and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 10, 2025
5 participants