Single-Phase Init Extension Module Init Functions Still Run in Isolated Interpreters #117953
Comments
This is a collection of very basic cleanups I've pulled out of gh-118116. It is mostly renaming variables and moving a couple bits of code in functionally equivalent ways.
These are cleanups I've pulled out of gh-118116. Mostly, this change moves code around to align with some future changes and to improve clarity a little. There is one very small change in behavior: we now add the module to the per-interpreter caches after updating the global state, rather than before.
…inglephase or Not (gh-118193) This change makes other upcoming changes simpler.
This helps with a later change that splits up _PyImport_LoadDynamicModuleWithSpec().
…-118250) A couple of refleaks slipped through in pythongh-118194. This takes care of them. (AKA _Py_ext_module_loader_info_init() does not steal references.)
Continuing here from the closed PR. @ericsnowcurrently wrote:
Thanks! AFAICS, the current state is that IMO, current PRs should simply avoid
Yes, looks like this tracking will be needed in order to use this for more than asserts.
Basically, I've turned most of _PyImport_LoadDynamicModuleWithSpec() into two new functions (_PyImport_GetModInitFunc() and _PyImport_RunModInitFunc()) and moved the rest of it out into _imp_create_dynamic_impl(). There shouldn't be any changes in behavior. This change makes some future changes simpler. This is particularly relevant to potentially calling each module init function in the main interpreter first. Thus the critical part of the PR is the addition of _PyImport_RunModInitFunc(), which is strictly focused on running the init func and validating the result. A later PR will take it a step farther by capturing error information rather than raising exceptions. FWIW, this change also helps readers by clarifying a bit more about what happens when an extension/builtin module is imported.
Right. Tracking it on the module def would require hiding the bit(s) in one of the existing fields. That's doable, but I'd do that only if there wasn't another good place to stick the info.
…nsions (gh-118204) This change will make some later changes simpler. It also brings more consistent behavior and lower maintenance costs.
…chinery (gh-118205) This change will make some later changes simpler.
This change will make some later changes simpler.
We have only been tracking each module's PyModuleDef. However, there are some problems with that. For example, in some cases we load single-phase init extension modules from def->m_base.m_init or def->m_base.m_copy, but if multiple modules share a def then we can end up with unexpected behavior.
With this change, we track the following:
* PyModuleDef (same as before)
* for some modules, its init function or a copy of its __dict__, but specific to that module
* whether it is a builtin/core module or a "dynamic" extension
* the interpreter (ID) that owns the cached __dict__ (only if cached)
This also makes it easier to remember the module's kind (e.g. single-phase init) and if loading it previously failed, which I'm doing separately.
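To make the shared-def problem concrete, here is a minimal Python model of it. This is only a sketch: ModuleDef, CachedExtension, and every field name below are hypothetical stand-ins chosen for illustration, not the actual C structures used by the import machinery.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModuleDef:
    """Toy stand-in for PyModuleDef; m_init mimics def->m_base.m_init."""
    name: str
    m_init: Optional[Callable[[], dict]] = None


@dataclass
class CachedExtension:
    """Hypothetical per-module cache record (all names are illustrative)."""
    moddef: ModuleDef
    init: Optional[Callable[[], dict]] = None  # module-specific init function
    dict_copy: Optional[dict] = None           # module-specific copy of __dict__
    is_builtin: bool = False                   # builtin/core vs. "dynamic" extension
    owner_interp_id: Optional[int] = None      # set only when dict_copy is cached


def init_spam() -> dict:
    return {"__name__": "spam"}


def init_eggs() -> dict:
    return {"__name__": "eggs"}


# Two modules share one def. Storing the init function on the def means
# the second assignment overwrites the first.
shared = ModuleDef("shared")
shared.m_init = init_spam
shared.m_init = init_eggs
reloaded_via_def = shared.m_init()  # reloading "spam" now yields "eggs"

# Tracking per module keeps each module's own init function.
cache = {
    "spam": CachedExtension(shared, init=init_spam),
    "eggs": CachedExtension(shared, init=init_eggs),
}
reloaded_via_cache = cache["spam"].init()
```

In this toy model the per-module record still points at the shared def, so nothing that depends on the def is lost; only the module-specific data moves off of it.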
…h-118157) This change makes sure all extension/builtin modules have their init function run first by the main interpreter before proceeding with import in the original interpreter (main or otherwise). This means when the import of a single-phase init module fails in an isolated subinterpreter, it won't tie any global state/callbacks to the subinterpreter.
The core change has landed, but there are a few small follow-up things to wrap up.
…ort Machinery (pythongh-118205) This change will make some later changes simpler.
…-118206) This change will make some later changes simpler.
…ongh-118532) We have only been tracking each module's PyModuleDef. However, there are some problems with that. For example, in some cases we load single-phase init extension modules from def->m_base.m_init or def->m_base.m_copy, but if multiple modules share a def then we can end up with unexpected behavior.
With this change, we track the following:
* PyModuleDef (same as before)
* for some modules, its init function or a copy of its __dict__, but specific to that module
* whether it is a builtin/core module or a "dynamic" extension
* the interpreter (ID) that owns the cached __dict__ (only if cached)
This also makes it easier to remember the module's kind (e.g. single-phase init) and if loading it previously failed, which I'm doing separately.
…ythongh-118684) This ensures the kind is always either _Py_ext_module_kind_SINGLEPHASE or _Py_ext_module_kind_MULTIPHASE.
…irst (pythongh-118157) This change makes sure all extension/builtin modules have their init function run first by the main interpreter before proceeding with import in the original interpreter (main or otherwise). This means when the import of a single-phase init module fails in an isolated subinterpreter, it won't tie any global state/callbacks to the subinterpreter.
FYI, in gh-118157 I disabled (expand for an example)
The tests should be re-enabled and made to work before this issue is considered resolved.
…nGH-120689) (cherry picked from commit 1035fe0) Co-authored-by: Nice Zombies <[email protected]>
…20707) (cherry picked from commit 1035fe0, AKA gh-120689) Co-authored-by: Nice Zombies <[email protected]>
Bug report
Bug description:
When an extension module is imported the first time, we load the shared-object file and get the module's init function from it. We then run that init function and use the returned object to decide if the module is single-phase init or multi-phase init.
For isolated subinterpreters, where PyInterpreterConfig.check_multi_interp_extensions (AKA Py_RTFLAGS_MULTI_INTERP_EXTENSIONS) is True, we immediately fail single-phase init modules. The problem is that at that point the init function has already run, which can have all sorts of potential side effects and leave process-global state (including registered callbacks) that we mostly can't clean up.
This has come up before, for example with the readline module. It's potentially a bigger problem than I thought at first, so I'd like to get it sorted out for 3.13.
FWIW, the simplest solution I can think of is to always call the module init func from the main interpreter (without necessarily doing all the other import steps there). That would look something like this:
For the main interpreter and non-isolated subinterpreters, nothing different would happen from now; there would be no switching. Also, if the first attempt was in an isolated interpreter (which would fail), a subsequent import of that module in the main interpreter (or a non-isolated one) would succeed.
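The behavior described above can be modeled with a small Python sketch. Everything here is an assumption made for illustration: Interp, Runtime, import_extension(), and the "singlephase"/"multiphase" strings are hypothetical names, not CPython APIs. Caching of already-loaded modules and cross-interpreter error propagation are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Interp:
    """Toy interpreter: only the fields this sketch needs."""
    interp_id: int
    is_main: bool = False
    check_multi_interp_extensions: bool = False  # True means "isolated"
    modules: dict = field(default_factory=dict)


class Runtime:
    """Toy runtime that tracks which interpreter is current."""

    def __init__(self):
        self.main = Interp(0, is_main=True)
        self.current = self.main
        self.init_ran_in = []  # IDs of interpreters where an init func ran

    def import_extension(self, name, init_func):
        original = self.current
        # Only isolated subinterpreters switch; everyone else runs in place.
        needs_switch = (original.check_multi_interp_extensions
                        and not original.is_main)
        if needs_switch:
            self.current = self.main
        try:
            self.init_ran_in.append(self.current.interp_id)
            kind, module = init_func()  # run the module's init function
        finally:
            self.current = original     # always switch back
        if kind == "singlephase" and original.check_multi_interp_extensions:
            raise ImportError(f"{name} does not support isolated interpreters")
        original.modules[name] = module
        return module


def init_legacy():
    # Stand-in for a single-phase init function.
    return "singlephase", {"__name__": "legacy"}


rt = Runtime()
sub = Interp(1, check_multi_interp_extensions=True)

# First attempt: from an isolated subinterpreter. The init function runs
# in the main interpreter, then the import fails back in the subinterpreter.
rt.current = sub
try:
    rt.import_extension("legacy", init_legacy)
    failed = False
except ImportError:
    failed = True

# Second attempt: from the main interpreter, which succeeds without switching.
rt.current = rt.main
legacy = rt.import_extension("legacy", init_legacy)
```

The point of the model is that the init function never executes while the isolated subinterpreter is current, so any process-global side effects are tied to the main interpreter rather than to an interpreter that then rejects the module.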
The only tricky part is, when the init function raises an exception, how to propagate it from the main interpreter to the subinterpreter. For multi-phase init (if known) we would just call the init func again after switching back. For single-phase init (or unknown) we'd have to preserve the exception somehow. This is something I had to deal with for _interpreters.exec(), but I'm not sure the same thing will work here.
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs
test_interpreters properly without GIL #120689
test_interpreters properly without GIL (GH-120689) #120707