Skip to content

gh-114940: Use fine-grained mutex protection for PyInterpreterState.threads #125561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 27 commits into from

Conversation

rruuaanng
Copy link
Contributor

@rruuaanng rruuaanng commented Oct 16, 2024

@rruuaanng rruuaanng changed the title Add mutex member to PyInterpreterState.threads gh-114940: Add mutex member to PyInterpreterState.threads Oct 16, 2024
@ZeroIntensity
Copy link
Member

ZeroIntensity commented Oct 16, 2024

This shouldn't have a NEWS entry--it's not a user-facing change. (The interpreter state structure is not public.)

While we're here, it might be good to replace the uses of HEAD_LOCK for locking a single interpreter, with INTERP_THREAD_LOCK. (In that case, this does need a news entry, as thread safety is technically user-facing--ping me to remove the skip news label if you take that approach.)

@rruuaanng
Copy link
Contributor Author

I chose to remove NEWS, and HEAD_LOCK will not be completely replaced. It depends on @ericsnowcurrently plans (if it wants to make the interpreter thread use INTERP_THREAD_LOCK instead of HEAD_LOCK for everything, I will replace all uses of HEAD_LOCK in the entire code base

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Oct 16, 2024

HEAD_LOCK will not be completely replaced

Yeah, I know. We just need to replace it in the PyInterpreterState* (and possibly _PyXI*?) functions.

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Oct 16, 2024

Currently I have replaced gc_free_threading.c, ceval.c, ceval_gil.c, instrumentation.c.
Edit:
All HEAD_LOCK, But I'm not sure whether to replace them. In my opinion, these files actually use locks just to protect the current process, but they lock the entire interpreter process.

@ZeroIntensity
Copy link
Member

Hmm, that doesn't seem right. I would expect most of the changes to be in pystate.c. I guess there could be others, but I'll let Eric weigh in on that. (I'm still learning about how isolated subinterpreters work.)

@rruuaanng
Copy link
Contributor Author

I'm actually not sure since this only seems to be related to pystate.c

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Oct 17, 2024

Sorry to ping you directly @ericsnowcurrently, a lot of the stuff in gc_free_threading seems to be about cleaning up the current interpreter state, I changed them to lock the current interpreter, Is this correct?

@ZeroIntensity
Copy link
Member

Cleaning up an interpreter state (assuming you mean PyInterpreterState_Clear and PyInterpreterState_Delete) should be thread-safe, multiple threads won't be accessing something while it's getting deallocated.

@rruuaanng
Copy link
Contributor Author

Sorry, I seem to have described it wrong. They are all traversing the current state, not cleaning. Please forgive my wrong words.

@rruuaanng rruuaanng force-pushed the _gh114940 branch 2 times, most recently from 42fbb68 to 29c4ccb Compare October 18, 2024 05:43
@rruuaanng rruuaanng marked this pull request as ready for review October 18, 2024 07:01
@rruuaanng
Copy link
Contributor Author

Link this issue to this PR:
#114016

@rruuaanng
Copy link
Contributor Author

cc @colesbury @vstinner

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires more careful auditing of code that loops over interpreters and threads. There's still functions that acquire the wrong lock or are now missing a lock, because the code assumed that an outer HEAD_LOCK() was sufficient.

  • PyOS_BeforeFork/PyOS_AfterFork_Parent also need to acquire the interpreter lock
  • get_reftotal loops over threads if Py_GIL_DISABLED
  • get_mimalloc_allocated_blocks loops over threads

@@ -791,18 +790,16 @@ interpreter_clear(PyInterpreterState *interp, PyThreadState *tstate)
}

// Clear the current/main thread state last.
HEAD_LOCK(runtime);
INTERP_HEAD_LOCK(interp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @ericsnowcurrently pointed this out in a different comment, but we should not change where the lock/unlocks happen in this PR.

The existing code has a race condition, but holding the lock across the PyThreadState_Clear() call risks deadlock (see the code comment below).

@rruuaanng
Copy link
Contributor Author

In fact, my current plan is to add mutex locks to all operations related to thread members. In fact, this is the meaning of its existence. But I'm thinking, as you said, maybe we should protect other functions worth adding locks.

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Oct 23, 2024

A few notes here.

  • I think this needs a NEWS entry now, because this is no longer internal. Something like "Improved performance with runtime thread contention" could work.
  • With that being said, RUANG, I suggest you put extra focus on listening to advice here--this is a complicated issue you're addressing, and your previous history with PRs hasn't gone well.
  • Please @ mention me when you believe this is fully ready, so I can run the buildbot fleet on it.

@rruuaanng
Copy link
Contributor Author

Yes, for this PR, I will be very focused and listen to all the advice. Because this change is very important, in fact, I have been evaluating the status of multithreading and interpreter recently. I hope this PR can be audited by more people. Thank you for your good advice.

@rruuaanng rruuaanng marked this pull request as draft October 24, 2024 01:12
@rruuaanng
Copy link
Contributor Author

Link this issue to this PR:
#125908
It describes the check when handling PyThreadState_Clear to make sure that HEAD_LOCK is not held before the call.·

And, CI warnings can be ignored, they come from (runtime not used)

#ifdef Py_GIL_DISABLED
    assert(runtime->stoptheworld.world_stopped);
#endif

@rruuaanng
Copy link
Contributor Author

I replaced the lock of PyOS_BeforeFork/PyOS_AfterFork_Parent, and it turns out that this seems to be unfeasible. They will be deadlocked when running test_fork and test_subprocess.
(At least my evaluation results in ubuntu are as mentioned above).

@rruuaanng rruuaanng marked this pull request as ready for review October 25, 2024 13:17
@ericsnowcurrently
Copy link
Member

In the interest of understanding what actually needs to be switched to the new lock, I ended up implementing this change separately: gh-127037. I'd be fine if we updated this PR to merge in my changes, since they're very similar and I'd be glad for you (@rruuaanng) to get credit. 😄

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Nov 20, 2024

@ericsnowcurrently Oh, at a glance, I seem to think they are the same, but they are not consistent. This is very helpful to me. If we wait until the merger, maybe this PR can be merged with it.

Edit

They seem to be the same :(

@ZeroIntensity
Copy link
Member

I think just adding Co-authored-by on Eric's PR is the way to go. James'/RUANG's history of PRs isn't great, so I don't fully trust him to merge the changes correctly (especially with the previous force push fiasco on the mind), or to correctly address reviews on this large of a change. Sorry :(

@rruuaanng
Copy link
Contributor Author

If it's okay for me to be a co-signed, I think it's good 😀

@ericsnowcurrently
Copy link
Member

I ended up splitting up the changes into gh-127077 and gh-127115, and credited @rruuaanng on the latter one. Let's close this PR.

@rruuaanng rruuaanng deleted the _gh114940 branch November 21, 2024 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants