os: overhaul handling of PID vs pidfd within Process
There are several issues with pidfd handling today:
* The zero value of a Process makes the handle field appear valid, so
methods attempt to use it as a pidfd rather than falling back to the
PID as they should (#67634).
* If a process doesn't exist, FindProcess returns a Process with Pid ==
-2, which is not a compatible change (#67640).
* pidfd close is racy as-is. A Release call or successful Wait will
clear the handle field and close the pidfd. However, a concurrent call
may have already loaded the handle field and could then proceed to use
the closed FD (which could have been reopened as a different pidfd,
targeting a different process) (#67641).
This CL performs multiple structural changes to the internals of
Process.
First and foremost, each method is refactored to clearly select either
pidfd or raw pid mode. Previously, raw pid mode was structured as a
fallback when pidfd mode is unavailable. This works fine, but it does
not make it clear that a given Process object either always uses pidfd
or always uses raw pid. Since each mode needs to handle different race
conditions, it helps to make it clear that we can't switch between modes
within a single Process object.
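
As a rough illustration of the first point, a mode fixed at creation might
look like the sketch below. The names processMode, modePID, modeHandle, and
the process struct are hypothetical stand-ins, not necessarily what the CL
uses. The point is that the zero value selects raw pid mode, and every method
switches on the mode up front rather than probing the handle field.

```go
package main

import "fmt"

// processMode records how a process is addressed. In this sketch it is
// set once at creation and never changes (hypothetical type).
type processMode uint8

const (
	modePID    processMode = iota // zero value: raw PID
	modeHandle                    // pidfd or other OS handle
)

// process is a simplified stand-in for os.Process.
type process struct {
	pid    int
	mode   processMode
	handle uintptr // only meaningful when mode == modeHandle
}

// signal picks exactly one path based on the mode. It returns a string
// describing the chosen path instead of performing a real syscall.
func (p *process) signal() string {
	switch p.mode {
	case modeHandle:
		return fmt.Sprintf("handle:%d", p.handle)
	case modePID:
		return fmt.Sprintf("pid:%d", p.pid)
	default:
		panic("unreachable")
	}
}

func main() {
	var zero process // zero value must not look like a valid handle
	fmt.Println(zero.signal())

	viaFD := process{pid: 42, mode: modeHandle, handle: 7}
	fmt.Println(viaFD.signal())
}
```

With this shape, a zero-value process takes the PID path instead of treating
handle 0 as a pidfd.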
Second, pidfd close safety is handled by reference counting uses of the
FD. The last user of the FD will close the FD. For example, this means
that with concurrent Release and Signal, the Signal call may be the one
to close the FD. This is the bulk of this CL, though I find the end
result makes the overall implementation easier to reason about.
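
A minimal sketch of the reference counting idea follows, assuming a single
atomic word that packs a reference count with a "released" flag. The handle
type, its methods, and the bit layout are all hypothetical and only meant to
show why the last user, rather than Release itself, ends up closing the FD.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	statusReleased uint64 = 1 << 62           // set once Release (or Wait) is done
	refMask        uint64 = statusReleased - 1 // low bits hold the ref count
)

// handle is a hypothetical wrapper around a pidfd.
type handle struct {
	state  atomic.Uint64
	fd     int
	closed atomic.Bool // stands in for actually closing fd
}

func newHandle(fd int) *handle {
	h := &handle{fd: fd}
	h.state.Store(1) // the owning Process holds one reference
	return h
}

// acquire takes a reference for the duration of one operation.
// It fails once the handle has been marked released.
func (h *handle) acquire() bool {
	for {
		old := h.state.Load()
		if old&statusReleased != 0 {
			return false
		}
		if h.state.CompareAndSwap(old, old+1) {
			return true
		}
	}
}

// release drops a reference taken by acquire; the last user closes.
func (h *handle) release() {
	for {
		old := h.state.Load()
		if old&refMask == 0 {
			panic("release without matching acquire")
		}
		next := old - 1
		if h.state.CompareAndSwap(old, next) {
			if next&refMask == 0 {
				h.closed.Store(true)
			}
			return
		}
	}
}

// markReleased sets the flag and drops the owner's reference.
func (h *handle) markReleased() {
	for {
		old := h.state.Load()
		if old&statusReleased != 0 {
			return // already released
		}
		next := (old - 1) | statusReleased
		if h.state.CompareAndSwap(old, next) {
			if next&refMask == 0 {
				h.closed.Store(true)
			}
			return
		}
	}
}

func main() {
	h := newHandle(3)
	ok := h.acquire() // a Signal call starts using the FD
	h.markReleased()  // a concurrent Release arrives
	fmt.Println(ok, h.closed.Load())
	h.release() // Signal finishes and is the one to close
	fmt.Println(h.closed.Load(), h.acquire())
}
```

In this sketch the in-flight user keeps the FD open past the Release call, so
it can never observe a closed (and possibly reused) descriptor.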
Third, the PID path handles a similar race condition between Wait and
Kill: Wait frees the PID value in the kernel, which could be reallocated
causing Kill to target the wrong process. This is handled with a done
flag and a mutex. The done flag now shares the same state field used for
the handle.
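
The Wait/Kill interaction on the PID path can be sketched as below. This is
a simplified illustration, not the CL's code: pidProcess, errProcessDone, and
the reap/send callbacks are hypothetical, and the done flag is a plain field
here rather than part of a shared state word.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errProcessDone is a hypothetical stand-in for os.ErrProcessDone.
var errProcessDone = errors.New("process already finished")

// pidProcess is a simplified stand-in for a Process in raw pid mode.
type pidProcess struct {
	pid  int
	mu   sync.RWMutex // guards done and orders signal against reaping
	done bool
}

// wait reaps the process. The write lock is held across reap so that no
// signal can be sent between the kernel freeing the PID and done being set.
func (p *pidProcess) wait(reap func()) {
	p.mu.Lock()
	defer p.mu.Unlock()
	reap()
	p.done = true
}

// signal sends only while the PID is known to still belong to the process.
func (p *pidProcess) signal(send func(pid int) error) error {
	p.mu.RLock()
	defer p.mu.RUnlock()
	if p.done {
		return errProcessDone
	}
	return send(p.pid)
}

func main() {
	p := &pidProcess{pid: 1234}
	send := func(pid int) error {
		fmt.Println("signal sent to", pid)
		return nil
	}
	fmt.Println(p.signal(send)) // process not yet reaped
	p.wait(func() {})           // reaping frees the PID
	fmt.Println(p.signal(send)) // refused: PID may have been reused
}
```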
Similarly, the Windows implementation reuses all of the handle reference
counting that Linux uses. This makes the implementations more
consistent, and makes Windows safe against the same handle reuse
problems (though I am unsure if Windows ever reuses handles).
Wait has a slight behavior change on Windows: previously Wait after
Release or an earlier Wait would hang indefinitely (WaitForSingleObject
on syscall.InvalidHandle waits indefinitely). Now it returns the same
errors as Linux (EINVAL and ErrProcessDone, respectively).
Similarly, Release on Windows no longer returns close errors, as it may
not actually be the place where the close occurs.
Fixes #67634.
Fixes #67640.
Fixes #67641.
Updates #67642.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: I2ad998f7b67d32031e6f870e8533dbd55d3c3d10
Reviewed-on: https://go-review.googlesource.com/c/go/+/588675
Reviewed-by: Austin Clements <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>