config-linux: Deprecate device access denial #1214
Bump?
In cgroup v2, runc and other major OCI runtimes have implemented not only the systemd driver but also their own cgroup v2 driver. What about that driver's behavior? As far as I know, it attempts to emulate cgroup v1's behavior.
Separate allow/deny lists are specific to the device controller, which exists only in cgroup v1. The current semantics for devices that match neither an allow nor a deny entry are confusing.
cgroup v2 implements access control on the default hierarchy with BPF hooks. Follow the approach of systemd (refer to systemd.resource-control(5)) with DevicePolicy=strict, i.e. consider all devices denied by default and add entries only for devices that should be allowed.
This will simplify the job for runtimes that use systemd for container cgroup configuration.
For starters, mention that "allow" entries that don't stick to this approach are deprecated. The next step would be removal of the "allow" attribute and implicit denial on all devices.
Signed-off-by: Michal Koutný <[email protected]>
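To make the proposed deny-by-default model concrete, here is a minimal sketch of how a device rule list could be checked against it. It assumes rules are dictionaries shaped like the spec's device cgroup entries (`allow`, `type`, `major`, `minor`, `access`), with an omitted `major`/`minor` meaning "any" and an omitted `type` meaning "all". The function name `to_allowlist` and the choice to tolerate a leading deny-all entry are hypothetical, not something this PR or any runtime defines.

```python
def to_allowlist(rules):
    """Return the allow entries if `rules` fits a deny-by-default policy.

    Hypothetical helper: raises ValueError for any deny entry other than a
    leading "deny everything" rule, which is redundant under the default.
    """
    allowed = []
    for index, rule in enumerate(rules):
        if rule.get("allow"):
            allowed.append(rule)
            continue
        # Assumption: omitted type means "a" (all), omitted major/minor mean "any".
        is_wildcard = (
            rule.get("type", "a") == "a"
            and rule.get("major") is None
            and rule.get("minor") is None
        )
        if index == 0 and is_wildcard:
            continue  # explicit deny-all at the start matches the implicit default
        raise ValueError(f"rule {index} is a deny entry; only allow entries fit")
    return allowed


if __name__ == "__main__":
    rules = [
        {"allow": False, "access": "rwm"},  # deny everything first
        {"allow": True, "type": "c", "major": 1, "minor": 3, "access": "rwm"},
    ]
    print(to_allowlist(rules))
```

A list that passes this check could, under the same assumptions, be handed to a backend that only understands allow entries, which is the simplification the commit message is after.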
What does this refer to? The cgroupfs driver? Does it mean that container runtimes synthesize BPF programs that support both allow and deny lists, like the v1 device controller?
Yes.
This looks like a breaking change. When will we be able to remove it? I am not sure this is the right thing to do anyway. The devices cgroup can be configured this way (and eBPF gives even more flexibility), so there is no need to break the cgroupfs driver only because systemd puts more limitations on the way a cgroup can be configured.
The cgroupfs driver could be left untouched since, as you write, eBPF is versatile.
So in this case we won't need to deprecate the current setting, right? We don't mention systemd anywhere in the cgroups configuration at the moment, so I don't think we can merge this change as it is now.
But how could runtimes implement this (deny lists) on top of the systemd driver?
They can still implement it using cgroupfs. We can document it, but I don't see why we should block this possibility just because systemd doesn't allow it. Anyway, this is my personal preference; I'm not sure whether other maintainers agree.
I'm coming from an area where the cgroupfs driver cannot be used on systemd-managed userspace. I have to check the recent progress, since it is possible to use cgroupfs there as well if the runtime implements delegation properly. Then this deprecation proposal would be deprecated itself :-) One independent argument for deprecating denylists, though, is that they're less secure than explicit allowlists (with deny by default).
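The security argument can be illustrated with a small sketch that compares the two defaults. The evaluation order (last matching rule wins) and the helper names `matches` and `is_allowed` are assumptions made for illustration only. They do not describe how any particular runtime or the kernel evaluates rules, and the `access` field is ignored for brevity.

```python
def matches(rule, dev_type, major, minor):
    """True if `rule` covers the device; omitted fields act as wildcards."""
    if rule.get("type", "a") not in ("a", dev_type):
        return False
    if rule.get("major") not in (None, major):
        return False
    if rule.get("minor") not in (None, minor):
        return False
    return True


def is_allowed(rules, dev_type, major, minor, default):
    """Assumed semantics: the last matching rule decides, else `default`."""
    verdict = default
    for rule in rules:
        if matches(rule, dev_type, major, minor):
            verdict = rule["allow"]
    return verdict


if __name__ == "__main__":
    # Denylist: everything is allowed unless someone remembered to list it.
    denylist = [{"allow": False, "type": "b", "major": 8, "minor": 0}]
    # Allowlist: everything is denied unless explicitly listed.
    allowlist = [{"allow": True, "type": "c", "major": 1, "minor": 3}]

    unlisted = ("c", 10, 200)  # a device neither list mentions
    print(is_allowed(denylist, *unlisted, default=True))
    print(is_allowed(allowlist, *unlisted, default=False))
```

Under these assumptions, a device that nobody thought to list is reachable with the denylist and blocked with the allowlist, which is the asymmetry the comment refers to.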
That is where we disagree :-) There are already enough opinionated tools. The OCI runtime spec should allow what is possible and stay as close as possible to the kernel features that enable it. It should not try to educate users; an end user is unlikely to use the OCI runtime directly anyway.
They actually can, using
@cyphar WDYT?