[cgroupv2] Enable fuse device #8769

Merged
merged 2 commits into from
Apr 4, 2022

Furisto (Member) commented Mar 11, 2022

Description

Enable fuse device on nodes that are using cgroup v2.

Related Issue(s)

Fixes #8417

How to test

  • Open workspace on a cgroup v2 system
  • Compile this program (gcc -o fuse_test fuse_test.c):
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
  const char *src_path = "/dev/fuse";
  /* openat(2) returns the new file descriptor, or -1 with errno set. */
  long fd = syscall(SYS_openat, AT_FDCWD, src_path, O_RDWR);
  if (fd < 0) {
    perror("openat /dev/fuse");
  }
  printf("RET: %ld\n", fd);
  return fd < 0;
}
  • fuse_test should print RET: 3 (the first free file descriptor); RET: -1 means the open failed

Notes for reviewers

My initial implementation used a runc facade, as we discussed in #8471, but I switched to replacing the eBPF program that runc generates with our own. Advantages of this approach:

  • No need to touch the host (copying the runtime facade, modifying the config of containerd or of other runtimes if we want to support them in the future)
  • No need to specify a runtime class when creating the workspace
  • Implementation uses the same approach as the cgroup v1 implementation
  • Less code

Release Notes

Enable the use of fuse device on cgroup v2 systems

@Furisto Furisto marked this pull request as ready for review March 13, 2022 21:02
@Furisto Furisto requested a review from a team March 13, 2022 21:02
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Mar 13, 2022
Furisto (Member, Author) commented Mar 13, 2022

@csweichel Any idea why this fails in CI? I can build it with leeway in my workspace without issue.

Comment on lines +44 to 47
func (c *CgroupCustomizerV1) WithCgroupBasePath(basePath string) {
	c.cgroupBasePath = basePath
}

Contributor

I think we don't need this function now.

Suggested change
func (c *CgroupCustomizerV1) WithCgroupBasePath(basePath string) {
	c.cgroupBasePath = basePath
}


deviceRules := make([]*devices.Rule, 0)
deviceRules = append(deviceRules, &denyAll, &allowFuse)
for _, device := range specconv.AllowedDevices {

utam0k (Contributor), Mar 13, 2022

In our current use case, we don't specify cgroup devices anywhere else, so this list is all we need for now. However, if we ever want to add devices at a layer above runc (e.g. kubelet, containerd), this part will cause a problem. Alternatively, we could decide that cgroup devices are managed here. In fact, we already override /dev/fuse here.
@csweichel WDYT?
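
To make the concern concrete, here is a minimal Go sketch of how a deny-all rule followed by an allow rule for /dev/fuse (character device 10:229) would be evaluated. The rule type and the allowed function are hypothetical stand-ins, not the runc devices.Rule type, and "the last matching rule wins" is an assumption about evaluation order. Any device that is not in the list, including one added by another layer, ends up denied.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a simplified, hypothetical device cgroup rule (not runc's type).
type rule struct {
	Type   byte   // 'c' (char), 'b' (block), or 'a' (any)
	Major  int64  // -1 is a wildcard
	Minor  int64  // -1 is a wildcard
	Access string // subset of "rwm" (read, write, mknod)
	Allow  bool
}

// matches reports whether the rule applies to the requested device access.
func (r rule) matches(typ byte, major, minor int64, access byte) bool {
	if r.Type != 'a' && r.Type != typ {
		return false
	}
	if r.Major != -1 && r.Major != major {
		return false
	}
	if r.Minor != -1 && r.Minor != minor {
		return false
	}
	return strings.IndexByte(r.Access, access) >= 0
}

// allowed walks the rules from last to first, so the last matching rule
// decides (assumed order). With no matching rule, access is denied.
func allowed(rules []rule, typ byte, major, minor int64, access byte) bool {
	for i := len(rules) - 1; i >= 0; i-- {
		if rules[i].matches(typ, major, minor, access) {
			return rules[i].Allow
		}
	}
	return false
}

func main() {
	rules := []rule{
		{Type: 'a', Major: -1, Minor: -1, Access: "rwm", Allow: false}, // deny all
		{Type: 'c', Major: 10, Minor: 229, Access: "rwm", Allow: true}, // /dev/fuse
	}
	fmt.Println(allowed(rules, 'c', 10, 229, 'r')) // /dev/fuse: true
	fmt.Println(allowed(rules, 'b', 8, 0, 'r'))    // a device not in the list: false
}
```

Because the list is rebuilt from scratch, a rule contributed elsewhere would have to be appended here as well to survive the replacement.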

Contributor

@Furisto Since there is already a way to get the contents of config.json, why don't we read the original device information from there? To be honest, I would prefer status.json, but config.json should be sufficient.

func ExtractCGroupPathFromContainer(container containers.Container) (cgroupPath string, err error) {
	var spec ocispecs.Spec
	err = json.Unmarshal(container.Spec.Value, &spec)
	if err != nil {
		return
	}
	if spec.Linux == nil {
		return "", xerrors.Errorf("container spec has no Linux section")
	}
	return spec.Linux.CgroupsPath, nil
}
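
A rough sketch of what reading the original device rules from config.json could look like, following the same pattern as the function above. It uses small local structs that mirror the linux.resources.devices section of the OCI runtime spec instead of the ocispecs package, so it runs with the standard library only. The names deviceRule, ociSpec, and extractDeviceRules are hypothetical, not part of this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// deviceRule mirrors one entry of linux.resources.devices in config.json.
type deviceRule struct {
	Allow  bool   `json:"allow"`
	Type   string `json:"type,omitempty"`
	Major  *int64 `json:"major,omitempty"`
	Minor  *int64 `json:"minor,omitempty"`
	Access string `json:"access,omitempty"`
}

// ociSpec contains only the parts of the OCI spec this sketch needs.
type ociSpec struct {
	Linux *struct {
		Resources *struct {
			Devices []deviceRule `json:"devices"`
		} `json:"resources"`
	} `json:"linux"`
}

// extractDeviceRules returns the device cgroup rules stored in a raw spec.
// (Hypothetical helper for illustration.)
func extractDeviceRules(raw []byte) ([]deviceRule, error) {
	var spec ociSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return nil, fmt.Errorf("cannot parse container spec: %w", err)
	}
	if spec.Linux == nil {
		return nil, errors.New("container spec has no Linux section")
	}
	if spec.Linux.Resources == nil {
		return nil, nil // no resource section, hence no device rules
	}
	return spec.Linux.Resources.Devices, nil
}

func main() {
	raw := []byte(`{"linux":{"resources":{"devices":[
		{"allow":false,"access":"rwm"},
		{"allow":true,"type":"c","major":10,"minor":229,"access":"rwm"}]}}}`)
	rules, err := extractDeviceRules(raw)
	if err != nil {
		panic(err)
	}
	for _, r := range rules {
		fmt.Printf("allow=%v type=%q access=%q\n", r.Allow, r.Type, r.Access)
	}
}
```

The rules returned this way could serve as the starting list, with the /dev/fuse rule appended, rather than a hard-coded default set.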

Contributor

Hey @utam0k, that sounds like an optimization we could add later, yes? In other words, let's keep this PR small, ship the skateboard, and ship a follow-up in the future if we need one.

Aside from this observation, do you have any other concerns which are blocking? I ask so that we can get @Furisto the feedback needed to have this approved and merged.

Contributor

I did a little research, and as it stands it shouldn't be a problem, because k8s normally offers no API to change devices. However, I think this should be improved later, since it can cause some pretty strange bugs. The skateboard is a very good watchword 🛹

utam0k (Contributor) commented Mar 13, 2022

Thanks for your PR. Super. I have two questions.

  1. I wonder whether CgroupCustomizerV2 affects performance, because ws-daemon loads eBPF code into the kernel every 15 seconds.
  2. I wonder whether there may be a window of a few seconds in which this override is not yet in effect, since it is applied via dispatch.

Furisto (Member, Author) commented Mar 14, 2022

> Thanks for your PR. Super. I have two questions.
>
>   1. I wonder whether CgroupCustomizerV2 affects performance, because ws-daemon loads eBPF code into the kernel every 15 seconds.
>   2. I wonder whether there may be a window of a few seconds in which this override is not yet in effect, since it is applied via dispatch.

  1. Are you thinking of the CPU limiter? The CgroupCustomizer should only be called when a new workspace is added.
  2. It is possible, but would there be any serious risk? We replace a more restrictive program with a less restrictive one. At worst, the customer would not be able to use the fuse device for some time after workspace start (or not at all if the event is somehow lost).

utam0k (Contributor) commented Mar 14, 2022

@Furisto Thanks for your great reply.
Oh, I misunderstood 😭 There is no problem.

> Are you thinking of the cpu limiter? The CgroupCustomizer should only be called when a new workspace is added.

In the worst case, what concerns me most is that no eBPF program for the device controller is applied at all because of an error at load time. This can happen because the existing eBPF program is unloaded when the new one is loaded. Was the eBPF program loaded in the pod cgroup?

> It is possible but would there be any serious risk? We replace a more restrictive program with a less restrictive program. At worst the customer would not be able to use the fuse device some time after workspace start (or not at all if the event is somehow lost)

Furisto (Member, Author) commented Mar 15, 2022

@csweichel What do you think of this approach?

stale bot commented Mar 26, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Mar 26, 2022
@kylos101 kylos101 added the meta: never-stale This issue can never become stale label Mar 26, 2022
@stale stale bot removed the meta: stale This issue/PR is stale and will be closed soon label Mar 26, 2022
Furisto (Member, Author) commented Mar 29, 2022

/werft run

👍 started the job as gitpod-build-fo-fuse.9

@Furisto Furisto requested a review from a team March 29, 2022 09:38
@github-actions github-actions bot added the team: delivery Issue belongs to the self-hosted team label Mar 29, 2022
Furisto (Member, Author) commented Mar 29, 2022

/werft run

👍 started the job as gitpod-build-fo-fuse.11

utam0k (Contributor) left a comment

Looks good

@roboquat roboquat merged commit 3fb59ce into main Apr 4, 2022
codecov bot commented Apr 4, 2022

Codecov Report

Merging #8769 (a567595) into main (d469aa6) will decrease coverage by 4.97%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             main   #8769      +/-   ##
=========================================
- Coverage   12.31%   7.34%   -4.98%     
=========================================
  Files          20      32      +12     
  Lines        1161    2234    +1073     
=========================================
+ Hits          143     164      +21     
- Misses       1014    2067    +1053     
+ Partials        4       3       -1     
Flag                                    Coverage Δ
components-gitpod-cli-app               11.17% <ø> (ø)
components-local-app-app-darwin-amd64   ?
components-local-app-app-darwin-arm64   ?
components-local-app-app-linux-amd64    ?
components-local-app-app-linux-arm64    ?
components-local-app-app-windows-386    ?
components-local-app-app-windows-amd64  ?
components-local-app-app-windows-arm64  ?
install-installer-raw-app               4.27% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                                         Coverage Δ
components/local-app/pkg/auth/pkce.go
components/local-app/pkg/auth/auth.go
...components/ws-manager/unpriviledged-rolebinding.go  0.00% <0.00%> (ø)
...nstall/installer/pkg/components/ws-manager/role.go  0.00% <0.00%> (ø)
...staller/pkg/components/ws-manager/networkpolicy.go  0.00% <0.00%> (ø)
...l/installer/pkg/components/ws-manager/configmap.go  23.75% <0.00%> (ø)
.../installer/pkg/components/ws-manager/deployment.go  0.00% <0.00%> (ø)
install/installer/pkg/common/objects.go                0.00% <0.00%> (ø)
install/installer/pkg/common/networkpolicies.go        0.00% <0.00%> (ø)
...installer/pkg/components/ws-manager/rolebinding.go  0.00% <0.00%> (ø)
... and 6 more

@roboquat roboquat deleted the fo/fuse branch April 4, 2022 02:09
@roboquat roboquat added the deployed: workspace Workspace team change is running in production label Apr 4, 2022
Labels
  • deployed: workspace (Workspace team change is running in production)
  • meta: never-stale (This issue can never become stale)
  • release-note
  • size/L
  • team: delivery (Issue belongs to the self-hosted team)
  • team: workspace (Issue belongs to the Workspace team)
Development

Successfully merging this pull request may close these issues.

Make cgroup customizer support v2
5 participants