Support for Podman with CAPD #5146

Closed
stmcginnis opened this issue Aug 24, 2021 · 30 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@stmcginnis
Contributor

User Story

As a user I would like to use the CAPD provider on a system that has podman installed so I can use non-Docker runtimes.

Detailed Description

Raised by @mdbooth. CAPD previously shelled out to the docker CLI to perform container management. For multiple reasons, we wanted to move away from CLI calls to using the API directly. Before that change, podman could be aliased to docker to get things to work.

Since CAPD only guarantees compatibility with Docker today, this is considered an enhancement, even though it was something that could be worked around by the CLI in the past.

The changes to use the API were made in a way that should allow other runtime implementations to be plugged in. It should be possible to implement podman support by providing an implementation that uses the podman APIs directly.

Another possibility is the podman apiv2 work. Once implemented, this will provide both a native API and a Docker-compatibility API, which might allow the existing docker client to be reused simply by pointing it at the podman API socket.
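For illustration, a minimal sketch of that compatibility-API route, assuming podman's stock systemd socket unit and default rootless socket path (both may differ by distro/version):

# Enable podman's Docker-compatible API socket for the current (rootless) user.
systemctl --user enable --now podman.socket

# Point any Docker API client (CLI, SDK, or CAPD's docker client) at that socket.
export DOCKER_HOST=unix:///run/user/$(id -u)/podman/podman.sock

# Sanity check (needs the docker CLI or the podman-docker shim installed).
docker info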

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature label Aug 24, 2021
@mdbooth
Contributor

mdbooth commented Sep 2, 2021

It's possible to work round this using podman system service. I've tested this in CAPO and documented it here: kubernetes-sigs/cluster-api-provider-openstack#982

@fabriziopandini
Member

I'm still wrapping my mind around the problem, but I think there are many layers to this and we should break the problem down into smaller pieces.

One possible approach is to make different considerations for the E2E framework and CAPD.

CAPD:

  • uses a large portion (if not all) of our recently introduced container.Runtime interface
  • it runs in a container, with the only requirement being that it gets a docker socket as input
  • while recent changes modified the way we consume the docker socket (via the API instead of a copy of the CLI embedded in the CAPD container), this does not change the requirement above, and aliasing podman wasn't a viable solution even in the past.

I'm personally convinced that for CAPD the container.Runtime interface is a big step forward; we are already seeing improvements in error messages and we are now in a position that allows us to implement much more robust unit tests.

If we need to invest more work in this, IMO the next step is to use the CRI interface; implementing support for podman or any other CLI feels to me like a step back in the evolution of a tool that is crucial both for our test signal and for the quick start UX.

The E2E framework:

  • uses a small portion of our recently introduced container.Runtime interface
  • it runs within the user's local environment, and the recent changes switching the way we interact with the container runtime from CLI calls to Docker API socket calls impacted users who were aliasing podman
  • it can be used to provision clusters with CAPD as well as without it (using other providers), as the user who originally reported the issue does.

Also in this case I consider CRI support the best next option, given that it will embrace users with different runtime engines and be agnostic to the CLI tool they are using.

However, as a middle ground/temporary stopgap to help users, I would also be open to considering alternative solutions that shell out for one very specific use case:

  • the test framework running locally on machines with the docker CLI (or aliased podman) and executing tests involving infrastructure providers other than CAPD

Ideally this should be addressed by providing a new partial implementation of container.Runtime, but this probably requires some more research if we decide to go down this path instead of investing in CRI support, which is probably a better solution for this problem scope.
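For context, the aliasing that used to work while everything shelled out to the CLI was simply something along these lines (illustrative only; the package name is the Fedora/RHEL one):

# The distro shim package installs a docker command backed by podman...
sudo dnf install podman-docker
# ...or the CLI can be aliased by hand, so anything exec'ing "docker ..." runs podman instead.
alias docker=podman

# Neither helps once the code talks to the Docker API socket directly,
# which is why a socket-level workaround (or CRI support) is needed now.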

@stmcginnis
Contributor Author

Just to note: I am trying to verify that the instructions being added here will work for general CAPD use. I'm running into some system configuration problems, but if I get past those and it works, this seems like a reasonable workaround for now.

@mdbooth
Contributor

mdbooth commented Sep 5, 2021

@stmcginnis I'd be interested in adding any practical configuration issues you hit to that documentation. Guessing they're Ubuntu related? If it's generally useful even in the interim you're also welcome to move it to CAPD, in which case I'll replace the CAPO document with a link.

This formed part of an unexpectedly frustrating experience for me. I'll be delighted to help others avoid repeating it!

@stmcginnis
Contributor Author

Thanks @mdbooth, I think it would be great to have this documentation with CAPD.

In my case, I was running a default server deployment of RHEL8. It's quite possible I wasn't doing something right. I deployed the OS and it looks like podman was installed by default. I tried following the steps in your CAPO docs patch, but I would get an error when trying to do anything with the docker.sock that was created.

I am going to deploy a fresh install, just to make sure. I'll make sure podman is working, then try to follow the steps. I'm also getting a Fedora system set up.

@mdbooth
Contributor

mdbooth commented Sep 6, 2021

With RHEL 8 I also think you'll hit an issue with OverlayFS because the kernel isn't new enough. My first attempt was on CentOS 8 iirc and I hit that.

With podman system service you should see that it creates a unix socket when it starts up. I didn't need any configuration beyond that. Assuming you're not running it as root, you just need to delete the unix socket which was created by the podman-docker package and replace it with a symlink to your unprivileged socket.
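A rough sketch of that sequence, assuming podman's default rootless socket path (adjust for your system):

# Run the Docker-compatible API service as your non-root user; -t 0 disables the idle timeout.
podman system service -t 0 &

# Replace the socket installed by podman-docker with a symlink to the unprivileged one.
sudo rm /var/run/docker.sock
sudo ln -s /run/user/$(id -u)/podman/podman.sock /var/run/docker.sock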

@stmcginnis
Contributor Author

I'm trying to follow the steps on a Fedora 34 system, and running into an issue even trying to spin up a kind cluster.

First I hit the "short-name" issue that caused the image pull to fail. I was able to get around that by setting short-name-mode="enforcing" based on the comments in kubernetes-sigs/kind#2186. That got past the image pull step, but it then fails on the "Writing configuration" step with this:

ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get kubernetes version from node: failed to get file: command "podman exec --privileged kind-control-plane cat /kind/version" failed with error: exit status 255
Command Output: Error: can only create exec sessions on running containers: container state improper

Any missing steps here @mdbooth?

@mdbooth
Contributor

mdbooth commented Sep 8, 2021

I just reproduced kind create cluster on a fresh system. Here's everything I did:

  • Install Fedora 34 cloud image
  • # dnf update
  • # Create delegate.conf and iptables.conf with contents from https://kind.sigs.k8s.io/docs/user/rootless/ (sketched after this list)
  • # Install podman-docker
  • # Reboot
  • $ Download kind binary and chmod
  • $ podman login docker.io
  • $ podman pull kindest/node:v1.21.1
    • Select docker.io from the list
  • $ ./kind create cluster
  • Profit
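For reference, a sketch of those two files as described in the linked kind rootless guide at the time (verify against the current page before relying on this):

# /etc/systemd/system/user@.service.d/delegate.conf
sudo mkdir -p /etc/systemd/system/user@.service.d
printf '[Service]\nDelegate=yes\n' | sudo tee /etc/systemd/system/user@.service.d/delegate.conf

# /etc/modules-load.d/iptables.conf
printf 'ip6_tables\nip6table_nat\nip_tables\niptable_nat\n' | sudo tee /etc/modules-load.d/iptables.conf

# A reboot (as in the list above) picks both up.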

Pulling the image manually seems to work round having to set short-name-mode. I must have forgotten that I did that.

Is there any chance that you're hitting docker.io rate limiting? I paid up to work round this, but perhaps we could host these images elsewhere, e.g. quay.io, for the benefit of others? This would be equally beneficial to users of docker.

@stmcginnis
Contributor Author

stmcginnis commented Sep 8, 2021

Great, I got this to work. Here's what I did:

  • Install Fedora 34
  • sudo dnf update
  • Create delegate.conf and iptables.conf with contents from https://kind.sigs.k8s.io/docs/user/rootless/
  • Did NOT install podman-docker. Wanted to see what would happen without that. No problems, so might just need a note to conditionally delete the /var/run/docker.sock file if podman-docker is not installed.
  • sudo reboot
  • Download kind binary and chmod
  • podman pull kindest/node:v1.21.1
    • Select docker.io from the list
  • tmux -c "podman system service -t 0"
  • sudo ln -s /run/user/$(id -u)/podman/podman.sock /var/run/docker.sock
  • Follow the Docker instructions in https://cluster-api.sigs.k8s.io/user/quick-start.html to mount /var/run/docker.sock and create a cluster (see the sketch below)
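For completeness, the quick-start piece referenced in the last step is roughly this kind config, which mounts the (now podman-backed) docker.sock into the bootstrap cluster (abbreviated; check the quick start for the current version):

cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /var/run/docker.sock
      containerPath: /var/run/docker.sock
EOF

kind create cluster --config kind-cluster-with-extramounts.yaml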

We would probably need a note about this only working on newer Red Hat based distros since there is the issue with RHEL 8. Or at least some kind of disclaimer that this is a workaround that has worked in some environments, but it's not officially supported.

I think it would be great to get this added to our docs somewhere. I'm not sure where the best place would be for that though. Maybe under reference? Possibly with a link from a note in the quickstart guide? Any thoughts on this @fabriziopandini?

@mdbooth
Contributor

mdbooth commented Sep 8, 2021

Ah, yes. So for just creating a kind cluster and using it via clusterctl you don't need podman-docker because kind has podman support. You don't even need to create the docker.sock, because it won't be used.

However, the test framework explicitly connects to a running docker agent to push locally-built images into the kind cluster (I assume that's what it's doing?), so you do need it to run the E2E tests.

@stmcginnis
Contributor Author

I should clarify, I followed the quick start to create a CAPD cluster. So it uses the same mechanism as the e2e framework.

@mdbooth
Contributor

mdbooth commented Sep 9, 2021

Are you sure? I was able to start a cluster without podman system service following the quick start guide a few weeks ago, which is why I was surprised that the E2E tests didn't work. I described the problem in #4380 (comment): code in cluster-api/test/framework/bootstrap/kind_util.go explicitly connects to a docker agent. I wouldn't expect us to call this outside of a test suite.

I wonder if I'm missing something important because I don't know what the D in CAPD stands for 🤔

@sbueringer
Member

sbueringer commented Sep 9, 2021

D stands for Docker :)

But I agree the save func is different to the one in kind (which uses os.Exec). Apart from that I don't think we try to load images into kind in the quickstart.

We don't have to, as the quickstart uses released images. The CAPD-based e2e tests in the cluster-api repo use locally built images and thus have to load them into kind as configured in: https://github.com/kubernetes-sigs/cluster-api/blob/master/test/e2e/config/docker.yaml#L12-L19

@mdbooth
Contributor

mdbooth commented Sep 9, 2021

D stands for Docker :)

Suddenly it becomes clear 😂

But I agree the save func is different to the one in kind (which uses os.Exec). Apart from that I don't think we try to load images into kind in the quickstart.

We don't have to, as the quickstart uses released images. The CAPD-based e2e tests in the cluster-api repo use locally built images and thus have to load them into kind as configured in: https://github.com/kubernetes-sigs/cluster-api/blob/master/test/e2e/config/docker.yaml#L12-L19

This was my understanding.

@vincepri
Member

/milestone Next

@stmcginnis
Contributor Author

I spent a little time on this to see what it would look like to have podman support with the new runtime layer. I could not get everything working right with just manual testing. I think we may want to support podman for the e2e tests and the initial "setup" parts, but have the capd-controller runtime operations stay with docker inside the container.

POC code that is not fully working can be found here: https://github.com/stmcginnis/cluster-api/blob/capp/test/infrastructure/container/podman.go

On Fedora, I needed to install podman and podman-docker.

I am able to get a container running that can access the mounted docker.sock, but there are permission issues unless it is run with sudo. It also does not appear to return everything if I do a podman ps:

[smcginnis@fedora ~]$ podman ps
CONTAINER ID  IMAGE                           COMMAND               CREATED      STATUS             PORTS       NAMES
02cdda6de05f  docker.io/library/nginx:latest  nginx -g daemon o...  3 hours ago  Up 13 minutes ago              beautiful_galileo
[smcginnis@fedora ~]$ sudo podman run -v /run:/run --privileged --security-opt label=disable quay.io/podman/stable podman --remote ps
CONTAINER ID  IMAGE                         COMMAND               CREATED                 STATUS                     PORTS       NAMES
44cedf65cab8  quay.io/podman/stable:latest  podman --remote p...  Less than a second ago  Up Less than a second ago              sweet_pare

Just adding these notes here in case it helps anyone else looking into this and if anyone wants to pick up this work.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Mar 13, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Apr 12, 2022
@stmcginnis
Contributor Author

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten label Apr 12, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jul 11, 2022
@fabriziopandini fabriziopandini added the triage/accepted label Jul 29, 2022
@fabriziopandini fabriziopandini removed this from the Next milestone Jul 29, 2022
@fabriziopandini fabriziopandini removed the triage/accepted label Jul 29, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Aug 28, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) Sep 27, 2022
@vincepri
Member

/reopen

@k8s-ci-robot
Contributor

@vincepri: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Aug 21, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage label Aug 21, 2023
@vincepri
Member

vincepri commented Aug 21, 2023

What's the latest regarding CAPD + podman support?

@stmcginnis @mdbooth ?

@vincepri vincepri removed the lifecycle/rotten label Aug 21, 2023
@vincepri
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted and removed needs-triage labels Aug 21, 2023
@stmcginnis
Contributor Author

What's the latest regarding CAPD + podman support?

I think if the podman-docker package is installed, it should work. That provides the docker.sock configuration so the API client can communicate with podman via the expected API interface.

That said, I have not tested lately, and I am sure there are all kinds of corner cases with rootless and other permission differences. If someone is interested, it would be great to get some real world testing done and report back the results here so we can start to understand what some of those corner cases might be.
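If someone does pick this up, a quick smoke test of the compatibility socket might look like this (assumes Fedora/RHEL package names and podman's default rootful socket path):

sudo dnf install podman-docker
sudo systemctl enable --now podman.socket

# The Docker API ping endpoint should answer "OK" if the compatibility layer is up.
sudo curl --unix-socket /run/podman/podman.sock http://localhost/_ping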

@vincepri
Member

vincepri commented Nov 8, 2023

Closing this, no activity

/close

@k8s-ci-robot
Contributor

@vincepri: Closing this issue.

In response to this:

Closing this, no activity

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
