Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCO-1521: Promote PinnedImageSet to GA #2198

Merged
merged 1 commit into from
Mar 20, 2025

Conversation

RishabhSaini
Copy link
Contributor

@RishabhSaini RishabhSaini commented Feb 11, 2025

Promotes machineconfiguration/v1alpha1 PinnedImageSet API to machineconfiguration/v1 PIS API

Copy link
Contributor

openshift-ci bot commented Feb 11, 2025

Hello @RishabhSaini! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 11, 2025
@RishabhSaini RishabhSaini changed the title Promote PinnedImageSet to GA MCO-1521: Promote PinnedImageSet to GA Feb 11, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 11, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Feb 11, 2025

@RishabhSaini: This pull request references MCO-1521 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

Opened for testing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@RishabhSaini RishabhSaini force-pushed the pis branch 3 times, most recently from 4382c84 to c6dc790 Compare February 11, 2025 23:19
@RishabhSaini RishabhSaini force-pushed the pis branch 2 times, most recently from 75b4ebc to cc6099e Compare February 20, 2025 17:27
Copy link
Contributor

@yuqi-zhang yuqi-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is ready, some questions/suggestions inline

@RishabhSaini
Copy link
Contributor Author

Here are some outstanding question after our discussion with Sam on this:

  1. What are some PIS Status.Conditions needed?
  2. Should we have PIS Status Conditions defined explicitly as constants in this API or should we have them as Go objects instead and populate them somewhere in the MCO
  3. How would we differentiate the Status reporting between PIS and MCN?
  4. For the MCN tracking the PIS in MachineConfigNodeStatusPinnedImageSet, we might not be using LastFailedGeneration and LastFailedGenerationErrors as intended, and so what should they be changed to?

@cgwalters
Copy link
Member

Note there's some overlap between PIS and bootc's logically bound images especially as it relates to OCL (lots of TLAs here!).

A neat advantage of LBIs is that they live underneath a bootc-owned readonly bind mount, so even manually running podman image rm won't work, whereas AFAIK today podman is totally unaware of crio's attempt to pin and can still remove (although I could be wrong, maybe crio makes a running stub container at least to block that?).

@RishabhSaini
Copy link
Contributor Author

RishabhSaini commented Feb 24, 2025

Ok just verified, Pinned Images defined by CRIO can be deleted by Podman. Despite both using the storage as containers-storage, they don't seem to be aware of the pinned Images. Either a lower level mechanism in container/storage is needed or we could explore LBI's

[root@crc ~]# cat /etc/crio/crio.conf.d/50-pinned-images && crictl images --digests --pinned | grep true
[crio]
  [crio.image]
    pinned_images = ["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83", "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:86d26e7ebcccd6f07a75db5b1e56283b25c2ee1c6a755d6ffc5a4d59beb9c504", "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e43b2ef4fbc42dbcbea5d67f57f3feed38f6b45fb712c99acb06490103e277a9"]
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              3782924b74905       a3a9b9964565f       378MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              479f8a99cbe43       5f2e4eb4b7324       519MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              86d26e7ebcccd       7948809a7f475       672MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              e43b2ef4fbc42       6d27b7ed537e2       826MB               true

[root@crc ~]# podman rmi quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83
Untagged: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83
Deleted: 5f2e4eb4b7324f29726100dda917d601d6ac822699301c22a63cf326c73330cf

@RishabhSaini
Copy link
Contributor Author

created a bug here: https://issues.redhat.com/browse/OCPBUGS-51284

@hexfusion
Copy link
Contributor

Ok just verified, Pinned Images defined by CRIO can be deleted by Podman. Despite both using the storage as containers-storage, they don't seem to be aware of the pinned Images. Either a lower level mechanism in container/storage is needed or we could explore LBI's

[root@crc ~]# cat /etc/crio/crio.conf.d/50-pinned-images && crictl images --digests --pinned | grep true
[crio]
  [crio.image]
    pinned_images = ["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83", "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:86d26e7ebcccd6f07a75db5b1e56283b25c2ee1c6a755d6ffc5a4d59beb9c504", "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e43b2ef4fbc42dbcbea5d67f57f3feed38f6b45fb712c99acb06490103e277a9"]
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              3782924b74905       a3a9b9964565f       378MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              479f8a99cbe43       5f2e4eb4b7324       519MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              86d26e7ebcccd       7948809a7f475       672MB               true
quay.io/openshift-release-dev/ocp-v4.0-art-dev                                 <none>              e43b2ef4fbc42       6d27b7ed537e2       826MB               true

[root@crc ~]# podman rmi quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83
Untagged: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:479f8a99cbe432551448776965aac1f44501c08aa01539d77ab5976fdbbe1c83
Deleted: 5f2e4eb4b7324f29726100dda917d601d6ac822699301c22a63cf326c73330cf

LBI isn't released yet and would be implementation details. I don't see that blocking the API promotion.

Copy link
Contributor

@yuqi-zhang yuqi-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our discussions I think we don't need any changes to the pinnedimageset API, so from the MCO side I'm happy to make this v1. The one API change we discussed is for disconnected, registryless users to leverage PIS to pull images, which should be a user-side determination via some higher level API. This should not require any changes here.

@hexfusion any remaining concerns from your end?

@hexfusion
Copy link
Contributor

Based on our discussions I think we don't need any changes to the pinnedimageset API, so from the MCO side I'm happy to make this v1. The one API change we discussed is for disconnected, registryless users to leverage PIS to pull images, which should be a user-side determination via some higher level API. This should not require any changes here.

@hexfusion any remaining concerns from your end?

No changes from my end.

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious for this API how the status of this API is being used, can you provide some examples of failures to pin images and how they manifest within a cluster?

@RishabhSaini
Copy link
Contributor Author

RishabhSaini commented Mar 17, 2025

Notes from the PIS Weekly Sync:

  1. PIS Status
    An idea for having a PIS Status could be: aggregating the MCN status for a Pool and reporting it in PIS
    Since status for an individual node is still visible in MCN, we could pursue this post GA

  2. Bugs
    https://issues.redhat.com/browse/OCPBUGS-35199
    A cluster-wide signal is needed to identify a disconnected, no-registry case.
    This will also be handled post GA

  3. Testing
    MCO-1523: PinnedImageSet v1alpha1 Testing origin#29599
    This is a blocker for component readiness actively being worked on currently

@RishabhSaini
Copy link
Contributor Author

RishabhSaini commented Mar 17, 2025

I'm curious for this API how the status of this API is being used, can you provide some examples of failures to pin images and how they manifest within a cluster?

Currently failure states for each Node is reported by the help of MachineConfigNode's Status.Conditions ("PinnedImageSetsProgressing" and "PinnedImageSetsDegraded")
https://github.com/openshift/api/pull/2171/files#diff-2cd3dcf78a4a4b634894a2e602c8baaee011c74bd724bbfd591ff355b50b6224R274-R277

Here are the possible conditions for a "degraded" status:
https://github.com/openshift/machine-config-operator/blob/13ad337e5ae9399c305269daaa22e09a610d61a6/pkg/daemon/pinned_image_set.go#L79-L82

@RishabhSaini
Copy link
Contributor Author

/test unit verify-client-go

@RishabhSaini RishabhSaini force-pushed the pis branch 2 times, most recently from 3c752dd to 3be26b2 Compare March 18, 2025 16:05
@JoelSpeed
Copy link
Contributor

JoelSpeed commented Mar 19, 2025

Please update the PR description to be accurate based on what this PR is trying to achieve

Otherwise this LGTM and we can get this merged.

Question though, do we have PIS testing that supports the feature being promoted? Are we likely to get that ready for 4.19?

It may be prevalent, if we don't think we will actually promote this API in this release to hold off merging this PR until we are ready to start promotion

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 19, 2025

@RishabhSaini: This pull request references MCO-1521 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

Promotes machineconfiguration/v1alpha1 PinnedImageSet API to machineconfiguration/v1 PIS API

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@RishabhSaini
Copy link
Contributor Author

RishabhSaini commented Mar 19, 2025

Question though, do we have PIS testing that supports the feature being promoted? Are we likely to get that ready for 4.19?

Yes we have openshift/origin#29599 open with intention to be done by 4.19

@JoelSpeed
Copy link
Contributor

Ack, would you feel comfortable getting those merged, and then we can demonstrate the tests are showing up correctly in component readiness before we move forward here? Or does that create a lot of toil for you?

@RishabhSaini
Copy link
Contributor Author

Ack, would you feel comfortable getting those merged, and then we can demonstrate the tests are showing up correctly in component readiness before we move forward here? Or does that create a lot of toil for you?

I do not have a strong preference. However, I would prefer merging this prior if that's possible.

@JoelSpeed
Copy link
Contributor

/lgtm

Please make sure this feature ships in 4.19, else we should revert this PR out of the 4.19 branch

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 20, 2025
Copy link
Contributor

openshift-ci bot commented Mar 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed, RishabhSaini, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 20, 2025
Copy link
Contributor

openshift-ci bot commented Mar 20, 2025

@RishabhSaini: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 3aa9dd5 into openshift:master Mar 20, 2025
12 checks passed
@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-config-api
This PR has been included in build ose-cluster-config-api-container-v4.19.0-202503201644.p0.g3aa9dd5.assembly.stream.el9.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants