-
Notifications
You must be signed in to change notification settings - Fork 58
Upgrade from AppWrapper v1beta1 to v1beta2 #491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Don't bother reviewing this yet. It will take a few iterations to get in synch with the SDK PR. |
06fe334
to
ce5ae4b
Compare
This PR is now ready for review. The Upgrade test is failing, but assuming I understand what the test is doing, I think that is expected and unavoidable. We explicitly decided we would not attempt to support upgrading from AppWrapper v1beta1 to v1beta2. We can't just do an in place patch of the controller container image because more than that has changed. There are different CRDs and different RBACs. |
861080e
to
2d56a93
Compare
PR ready for review. The PR effectively disables the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! a few questions
#FROM registry.access.redhat.com/ubi8/go-toolset:1.20.10 as builder | ||
|
||
# BEGIN -- workaround lack of go-toolset for golang 1.21 | ||
|
||
# FROM registry-proxy.engineering.redhat.com/rh-osbs/openshift-golang-builder:v1.21 AS golang | ||
FROM golang:1.21 AS golang | ||
FROM registry.access.redhat.com/ubi8/ubi:8.8 AS builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dimakis Will this be okay in our build pipelines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was based on a Dockerfile I got from @astefanutti for Kueue, but his file used a redhat internal image AS golang
that I didn't have access to. I used the default golang:1.21
to get unblocked, but its almost certainly not the right one to use in the long run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we won't get away with that. we'll need to go back to a RH registry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a RH registry that has the a usable go 1.21 toolchain that is publicly accessible? I couldn't find one. If there is one, I'd be happy to change the docker file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine for upstream, I think. This has to be overwritten anyway downstream.
8e82964
to
9986552
Compare
@KPostOffice -- I've added the webhook and full Kueue support. The PyTorch e2e test is passing, but the Ray e2e test is failing. I'm digging into it, but I don't expect the contents of this PR to change very much as a result. If you have time to look at the changes in 2537d07 and 9986552 and give any feedback you have that would be useful. At first glance, the Ray e2e test looks like a bug in Kueue's generic reconciler loop, but I have to dig deeper to be sure. |
So we aren't blocked on merging the PR, I've added a new e2e RayCluster test that bypasses AppWrappers entirely and submits the RayCluster directly to Kueue. |
Co-authored-by: Antonin Stefanutti <[email protected]>
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Replaced by #543. |
Issue link
What changes have been made
Context: This PR supports the migration of the codeflare operator from MCADv1 to Kueue+AppWrapper
Related PRs:
Changes:
Verification steps
Checks