Commit 9b623d5

Merge branch 'main' into rhoaeng-2285
2 parents daf6535 + 80de58e commit 9b623d5

44 files changed, +929 -4046 lines changed

Diff for: .github/workflows/e2e_tests.yaml

+9 -1
@@ -86,7 +86,6 @@ jobs:
  make setup-e2e
  echo Deploying CodeFlare operator
  IMG="${REGISTRY_ADDRESS}"/codeflare-operator
- sed -i 's/RayDashboardOAuthEnabled: pointer.Bool(true)/RayDashboardOAuthEnabled: pointer.Bool(false)/' main.go
  make image-push -e IMG="${IMG}"
  make deploy -e IMG="${IMG}" -e ENV="e2e"
  kubectl wait --timeout=120s --for=condition=Available=true deployment -n openshift-operators codeflare-operator-manager
@@ -97,6 +96,9 @@ jobs:
  with:
  user-name: sdk-user

+ - name: Add kueue resources
+ run: kubectl apply --server-side -f "https://github.com/kubernetes-sigs/kueue/releases/download/v0.6.2/manifests.yaml"
+
  - name: Configure RBAC for sdk user with limited permissions
  run: |
  kubectl create clusterrole list-ingresses --verb=get,list --resource=ingresses
@@ -105,6 +107,12 @@ jobs:
  kubectl create clusterrolebinding sdk-user-namespace-creator --clusterrole=namespace-creator --user=sdk-user
  kubectl create clusterrole raycluster-creator --verb=get,list,create,delete,patch --resource=rayclusters
  kubectl create clusterrolebinding sdk-user-raycluster-creator --clusterrole=raycluster-creator --user=sdk-user
+ kubectl create clusterrole resourceflavor-creator --verb=get,list,create,delete --resource=resourceflavors
+ kubectl create clusterrolebinding sdk-user-resourceflavor-creator --clusterrole=resourceflavor-creator --user=sdk-user
+ kubectl create clusterrole clusterqueue-creator --verb=get,list,create,delete,patch --resource=clusterqueues
+ kubectl create clusterrolebinding sdk-user-clusterqueue-creator --clusterrole=clusterqueue-creator --user=sdk-user
+ kubectl create clusterrole localqueue-creator --verb=get,list,create,delete,patch --resource=localqueues
+ kubectl create clusterrolebinding sdk-user-localqueue-creator --clusterrole=localqueue-creator --user=sdk-user
  kubectl config use-context sdk-user

  - name: Run e2e tests

Diff for: .github/workflows/odh-notebooks-sync.yml

+1 -1
@@ -28,7 +28,7 @@ env:

  jobs:
  build:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-20.04-4core
  steps:
  - name: Clone repository and Sync
  run: |

Diff for: .github/workflows/release.yaml

+1 -1
@@ -106,7 +106,7 @@ jobs:
  gh workflow run odh-notebooks-sync.yml \
  --repo ${{ github.event.inputs.codeflare-repository-organization }}/codeflare-sdk \
  --ref ${{ github.ref }} \
- --field upstream-repository-organization=opendatahub-io
+ --field upstream-repository-organization=opendatahub-io \
  --field codeflare-repository-organization=${{ github.event.inputs.codeflare-repository-organization }} \
  --field codeflare_sdk_release_version=${{ github.event.inputs.release-version }}
  env:

Diff for: demo-notebooks/additional-demos/hf_interactive.ipynb

+8 -5
@@ -68,7 +68,7 @@
  "id": "bc27f84c",
  "metadata": {},
  "source": [
- "Here, we want to define our cluster by specifying the resources we require for our batch workload. Below, we define our cluster object (which generates a corresponding AppWrapper).\n",
+ "Here, we want to define our cluster by specifying the resources we require for our batch workload. Below, we define our cluster object (which generates a corresponding Ray Cluster).\n",
  "\n",
  "NOTE: We must specify the `image` which will be used in our RayCluster, we recommend you bring your own image which suits your purposes. \n",
  "The example here is a community image."
@@ -89,25 +89,28 @@
  }
  ],
  "source": [
- "# Create our cluster and submit appwrapper\n",
+ "# Create our cluster and submit\n",
+ "# The SDK will try to find the name of your default local queue based on the annotation \"kueue.x-k8s.io/default-queue\": \"true\" unless you specify the local queue manually below\n",
  "cluster = Cluster(ClusterConfiguration(name='hfgputest', \n",
- " namespace=\"default\",\n",
+ " namespace=\"default\", # Update to your namespace\n",
  " num_workers=1,\n",
  " min_cpus=8, \n",
  " max_cpus=8, \n",
  " min_memory=16, \n",
  " max_memory=16, \n",
  " num_gpus=4,\n",
  " image=\"quay.io/project-codeflare/ray:latest-py39-cu118\",\n",
- " instascale=True, machine_types=[\"m5.xlarge\", \"p3.8xlarge\"]))"
+ " write_to_file=False, # When enabled Ray Cluster yaml files are written to /HOME/.codeflare/resources \n",
+ " # local_queue=\"local-queue-name\" # Specify the local queue manually\n",
+ " ))"
  ]
  },
  {
  "cell_type": "markdown",
  "id": "12eef53c",
  "metadata": {},
  "source": [
- "Next, we want to bring our cluster up, so we call the `up()` function below to submit our cluster AppWrapper yaml onto the MCAD queue, and begin the process of obtaining our resource cluster."
+ "Next, we want to bring our cluster up, so we call the `up()` function below to submit our Ray Cluster onto the queue, and begin the process of obtaining our resource cluster."
  ]
  },
  {

Diff for: demo-notebooks/additional-demos/local_interactive.ipynb

+10 -11
@@ -48,13 +48,12 @@
  },
  "outputs": [],
  "source": [
- "# Create our cluster and submit appwrapper\n",
- "namespace = \"default\"\n",
+ "# Create and submit our cluster\n",
+ "# The SDK will try to find the name of your default local queue based on the annotation \"kueue.x-k8s.io/default-queue\": \"true\" unless you specify the local queue manually below\n",
+ "namespace = \"default\" # Update to your namespace\n",
  "cluster_name = \"hfgputest-1\"\n",
- "local_interactive = True\n",
  "\n",
- "cluster = Cluster(ClusterConfiguration(local_interactive=local_interactive,\n",
- " namespace=namespace,\n",
+ "cluster = Cluster(ClusterConfiguration(namespace=namespace,\n",
  " name=cluster_name,\n",
  " num_workers=1,\n",
  " min_cpus=1,\n",
@@ -63,8 +62,9 @@
  " max_memory=4,\n",
  " num_gpus=0,\n",
  " image=\"quay.io/project-codeflare/ray:latest-py39-cu118\",\n",
- " instascale=False,\n",
- " machine_types=[\"m5.xlarge\", \"p3.8xlarge\"]))"
+ " write_to_file=False, # When enabled Ray Cluster yaml files are written to /HOME/.codeflare/resources \n",
+ " # local_queue=\"local-queue-name\" # Specify the local queue manually\n",
+ " ))"
  ]
  },
  {
@@ -117,9 +117,8 @@
  "source": [
  "from codeflare_sdk import generate_cert\n",
  "\n",
- "if local_interactive:\n",
- " generate_cert.generate_tls_cert(cluster_name, namespace)\n",
- " generate_cert.export_env(cluster_name, namespace)"
+ "generate_cert.generate_tls_cert(cluster_name, namespace)\n",
+ "generate_cert.export_env(cluster_name, namespace)"
  ]
  },
  {
@@ -339,7 +338,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.8.17"
+ "version": "3.9.18"
  },
  "vscode": {
  "interpreter": {

Diff for: demo-notebooks/guided-demos/2_job_client.ipynb renamed to demo-notebooks/additional-demos/ray_job_client.ipynb

+5 -4
@@ -4,7 +4,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "In this third demo we will go over the basics of the Ray Job Submission Client in the SDK"
+ "In this demo we will go over the basics of the RayJobClient in the SDK"
  ]
  },
  {
@@ -26,7 +26,6 @@
  "# Create authentication object for user permissions\n",
  "# IF unused, SDK will automatically check for default kubeconfig, then in-cluster config\n",
  "# KubeConfigFileAuthentication can also be used to specify kubeconfig path manually\n",
- "\n",
  "auth_token = \"XXXXX\" # The auth_token is used later for the RayJobClient\n",
  "auth = TokenAuthentication(\n",
  " token = auth_token,\n",
@@ -43,16 +42,18 @@
  "outputs": [],
  "source": [
  "# Create and configure our cluster object\n",
+ "# The SDK will try to find the name of your default local queue based on the annotation \"kueue.x-k8s.io/default-queue\": \"true\" unless you specify the local queue manually below\n",
  "cluster = Cluster(ClusterConfiguration(\n",
  " name='jobtest',\n",
- " namespace='default',\n",
+ " namespace='default', # Update to your namespace\n",
  " num_workers=2,\n",
  " min_cpus=1,\n",
  " max_cpus=1,\n",
  " min_memory=4,\n",
  " max_memory=4,\n",
  " num_gpus=0,\n",
- " image=\"quay.io/project-codeflare/ray:latest-py39-cu118\"\n",
+ " image=\"quay.io/project-codeflare/ray:latest-py39-cu118\",\n",
+ " write_to_file=False # When enabled Ray Cluster yaml files are written to /HOME/.codeflare/resources \n",
  "))"
  ]
  },

Diff for: demo-notebooks/guided-demos/0_basic_ray.ipynb

+8 -6
@@ -5,7 +5,7 @@
  "id": "8d4a42f6",
  "metadata": {},
  "source": [
- "In this first notebook, we will go through the basics of using the SDK to:\n",
+ "In this notebook, we will go through the basics of using the SDK to:\n",
  " - Spin up a Ray cluster with our desired resources\n",
  " - View the status and specs of our Ray cluster\n",
  " - Take down the Ray cluster when finished"
@@ -45,7 +45,7 @@
  "id": "bc27f84c",
  "metadata": {},
  "source": [
- "Here, we want to define our cluster by specifying the resources we require for our batch workload. Below, we define our cluster object (which generates a corresponding AppWrapper).\n",
+ "Here, we want to define our cluster by specifying the resources we require for our batch workload. Below, we define our cluster object (which generates a corresponding RayCluster).\n",
  "\n",
  "NOTE: We must specify the `image` which will be used in our RayCluster, we recommend you bring your own image which suits your purposes. \n",
  "The example here is a community image."
@@ -58,18 +58,20 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# Create and configure our cluster object (and appwrapper)\n",
+ "# Create and configure our cluster object\n",
+ "# The SDK will try to find the name of your default local queue based on the annotation \"kueue.x-k8s.io/default-queue\": \"true\" unless you specify the local queue manually below\n",
  "cluster = Cluster(ClusterConfiguration(\n",
  " name='raytest',\n",
- " namespace='default',\n",
+ " namespace='default', # Update to your namespace\n",
  " num_workers=2,\n",
  " min_cpus=1,\n",
  " max_cpus=1,\n",
  " min_memory=4,\n",
  " max_memory=4,\n",
  " num_gpus=0,\n",
  " image=\"quay.io/project-codeflare/ray:latest-py39-cu118\",\n",
- " instascale=False\n",
+ " write_to_file=False, # When enabled Ray Cluster yaml files are written to /HOME/.codeflare/resources \n",
+ " # local_queue=\"local-queue-name\" # Specify the local queue manually\n",
  "))"
  ]
  },
@@ -78,7 +80,7 @@
  "id": "12eef53c",
  "metadata": {},
  "source": [
- "Next, we want to bring our cluster up, so we call the `up()` function below to submit our cluster AppWrapper yaml onto the MCAD queue, and begin the process of obtaining our resource cluster."
+ "Next, we want to bring our cluster up, so we call the `up()` function below to submit our Ray Cluster onto the queue, and begin the process of obtaining our resource cluster."
  ]
  },
  {
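
Note on the default local queue comment added to the notebooks above: the comment says the SDK looks for a local queue annotated with "kueue.x-k8s.io/default-queue": "true" unless `local_queue` is set manually. A minimal sketch of that selection rule follows. It is illustrative only and not the SDK's actual implementation: `pick_local_queue` and the dict shape of the queue objects are hypothetical, and only the annotation key and the manual override come from the diff.

```python
from typing import Any, Dict, List, Optional

# Annotation key quoted in the notebook comments of this commit.
DEFAULT_QUEUE_ANNOTATION = "kueue.x-k8s.io/default-queue"


def pick_local_queue(
    local_queues: List[Dict[str, Any]],
    explicit_name: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: choose which local queue a cluster is submitted to.

    An explicitly supplied name wins; otherwise the first queue carrying the
    default-queue annotation set to "true" is used. Returns None if no queue
    qualifies.
    """
    if explicit_name is not None:
        return explicit_name
    for queue in local_queues:
        metadata = queue.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if annotations.get(DEFAULT_QUEUE_ANNOTATION) == "true":
            return metadata.get("name")
    return None


# Assumed shape: LocalQueue objects as plain dicts, as returned by a
# Kubernetes API list call.
queues = [
    {"metadata": {"name": "team-a"}},
    {"metadata": {"name": "team-b",
                  "annotations": {DEFAULT_QUEUE_ANNOTATION: "true"}}},
]

chosen = pick_local_queue(queues)
manual = pick_local_queue(queues, explicit_name="local-queue-name")
```

Here `chosen` resolves to the annotated queue, while `manual` mirrors the commented-out `local_queue="local-queue-name"` argument in the notebooks, which bypasses the annotation lookup.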
