Create OLM upgrade e2e scenario using codeflare SDK #286

Srihari1192 · 2023-09-14T07:56:04Z

Issue link

#184

What changes have been made

Added TestMNISTRayClusterUp and TestMnistJobSubmit to the OLM upgrade test, so the test runs before and after an operator upgrade
Added the methods CreateTestNamespaceWithName and DeleteTestNamespace to the namespace support class for the OLM upgrade tests

Verification steps

Checks

I've made sure the tests are passing.
Testing Strategy
- Unit tests
- Manual tests
- Testing is not required for this change

test/e2e/mnist_rayjob.py

test/e2e/olm_upgrade_test.go

Srihari1192 · 2023-09-21T07:12:15Z

@sutaakar The OLM tests are failing due to a lack of resources in the KinD cluster, although they pass locally. I think we should enable these tests only when large runners are available.

sutaakar · 2023-09-21T08:26:32Z

@Srihari1192 Last part of the log:

[notice] A new release of pip available: 22.3 -> 23.2.1
[notice] To update, run: pip install --upgrade pip
Written to: mnist.yaml
╭──────────────────────╮
│   🚀 Cluster Queue   │
│      Status 🚀       │
│ +-------+----------+ │
│ | Name  | Status   | │
│ +=======+==========+ │
│ | mnist | queueing | │
│ |       |          | │
│ +-------+----------+ │
╰──────────────────────╯
Waiting for requested resources to be set up...
No instances found, nothing to be done.
Traceback (most recent call last):
  File "raycluster_sdk.py", line 28, in <module>
    cluster.wait_ready()
  File "/opt/app-root/lib64/python3.8/site-packages/codeflare_sdk/cluster/cluster.py", line 273, in wait_ready
    dashboard_ready = self.is_dashboard_ready()
  File "/opt/app-root/lib64/python3.8/site-packages/codeflare_sdk/cluster/cluster.py", line 255, in is_dashboard_ready
    response = requests.get(self.cluster_dashboard_uri(), timeout=5)
  File "/opt/app-root/lib64/python3.8/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/requests/sessions.py", line 573, in request
    prep = self.prepare_request(req)
  File "/opt/app-root/lib64/python3.8/site-packages/requests/sessions.py", line 484, in prepare_request
    p.prepare(
  File "/opt/app-root/lib64/python3.8/site-packages/requests/models.py", line 368, in prepare
    self.prepare_url(url, params)
  File "/opt/app-root/lib64/python3.8/site-packages/requests/models.py", line 439, in prepare_url
    raise MissingSchema(
requests.exceptions.MissingSchema: Invalid URL 'None': No scheme supplied. Perhaps you meant http://None?

Can you confirm whether SDK supports using Ingress? If so, can you check that Ingress is properly created when using KinD with Ingress installed in the test setup?

Srihari1192 · 2023-09-21T09:37:32Z

Sure

Srihari1192 · 2023-09-21T10:02:26Z

@sutaakar The SDK does not support Ingress yet. The implementation is in progress in project-codeflare/codeflare-sdk#251

test/e2e/mnist_raycluster_sdk.py

test/e2e/olm_upgrade_test.go

sutaakar · 2023-09-25T12:25:40Z

test/e2e/olm_upgrade_test.go

+
+	test := With(t)
+	test.T().Parallel()
+	if os.Getenv("RUN_OLM_TESTS") != "true" {


It may be better to use build tags - https://stackoverflow.com/questions/54165975/go-test-only-run-tests-that-contain-a-build-tag
That way you can specify the OLM upgrade tests when invoking the tests, i.e. go test -tags olm_upgrade_test ...

The disadvantage of this approach is that the file is not compiled unless the tag is enabled.
I'm wondering whether it may be better to just remove the condition and specify which tests to run in the Makefile, with a separate command there to run the upgrade tests.
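
For reference, the environment-variable guard under discussion can be sketched roughly as below. This is a minimal sketch, not the PR's code: the helper name olmTestsEnabled and the injected lookup function are hypothetical, and the thread does not show what the condition does when it is true (a t.Skip call is assumed).

```go
package main

import (
	"fmt"
	"os"
)

// olmTestsEnabled reports whether the upgrade tests should run, based on the
// RUN_OLM_TESTS variable from the diff above. The lookup function is injected
// (a hypothetical refactoring) so the check can be exercised without changing
// the real environment.
func olmTestsEnabled(getenv func(string) string) bool {
	return getenv("RUN_OLM_TESTS") == "true"
}

func main() {
	// In a test, a false result would typically lead to t.Skip(...).
	// That is an assumption; the body of the condition is not shown in the thread.
	fmt.Println("run OLM upgrade tests:", olmTestsEnabled(os.Getenv))
}
```

With the build-tag alternative, the test file would instead start with a //go:build olm_upgrade_test line and be selected via go test -tags olm_upgrade_test, so no runtime check would be needed.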

Yes. Also, if we add build tags, we need to adjust all the e2e tests with a build tag so that these tests don't run as part of e2e.

@sutaakar We could probably use test grouping for the e2e test run, e.g. go test -timeout 30m -v ./test/e2e -run "^TestMNIST.*$", since all our e2e tests start with TestMNIST, and rename the OLM upgrade tests to TestOLMUpgradeRayClusterUp and TestOLMUpgradeMnistJobSubmit. That way we can remove the condition and call these tests specifically in our workflows.
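
To illustrate how name-based grouping would behave, here is a small sketch that applies a -run style pattern to a list of test names with Go's regexp package, which is roughly what go test does for top-level tests. The helper selectTests is hypothetical, and TestMNISTPyTorchMCAD is only an illustrative e2e test name; the other names come from this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectTests returns the names matched by a -run style pattern.
// The pattern is treated as an unanchored, case-sensitive regular expression.
func selectTests(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var selected []string
	for _, name := range names {
		if re.MatchString(name) {
			selected = append(selected, name)
		}
	}
	return selected, nil
}

func main() {
	names := []string{
		"TestMNISTPyTorchMCAD", // illustrative e2e test name (assumption)
		"TestMNISTRayClusterUp",
		"TestMnistJobSubmit",
		"TestOLMUpgradeRayClusterUp",
		"TestOLMUpgradeMnistJobSubmit",
	}
	for _, pattern := range []string{"^TestMNIST.*$", "^TestOLMUpgrade"} {
		selected, err := selectTests(pattern, names)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %v\n", pattern, selected)
	}
}
```

Because the match is case-sensitive, a name spelled TestMnistJobSubmit is not selected by "^TestMNIST.*$", which is one more reason to give the upgrade tests a distinct prefix.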

Or maybe this test can be moved to dedicated folder, i.e. test/upgrade.

Sure, I will go with this approach.

Moved the tests to the folder test/upgrade. I kept the files the tests depend on in test/e2e, as the ReadFile method expects files to be in the same package.

I'm wondering whether it would make sense to copy the method (it doesn't have to be exported) to the upgrade package.

Okay, I left it the same as the existing code.

test/e2e/olm_upgrade_test.go

test/support/namespace.go

test/upgrade/olm_upgrade_test.go

sutaakar

/lgtm

sutaakar · 2023-12-07T13:24:38Z

@astefanutti do you have any feedback for this PR, or should we merge it?

astefanutti · 2023-12-07T14:13:46Z

test/upgrade/olm_upgrade_test.go

+	defer func() {
+		if t.Failed() {
+			DeleteTestNamespace(test, namespace)
+		} else {
+			StoreNamespaceLogs(test, namespace)
+		}
+	}()


@Srihari1192 @sutaakar out of curiosity, why not use the "standard" way, where the test support does that automatically?

@astefanutti In this upgrade context, we use the same namespace in the test that runs after the operator upgrade. Since NewTestNamespace deletes the namespace by default once the test completes, we added these support methods in codeflare-common.

@Srihari1192 Thanks, that's clear now.

Two things I would suggest we lean on in the future:

Rely on the options argument of the NewTestNamespace method, to provide the name or prevent deletion for example, instead of creating ad-hoc methods. Options can be combined freely.

The test logic seems fragmented between the GH Actions workflow, and the Go tests. It may be better to implement the upgrade as part of the Go test, so it's not necessary to deal with namespace deletion and run tests by name.

It may be better to implement the upgrade as part of the Go test

This will couple the test with a specific upgrade strategy (using OLM, overriding the existing deployment with a new oneliner, upgrading using ODH). Personally, I would prefer to keep the test implementation separate from deployment/upgrade, so the test stays reusable for any strategy.

It makes sense. Sounds good 👍🏼.
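
The options-based approach suggested earlier in this thread could look roughly like the sketch below. It is a self-contained illustration of the functional-options pattern only: namespaceConfig, WithName, KeepAfterTest and the namespace name are hypothetical and are not the codeflare-common API.

```go
package main

import "fmt"

// namespaceConfig is a hypothetical stand-in for the settings a helper such
// as NewTestNamespace might accept.
type namespaceConfig struct {
	name          string // empty means "let the helper generate a name"
	deleteOnClose bool   // whether the namespace is deleted when the test ends
}

// Option mutates the configuration; options can be combined in any order.
type Option func(*namespaceConfig)

// WithName requests a fixed namespace name (hypothetical option).
func WithName(name string) Option {
	return func(c *namespaceConfig) { c.name = name }
}

// KeepAfterTest prevents deletion at the end of the test (hypothetical option).
func KeepAfterTest() Option {
	return func(c *namespaceConfig) { c.deleteOnClose = false }
}

// newNamespaceConfig applies the options on top of the defaults.
func newNamespaceConfig(opts ...Option) namespaceConfig {
	cfg := namespaceConfig{deleteOnClose: true}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	// Before the upgrade: fixed name, keep the namespace for the next test.
	before := newNamespaceConfig(WithName("test-ns-olm"), KeepAfterTest())
	// After the upgrade: same name, default cleanup.
	after := newNamespaceConfig(WithName("test-ns-olm"))
	fmt.Printf("%+v\n%+v\n", before, after)
}
```

The point of the pattern is that a single constructor covers both the pre-upgrade and post-upgrade cases, rather than one ad-hoc method per combination.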

astefanutti · 2023-12-08T08:43:18Z

/lgtm

astefanutti · 2023-12-08T08:43:23Z

/approve

openshift-ci · 2023-12-08T08:43:29Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astefanutti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [astefanutti]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the do-not-merge/work-in-progress label Sep 14, 2023

Srihari1192 requested a review from sutaakar September 14, 2023 07:56

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from a02add7 to 851164e Compare September 15, 2023 07:55

sutaakar reviewed Sep 15, 2023

test/e2e/mnist_rayjob.py Outdated

sutaakar reviewed Sep 15, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 15, 2023

test/e2e/olm_upgrade_test.go Outdated

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from a122ebf to 182a440 Compare September 21, 2023 06:43

Srihari1192 marked this pull request as ready for review September 21, 2023 07:04

openshift-ci bot removed the do-not-merge/work-in-progress label Sep 21, 2023

openshift-ci bot requested review from astefanutti and dimakis September 21, 2023 07:04

Srihari1192 requested a review from sutaakar September 21, 2023 07:05

sutaakar reviewed Sep 21, 2023

test/e2e/mnist_raycluster_sdk.py Outdated

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from 182a440 to a1432a2 Compare September 21, 2023 12:01

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/e2e/olm_upgrade_test.go Outdated

sutaakar reviewed Sep 25, 2023

test/support/namespace.go Outdated

openshift-merge-robot added the needs-rebase label Sep 27, 2023

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from dfb97dd to a1432a2 Compare September 27, 2023 14:29

openshift-merge-robot removed the needs-rebase label Sep 27, 2023

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from 4ee49db to b54e5b5 Compare September 28, 2023 07:19

sutaakar reviewed Oct 5, 2023

test/upgrade/olm_upgrade_test.go Outdated

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from d94ed8f to 1a80a6c Compare October 5, 2023 12:54

Srihari1192 marked this pull request as draft October 6, 2023 09:01

openshift-ci bot added the do-not-merge/work-in-progress label Oct 6, 2023

openshift-merge-robot added the needs-rebase label Oct 26, 2023

Srihari1192 mentioned this pull request Nov 22, 2023

Add Namespace support functions project-codeflare/codeflare-common#16

Merged

4 tasks

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from 1a80a6c to 1bda26e Compare November 23, 2023 06:45

openshift-merge-robot removed the needs-rebase label Nov 23, 2023

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from 1bda26e to 5cdcb30 Compare November 23, 2023 08:00

Srihari1192 added 3 commits December 4, 2023 13:29

Create OLM upgrade e2e scenario using codeflare SDK

011cc01

Create OLM upgrade e2e scenario using codeflare SDK

c70f257

rebase and resolving conflicts

25d6fa2

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch 2 times, most recently from e7d10ed to 13e88a9 Compare December 4, 2023 11:15

Create e2e scenario covering upgrade during training

f967729

Srihari1192 force-pushed the olm-upgrade-e2e-184 branch from 13e88a9 to f967729 Compare December 4, 2023 11:56

Srihari1192 marked this pull request as ready for review December 5, 2023 12:52

openshift-ci bot removed the do-not-merge/work-in-progress label Dec 5, 2023

openshift-ci bot requested a review from sutaakar December 5, 2023 12:52

sutaakar reviewed Dec 6, 2023

openshift-ci bot assigned sutaakar Dec 6, 2023

openshift-ci bot added the lgtm label Dec 6, 2023

astefanutti reviewed Dec 7, 2023

openshift-ci bot assigned astefanutti Dec 8, 2023

openshift-ci bot added the approved label Dec 8, 2023

openshift-merge-bot bot merged commit 0afa252 into project-codeflare:main Dec 8, 2023
