|
| 1 | +# Assisted Installer Testing |
| 2 | + |
| 3 | +The assisted-installer tests are divided into 3 categories: |
| 4 | + |
| 5 | +* **Unit tests** - Focused on a module/function level while other modules are mocked. |
| 6 | +Unit tests live in the same package as the code they test and follow the naming pattern `<module_name>_test.go`. |
| 7 | + |
| 8 | +* **Subsystem tests** - Focused on a single component while the other components are mocked. |
| 9 | +For example, assisted-service subsystem tests mock the agent responses. |
| 10 | + |
| 11 | +* **System tests** (a.k.a. e2e) - Running full flows with all components. |
| 12 | +The e2e tests are divided into u/s (upstream) basic workflows on [assisted-test-infra](https://github.com/openshift/assisted-test-infra/tree/master/discovery-infra/tests) and d/s (downstream) extended regression tests maintained by both DEV and QE teams on [kni-assisted-installer-auto](https://gitlab.cee.redhat.com/ocp-edge-qe/kni-assisted-installer-auto/-/tree/master/api_tests). |
| 13 | + |
| 14 | +## Repository CI |
| 15 | + |
| 16 | +Our CI jobs are currently managed and run by two CI tools: a Jenkins instance hosted at <http://assisted-jenkins.usersys.redhat.com> and a Prow instance hosted at <https://prow.ci.openshift.org>. |
| 17 | + |
| 18 | +| [Jenkins](http://assisted-jenkins.usersys.redhat.com) | [Prow](https://prow.ci.openshift.org) | |
| 19 | +|---|---| |
| 20 | +| Local for Assisted ecosystem | Company-wide | |
| 21 | +| Checks comments for JIRA | Runs e2e | |
| 22 | +| Manages images in quay.io/ocpmetal | Runs all testing checks (lint, unit, etc.) | |
| 23 | + |
| 24 | +Assisted-service CI jobs are defined in the [openshift/release](https://github.com/openshift/release) repository, in [openshift-assisted-service-master.yaml](https://github.com/openshift/release/blob/master/ci-operator/config/openshift/assisted-service/openshift-assisted-service-master.yaml). |
| 25 | +Read more about the OpenShift CI infrastructure in the [OpenShift CI Docs](https://docs.ci.openshift.org/docs/). |
| 26 | + |
| 27 | +All the currently available jobs for the openshift/assisted-service repository can be viewed in the [OpenShift CI Step Registry](https://steps.ci.openshift.org/search?job=openshift-assisted-service). |
| 28 | + |
| 29 | +### Adding a new CI job |
| 30 | + |
| 31 | +When adding a new job, the following rules of thumb should be taken into account: |
| 32 | + |
| 33 | +* Test logic needs to be maintained in the repository under test and not under openshift/release. |
| 34 | +This allows easier integration with other tools, less dependency on the CI infrastructure, and, most importantly, the ability to run the tests locally. |
| 35 | + |
| 36 | +* A new job should be introduced as both a presubmit job and a periodic job. The presubmit job needs to be available so that contributors can run it on their PRs before merging. |
| 37 | + |
| 38 | +  The presubmit job needs to be configured as `always_run: false` and `optional: true` (not blocking a merge) until it has proven stable. |
| 39 | +  New OCP releases might break one of the Assisted workflows, since Assisted isn't part of OCP. |
| 40 | + |
| 41 | +  The periodic job needs to run frequently (e.g. daily) and have a `reporter_config` configured, so that a notification is sent on Slack whenever there's a breakage. |
| 42 | + |
| 43 | +* If the new job affects multiple repositories, each of them should have the same presubmit job so that it can be tested on every component change. |
| 44 | +For example, the `e2e-metal-assisted-olm` job is defined on several different repositories, as shown in this [link](https://steps.ci.openshift.org/search?job=e2e-metal-assisted-olm). |
| 45 | + |
| 46 | +[An example of a PR adding a new job](https://github.com/openshift/release/pull/21604) |
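
A minimal sketch of what the presubmit and periodic settings described above could look like in Prow job configuration. The `always_run`, `optional`, `cron` and `reporter_config` fields are standard Prow job fields. The job names, schedule and Slack channel are hypothetical placeholders, and the rest of a real job definition (spec, labels, and so on) is omitted.

```yaml
# Illustrative fragment only, not a complete job definition.
presubmits:
  openshift/assisted-service:
  - name: pull-ci-openshift-assisted-service-master-e2e-example  # hypothetical job name
    always_run: false  # not triggered automatically on every PR
    optional: true     # a failure does not block merging
periodics:
- name: periodic-ci-openshift-assisted-service-master-e2e-example  # hypothetical job name
  cron: 0 0 * * *      # daily at midnight; assumed schedule
  reporter_config:
    slack:
      channel: '#example-channel'  # hypothetical channel
      job_states_to_report:
      - failure
      - error
      report_template: 'Job {{.Spec.Job}} ended with {{.Status.State}}. <{{.Status.URL}}|View logs>'
```

Once the job has proven stable, the `optional` and `always_run` values are the ones to revisit. How these entries are generated or maintained inside openshift/release is not covered by this sketch.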
| 47 | + |
| 48 | +### FAQ |
| 49 | + |
| 50 | +#### **How can I debug CI failures?** |
| 51 | + |
| 52 | +A CI job can be debugged only while it is running. |
| 53 | +Once a job terminates, it can no longer be debugged because the cluster / machines used to run it are torn down at the end of the job. |
| 54 | + |
| 55 | +However, each job produces artifacts (such as logs, SOS reports, and must-gather logs) which can be used to analyze what went wrong in retrospect. Those artifacts can be accessed from the job's artifacts page. |
| 56 | + |
| 57 | +To connect to the OCP cluster running your Prow tests, you can follow the [OpenShift CI doc "Interact With Running Jobs"](https://docs.ci.openshift.org/docs/how-tos/interact-with-running-jobs/) guide or try the experimental [Debug Prow Jobs live](https://gist.github.com/omertuc/1ef4bdf22f0fedfbde46cf1feb149bb9) gist. Contributions are welcome. :) |
| 58 | + |
| 59 | +#### **If a CI job fails, where should I look for assisted-related failures?** |
| 60 | + |
| 61 | +When a PR job fails, there's a "details" button next to the GitHub context. It opens the [Spyglass](https://github.com/kubernetes/test-infra/tree/master/prow/spyglass) view. From there, use the "Job History" button to check whether other recent builds of the same job failed as well. |
| 62 | + |
| 63 | +#### **When is it ok to retest?** |
| 64 | + |
| 65 | +Whenever there's a failure, look for its root cause first, before using the "/retest" command. |
| 66 | +The command should only be used when there's a known flaky issue. |
| 67 | +Retesting for no reason just wastes the project's CI resources and money. |
| 68 | + |
| 69 | +#### **How does the retest bot work?** |
| 70 | + |
| 71 | +When a PR is ready to be merged (approved and not held), all the jobs are retested whenever master is updated. If any of the jobs fails, the openshift-bot will try to retest it automatically. |
| 72 | +The retest job is defined under [infra-periodics.yaml](https://github.com/openshift/release/blob/c121e55f68fb37af41d7cd16877eaa79eeb972f1/ci-operator/jobs/infra-periodics.yaml#L202-L241). |
| 73 | + |
| 74 | +#### **Where do these jobs run?** |
| 75 | + |
| 76 | +It depends on the job. |
| 77 | + |
| 78 | +* Single-stage tests (e.g. lint, unit tests) run inside a scheduled container. [Read more](https://docs.ci.openshift.org/docs/architecture/ci-operator/#declaring-tests) |
| 79 | +* Jobs that require a cluster (e.g. subsystem) run on an OCP cluster claimed from a pool of hibernated clusters. |
| 80 | +[Read more](https://docs.ci.openshift.org/docs/architecture/ci-operator/#testing-with-a-cluster-from-a-cluster-pool) |
| 81 | +* Baremetal jobs (i.e. e2e) run on a baremetal machine provisioned by [Equinix](https://www.equinix.nl/). |
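
As a rough illustration of the first two cases above, a ci-operator configuration could declare a container test and a cluster-pool test along these lines. The field names (`as`, `commands`, `container`, `cluster_claim`, `steps`) follow the ci-operator documentation linked above. The test names, make target, workflow name and claim values are assumptions made for this example, not taken from the actual assisted-service configuration.

```yaml
# Illustrative ci-operator fragment; names and values are placeholders.
tests:
- as: unit                      # single-stage test, runs in a container
  commands: make unit-test      # hypothetical make target
  container:
    from: src                   # image that contains the cloned repository
- as: subsystem-example         # hypothetical test name
  cluster_claim:                # claims an OCP cluster from a cluster pool
    architecture: amd64
    cloud: aws
    owner: openshift-ci
    product: ocp
    timeout: 1h0m0s
    version: "4.8"              # assumed OCP version
  steps:
    workflow: example-workflow  # hypothetical workflow name
```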
| 82 | + |
| 83 | +## How to run Assisted-service subsystem tests |
| 84 | + |
| 85 | +More information is available here: [Assisted Installer Testing](docs/dev/testing.md). |