[installer-preview] add werft flags for triggering self-hosted preview #12229

Merged (1 commit) on Sep 7, 2022

nandajavarma
Contributor

@nandajavarma nandajavarma commented Aug 19, 2022

Description

This PR adds a werft flag to create self-hosted preview environments from PRs. Users will be able to post the following comment on a PR, and a werft job will be triggered from the build job (with status written back to GitHub) to create a preview environment:

/werft run with-sh-preview

The above command will do the following:

  1. Publish the branch to KOTS
  2. Create a replicated license for the KOTS channel created in the previous step
  3. Start a self-hosted preview on a newly created k3s cluster (we plan to work on reducing waste here)
  4. Delete the created license
  5. The URL of the preview setup will be available under the Results tab of the werft job
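
The ordering of these steps can be sketched as a small bash function. This is only an illustration under stated assumptions: the four step functions are hypothetical placeholders that print what the real job would do, and the use of a `trap` to guarantee license deletion is a design choice of this sketch, not something the PR description confirms.

```shell
#!/usr/bin/env bash
# All four step functions are hypothetical placeholders that only print
# what the real job would do; they are not part of werft or this PR.
publish_to_kots() { echo "publish branch to KOTS"; }
create_license()  { echo "create Replicated license"; }
start_preview()   { echo "start self-hosted preview on $1"; }
delete_license()  { echo "delete Replicated license"; }

run_sh_preview() {
  local cluster="${1:-k3s}"   # k3s is the default cluster in the description
  (
    publish_to_kots || exit 1
    create_license  || exit 1
    # Once a license exists, remove it when this subshell exits,
    # whether the preview step succeeded or failed.
    trap delete_license EXIT
    start_preview "$cluster" || exit 1
  )
}
```

Running the steps in a subshell keeps the `EXIT` trap scoped to one preview run, so the license cleanup does not leak into the calling shell.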

If the user wants to test on a cluster other than k3s, they can run:

werft run with-sh-preview cluster=aks # for Azure k8s cluster
werft run with-sh-preview cluster=eks # for AWS k8s cluster
werft run with-sh-preview cluster=gke # for GCP k8s cluster

The created cluster will be deleted in 10 hours.

Related Issue(s)

Fixes #7824

How to test

Since this change is not yet merged into main, commenting on the PR cannot trigger the builds. Instead, open a Gitpod workspace from this PR and run:

werft run github -j .werft/build.yaml -a with-sh-preview=true -a cluster=k3s
# or
werft run github -j .werft/build.yaml -a with-sh-preview=true -a cluster=aks
# or
werft run github -j .werft/build.yaml -a with-sh-preview=true -a cluster=eks
# or
werft run github -j .werft/build.yaml -a with-sh-preview=true -a cluster=gke
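
To avoid typos in the cluster name when testing by hand, the commands above could be wrapped in a small helper. The function name `sh_preview_cmd` is hypothetical; it only prints the command, taken verbatim from the list above, and rejects cluster types that the description does not mention.

```shell
#!/usr/bin/env bash
# sh_preview_cmd is a hypothetical helper: it prints (does not execute)
# the werft command for a given cluster type, defaulting to k3s.
sh_preview_cmd() {
  local cluster="${1:-k3s}"
  case "$cluster" in
    k3s|aks|eks|gke) ;;   # the four cluster types named in this PR
    *) echo "unsupported cluster: $cluster" >&2; return 1 ;;
  esac
  echo "werft run github -j .werft/build.yaml -a with-sh-preview=true -a cluster=$cluster"
}
```

The printed line can be reviewed first and then run with `eval "$(sh_preview_cmd eks)"` from a workspace opened on the PR branch.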

Release Notes

NONE

Documentation

Werft options:

  • /werft with-preview

@nandajavarma
Contributor Author

nandajavarma commented Aug 19, 2022

/werft run with-sh-preview cluster=k3s

👍 started the job as gitpod-build-nvn-sh-preview-flag.1
(with .werft/ from main)

@nandajavarma
Contributor Author

nandajavarma commented Aug 19, 2022

/werft run with-sh-preview cluster=k3s channel=unstable

👍 started the job as gitpod-build-nvn-sh-preview-flag.2
(with .werft/ from main)

@nandajavarma nandajavarma force-pushed the nvn/sh-preview-flag branch 3 times, most recently from 0d26880 to ee23ab0 Compare August 19, 2022 18:33
@roboquat roboquat added size/L and removed size/M labels Aug 20, 2022
@nandajavarma nandajavarma force-pushed the nvn/sh-preview-flag branch 2 times, most recently from a10a300 to 4a59015 Compare August 20, 2022 20:12
@meysholdt
Member

meysholdt commented Aug 22, 2022

hey @nandajavarma ! Thx for this PR - can you give more context on the goal behind it? is this a new way to run automated tests or do you intend to improve dev loops? I'm asking to understand if this is related to https://www.notion.so/gitpod/Gitpod-Preview-Environments-65606379c3724734b28115cc3e48a13c :)

@nandajavarma nandajavarma changed the title [installer-preview] add werft flags for trigger [installer-preview] add werft flags for triggering self-hosted preview Aug 22, 2022
@nandajavarma
Contributor Author

Hey @meysholdt! Yes, it is related to the RFC. This PR has two goals: 1) being able to start preview setups from PRs, to improve dev velocity, especially for the self-hosted team, and 2) a workflow to create and delete a Replicated license that can be used to try out newer changes.

Once this PR is merged, folks should be able to comment /werft run with-sh-preview to create a self-hosted preview setup on k3s, which will get cleaned up after 10 hours. The goal was to get it merged before release testing starts. A big part of #7824 will be addressed by this. Let me know if that sheds a bit more light on the PR.

@werft-gitpod-dev-com

started the job as gitpod-build-nvn-sh-preview-flag.12 because the annotations in the pull request description changed
(with .werft/ from main)

@nandajavarma nandajavarma marked this pull request as ready for review August 23, 2022 04:10
@nandajavarma nandajavarma requested a review from a team August 23, 2022 04:10
@github-actions github-actions bot added the team: delivery Issue belongs to the self-hosted team label Aug 23, 2022
@adrienthebo
Contributor

adrienthebo commented Aug 23, 2022

/werft run with-sh-preview cluster=k3s

👍 started the job as gitpod-build-nvn-sh-preview-flag.14
(with .werft/ from main)

@nandajavarma
Contributor Author

@adrienthebo since this is not part of main yet, triggering the flags via a comment doesn't work yet. You will have to follow the How to test section to trigger them.

Contributor

@adrienthebo adrienthebo left a comment

I believe I've found an issue - though it might be outside the jurisdiction of this pull request. The errors are as follows (from gitpod-self-hosted-installer-tests-head.77):

k3s

Error: Activated service account credentials for: [[email protected]]
Fetching cluster endpoint and auth data.
ERROR: (gcloud.container.clusters.get-credentials) ResponseError: code=404, message=Not found: projects/sh-automated-tests/zones/europe-west1-d/clusters/gp-nvn-sh-prev-k3s.
No cluster named 'gp-nvn-sh-prev-k3s' in sh-automated-tests.
Activated service account credentials for: [[email protected]]
CommandException: No URLs matched: gs://nightly-tests/tf-state/nvn-sh-prev-k3s-kubeconfig
╷
│ Error: Error, failed to create instance sql-nvn-sh-prev-k3s: googleapi: Error 409: The Cloud SQL instance already exists. When you delete an instance, you can't reuse the name of the deleted instance until one week from the deletion date., instanceAlreadyExists
│
│   with module.k3s.google_sql_database_instance.gitpod,
│   on ../infra/modules/k3s/main.tf line 175, in resource "google_sql_database_instance" "gitpod":
│  175: resource "google_sql_database_instance" "gitpod" {
│
╵
make: *** [Makefile:129: k3s-standard-cluster] Error 1

Trying to send slack alert
tracing warning: No slice span by name create-gcp-infra
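
The 409 above says a deleted Cloud SQL instance name cannot be reused for a week, so re-running a preview with the same test ID collides with the earlier instance. One possible workaround, sketched here under assumptions, is to add a short random suffix to the instance name. The helper name `unique_sql_name` is hypothetical and is not taken from the repository or from the fixups referenced in this thread.

```shell
#!/usr/bin/env bash
# unique_sql_name is a hypothetical helper. It appends a 4-character
# lowercase hex suffix to a base name so that a re-run does not collide
# with a recently deleted Cloud SQL instance of the same name.
unique_sql_name() {
  local base="$1"
  # $RANDOM is a bash built-in in the range 0..32767; %04x renders it as
  # four lowercase hex digits. This is a sketch, not a strong uniqueness guarantee.
  printf '%s-%04x\n' "$base" "$RANDOM"
}
```

For example, `unique_sql_name sql-nvn-sh-prev-k3s` yields the base name followed by a hyphen and four hex digits; the suffix would also need to be stored with the Terraform state so that cleanup can find the instance.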

aks

AKS is failing; again, this looks unrelated.

Error: ╷
│ Error: creating Cluster: (Managed Cluster Name "gitpod-test-nor-primary-nvnshprevaks" / Resource Group "gitpod-test-nor-nvnshprevaks"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="UnsupportedAliasMinorVersion" Message="1.21 is not a supported Kubernetes version. To see our supported AKS version list please visit https://docs.microsoft.com/azure/aks/supported-kubernetes-versions?tabs=azure-cli for more details."
│
│   with module.aks.azurerm_kubernetes_cluster.k8s,
│   on ../infra/modules/aks/kubernetes.tf line 17, in resource "azurerm_kubernetes_cluster" "k8s":
│   17: resource "azurerm_kubernetes_cluster" "k8s" {
│
╵
make: *** [Makefile:97: aks-standard-cluster] Error 1

Trying to send slack alert
tracing warning: No slice span by name create-azure-infra

eks

Error: ╷
│ Error: expected length of name_prefix to be in the range (1 - 38), got service-nvn-sh-prev-eks-eks-node-group-
│
│   with module.eks.module.eks.module.eks_managed_node_group["Services"].aws_iam_role.this[0],
│   on .terraform/modules/eks.eks/modules/eks-managed-node-group/main.tf line 426, in resource "aws_iam_role" "this":
│  426:   name_prefix = var.iam_role_use_name_prefix ? "${local.iam_role_name}-" : null
│
╵
make: *** [Makefile:89: eks-standard-cluster] Error 1

Trying to send slack alert
tracing warning: No slice span by name create-aws-infra
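
The EKS failure is a length check: the generated `name_prefix` must be between 1 and 38 characters, and the value in the log is one character too long. A pre-flight check along these lines could catch it before Terraform runs. The helper name `check_name_prefix` is hypothetical; the limit is the one quoted in the error message.

```shell
#!/usr/bin/env bash
# Limit taken from the Terraform error above: name_prefix must be 1-38 chars.
MAX_NAME_PREFIX_LEN=38

# check_name_prefix is a hypothetical pre-flight helper. It returns 0 when
# the prefix length is within the allowed range and 1 otherwise.
check_name_prefix() {
  local prefix="$1"
  local len="${#prefix}"
  if [ "$len" -lt 1 ] || [ "$len" -gt "$MAX_NAME_PREFIX_LEN" ]; then
    echo "name_prefix '$prefix' has length $len, expected 1-$MAX_NAME_PREFIX_LEN" >&2
    return 1
  fi
}
```

Called with `service-nvn-sh-prev-eks-eks-node-group-` (39 characters), the check fails, which suggests that shortening the test ID or the node group name is enough to get under the limit.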

GKE

Error: ╷
│ Error: Error, failed to create instance sql-gp-nvn-sh-prev-gke-dBnz: googleapi: Error 400: Invalid request: instance name (sql-gp-nvn-sh-prev-gke-dBnz)., invalid
│
│   with module.gke.google_sql_database_instance.gitpod[0],
│   on ../infra/modules/gke/main.tf line 160, in resource "google_sql_database_instance" "gitpod":
│  160: resource "google_sql_database_instance" "gitpod" {
│
╵
make: *** [Makefile:68: gke-standard-cluster] Error 1

Trying to send slack alert
tracing warning: No slice span by name create-gcp-infra
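
The log does not say why the instance name is invalid. One plausible cause, stated here as an assumption, is the mixed-case suffix `dBnz`: Cloud SQL instance IDs are documented as lowercase letters, digits, and hyphens. A minimal sketch that normalizes a candidate name before it is passed to Terraform (the helper name `to_sql_instance_name` is hypothetical):

```shell
#!/usr/bin/env bash
# to_sql_instance_name is a hypothetical helper: it lowercases a candidate
# instance name so that a mixed-case random suffix cannot make it invalid.
to_sql_instance_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}
```

Generating the random suffix from a lowercase-only alphabet in the first place would avoid the need for this step.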

Contributor

@adrienthebo adrienthebo left a comment

As noted in an earlier comment, we have test failures on all platforms that prevent this from working reliably out of the gate. I've opened https://github.com/gitpod-io/gitpod/pull/12321/files targeting this branch with the fixups. Let's give those fixes (or equivalent changes) a try and see if that improves the reliability. Looking forward to getting this merged!

@adrienthebo adrienthebo self-requested a review August 24, 2022 13:25
@adrienthebo adrienthebo dismissed their stale review August 24, 2022 13:34

Dismissing blocker as changes have landed so others can potentially land a review

@nandajavarma nandajavarma changed the base branch from nvn/integ-flag to main August 26, 2022 09:50
@nandajavarma nandajavarma changed the base branch from main to nvn/fix-tf-var-names August 26, 2022 09:51
@roboquat roboquat added size/L and removed size/XXL labels Aug 26, 2022
@nandajavarma nandajavarma force-pushed the nvn/sh-preview-flag branch 2 times, most recently from 4ffbce1 to 3d09922 Compare August 26, 2022 12:31
@nandajavarma nandajavarma force-pushed the nvn/fix-tf-var-names branch 2 times, most recently from fc9fc8c to c5c9c9a Compare September 1, 2022 07:45
Base automatically changed from nvn/fix-tf-var-names to main September 1, 2022 07:50
@roboquat roboquat added size/XXL and removed size/L labels Sep 1, 2022
@roboquat roboquat added size/L and removed size/XXL labels Sep 1, 2022
@nandajavarma
Contributor Author

@adrienthebo All the dependent PRs have been reviewed and merged. Can we try to merge this one?

Contributor

@mrsimonemms mrsimonemms left a comment

/hold

Looks good. I have one question - if the answer is "no", then feel free to remove the hold and get this merged

@@ -35,4 +35,14 @@ for i in $(gsutil ls gs://nightly-tests/tf-state); do
export TF_VAR_TEST_ID=$TF_VAR_TEST_ID

make cleanup cloud=$cloud

CUSTOMERID=$(replicated customer ls --app "${REPLICATED_APP}" | grep "$TF_VAR_TEST_ID" | awk '{print $1}')
Contributor

Is there any way of making this a function that can be invoked? This is a repetition of what's in .github/workflows/delete-kots-channel.yml

Contributor Author

Ideally we could, but I don't know if it is much of an optimization, considering that we would have to pass the API token, the license name and, in the very near future, the app name, since I have an issue open to use gitpod-dev as the published channel. Do you mind if I think this through properly when I deal with that issue?

Contributor

Absolutely - let's get it merged
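
For the follow-up discussed in this thread, the customer lookup could be factored into a function roughly like the one below. This is a sketch under assumptions: `extract_customer_id` is a hypothetical name, and it assumes, as the original pipeline does, that the customer ID is the first column of the `replicated customer ls` output.

```shell
#!/usr/bin/env bash
# extract_customer_id is a hypothetical helper. It reads a customer listing
# on stdin and prints the first column of every line that contains the test
# ID. It is similar to the grep | awk pipeline in the diff, but matches the
# ID as a literal string rather than as a regular expression.
extract_customer_id() {
  local test_id="$1"
  awk -v id="$test_id" 'index($0, id) > 0 { print $1 }'
}

# Assumed wiring, using the same CLI call as the original line:
#   CUSTOMERID=$(replicated customer ls --app "${REPLICATED_APP}" | extract_customer_id "$TF_VAR_TEST_ID")
```

Keeping the CLI call outside the function means the API token and app name stay with the caller, which sidesteps part of the parameter-passing concern raised above.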

@mrsimonemms
Contributor

/unhold

Labels
release-note-none size/L team: delivery Issue belongs to the self-hosted team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automate builds of self-hosted Gitpod for preview environments
5 participants