Extend cron job to handle VM-based preview envs #9691

Merged: 1 commit, May 3, 2022

@mads-hartmann mads-hartmann (Contributor) commented May 2, 2022

Description

This PR extends our platform-delete-preview-environments-cron job to also handle VM-based preview environments.

It introduces a lightweight modelling of our preview environments as type PreviewEnvironment = CoreDevPreviewEnvironment | HarvesterPreviewEnvironment and moves the related code to each class. This should simplify removing CoreDevPreviewEnvironment in the future.
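
A minimal sketch of what such a modelling could look like. Only the union type and the two class names come from this PR description; the fields (`namespace`, `vmName`), the `kind` tag, and the `describe` method are hypothetical and chosen purely for illustration.

```typescript
// Hypothetical sketch: fields and methods are assumptions, not the PR's code.
class CoreDevPreviewEnvironment {
    readonly kind = "core-dev" as const;
    constructor(readonly namespace: string) {}

    // Code specific to core-dev preview environments lives on this class.
    describe(): string {
        return `core-dev namespace ${this.namespace}`;
    }
}

class HarvesterPreviewEnvironment {
    readonly kind = "harvester" as const;
    constructor(readonly vmName: string) {}

    // Code specific to VM-based preview environments lives on this class.
    describe(): string {
        return `Harvester VM ${this.vmName}`;
    }
}

type PreviewEnvironment = CoreDevPreviewEnvironment | HarvesterPreviewEnvironment;

// Callers work against the union and never branch on the concrete type.
function describeAll(envs: PreviewEnvironment[]): string[] {
    return envs.map((env) => env.describe());
}
```

With this shape, removing `CoreDevPreviewEnvironment` later would mean deleting one class and one member of the union, while callers such as `describeAll` stay unchanged.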

For now, staleness of Harvester-based preview environments is checked only against the git branch, not against the DB activity in the preview environment.
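
The description does not spell out how the branch-based check works. As a sketch under assumptions, one plausible rule is to treat an environment as stale when its branch no longer exists or has had no commits for some number of days. The `BranchInfo` shape, the function name, and the `maxAgeDays` parameter are all hypothetical.

```typescript
// Hypothetical sketch: the actual staleness rule in the PR may differ.
interface BranchInfo {
    name: string;
    lastCommit: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isStaleByBranch(
    envBranch: string,
    branches: BranchInfo[],
    now: Date,
    maxAgeDays: number,
): boolean {
    const branch = branches.find((b) => b.name === envBranch);
    if (branch === undefined) {
        // Assumption: a deleted branch means the environment is stale.
        return true;
    }
    const ageMs = now.getTime() - branch.lastCommit.getTime();
    return ageMs > maxAgeDays * MS_PER_DAY;
}
```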

Deleting preview environments sometimes fails. I changed the job so that a failed deletion fails only the slice for that specific preview environment, not the entire job. That means we're more likely to get through the job. Any deletion that fails might work on the 2nd or 3rd try (in my experience). In the future we can try to improve the reliability of preview environment deletion, but for now I think it's worth merging the PR as is. I created a quick Honeycomb board to help us visualise failure rates of preview environment deletions, and I can set up a quick experimental trigger if the failure rate is above e.g. 15% so we can be proactive about this.
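
The failure-isolation idea can be sketched as follows: each deletion is wrapped in its own try/catch so one failure is recorded but does not abort the loop, and the failure rate is computed from the collected results. This is a simplified illustration, not the job's code; `deleteOne` is a hypothetical synchronous callback, and the real job reports failures through Werft slices rather than a result array.

```typescript
// Hypothetical sketch: names and the result shape are assumptions.
interface DeletionResult {
    name: string;
    ok: boolean;
    error?: string;
}

function deleteAll(
    names: string[],
    deleteOne: (name: string) => void,
): DeletionResult[] {
    return names.map((name): DeletionResult => {
        try {
            deleteOne(name);
            return { name, ok: true };
        } catch (err) {
            // Record the failure for this environment and keep going.
            const message = err instanceof Error ? err.message : String(err);
            return { name, ok: false, error: message };
        }
    });
}

// Fraction of deletions that failed, e.g. to compare against a 15% threshold.
function failureRate(results: DeletionResult[]): number {
    if (results.length === 0) {
        return 0;
    }
    return results.filter((r) => !r.ok).length / results.length;
}
```

Because failed environments stay in place, the next cron run picks them up again, which matches the observation that a deletion may succeed on a later try.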

I also added a few span attributes so we know the counts of preview environments in core-dev and Harvester. I thought this would be a fun way to get historic counts and also set up a simple threshold-based trigger in Honeycomb (example query) so we can get a Slack message if we're running, say, 35 Harvester preview environments ☺️
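
As an illustration of the kind of attributes involved, the sketch below builds a plain key/value record that could be attached to a span. The attribute keys and function names are hypothetical; the PR description does not state the actual names, and the threshold check here only mirrors what a Honeycomb trigger would evaluate on its side.

```typescript
// Hypothetical sketch: attribute keys are invented for illustration.
type EnvKind = "core-dev" | "harvester";

function countAttributes(kinds: EnvKind[]): Record<string, number> {
    let coreDev = 0;
    let harvester = 0;
    for (const kind of kinds) {
        if (kind === "core-dev") {
            coreDev += 1;
        } else {
            harvester += 1;
        }
    }
    return {
        "preview_environments.core_dev.count": coreDev,
        "preview_environments.harvester.count": harvester,
    };
}

// Mirrors a simple threshold-based trigger, e.g. alert at 35 environments.
function reachesThreshold(count: number, threshold: number): boolean {
    return count >= threshold;
}
```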

Lastly, I ended up making the job a bit less noisy slice-wise as it was becoming very hard to make sense of the Werft logs.

Sorry for doing so much in one PR.

Related Issue(s)

Part of https://github.com/gitpod-io/ops/issues/1713

How to test

During development I ran it in DRY_RUN mode by modifying the variable and side-loading the file: werft run github -j .werft/platform-delete-preview-environments-cron.yaml -s .werft/platform-delete-preview-environments-cron.ts

Once I had verified that the preview environments were indeed stale, I ran it without DRY_RUN using: werft run github -j .werft/platform-delete-preview-environments-cron.yaml
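
For readers unfamiliar with the pattern, a dry-run guard can be sketched like this: when the flag is set the job only reports what it would delete, and the destructive call is skipped. Only the `DRY_RUN` name comes from this description; the function, its parameters, and the log wording are hypothetical.

```typescript
// Hypothetical sketch of a dry-run guard; not the job's actual code.
function processStaleEnvironments(
    staleEnvs: string[],
    dryRun: boolean,
    deleteOne: (name: string) => void,
): string[] {
    const log: string[] = [];
    for (const name of staleEnvs) {
        if (dryRun) {
            // Report only; nothing is deleted in dry-run mode.
            log.push(`[dry-run] would delete ${name}`);
            continue;
        }
        deleteOne(name);
        log.push(`deleted ${name}`);
    }
    return log;
}
```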

Release Notes

NONE

Documentation

N/A

@mads-hartmann mads-hartmann force-pushed the mads/add-vms-to-cron branch from c4655ca to e223af1 Compare May 3, 2022 14:16
@mads-hartmann mads-hartmann marked this pull request as ready for review May 3, 2022 14:17
@mads-hartmann mads-hartmann requested a review from a team May 3, 2022 14:17
@meysholdt meysholdt (Member) left a comment

Code changes LGTM

@roboquat roboquat merged commit b372985 into main May 3, 2022
@roboquat roboquat deleted the mads/add-vms-to-cron branch May 3, 2022 16:08