[observability] Add alerts for pending phase #9675

Merged
1 commit merged on May 2, 2022

princerachit
Contributor

@princerachit princerachit commented May 2, 2022

Description

Add alerts that trigger when prebuild or regular workspaces are stuck in the pending phase. This is a follow-up item from incident https://app.incident.io/incidents/143

This alert would have triggered ~15 times in the last 30 days.

PREBUILD

image

REGULAR

image

Prefer merging https://github.com/gitpod-io/runbooks/pull/340 before this PR.

Related Issue(s)

Fixes #9674

How to test

NA

Release Notes

None

Documentation

@princerachit princerachit requested a review from a team May 2, 2022 07:24
@github-actions github-actions bot added the team: workspace (Issue belongs to the Workspace team) label May 2, 2022
  labels: {
    severity: 'critical',
  },
  'for': '15m',
Contributor Author

This is on par with the TooManyRegularNotActive alert.

    description: 'regular workspaces are stuck in pending phase',
  },
  expr: |||
    gitpod_ws_manager_workspace_phase_total{phase="PENDING", type="REGULAR"} > 20
Contributor Author

I chose 20 based on the trend I saw over the last 30 days.
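
For context, the two diff fragments quoted in this review can be read as parts of a single Prometheus alerting rule. The following is a minimal Jsonnet sketch of how such a rule might be assembled, not the merged code. The alert name `GitpodWorkspaceRegularPendingTooLong` and the `annotations` wrapper around `description` are assumptions; the labels, the `for` duration, the description text, and the expression are copied from the fragments.

```jsonnet
// Sketch only: the alert name and the annotations wrapper are assumed;
// the remaining fields are copied from the review fragments.
{
  alert: 'GitpodWorkspaceRegularPendingTooLong',  // hypothetical name
  labels: {
    severity: 'critical',
  },
  // Fire only after the condition has held continuously for 15 minutes.
  'for': '15m',
  annotations: {
    description: 'regular workspaces are stuck in pending phase',
  },
  // True while more than 20 regular workspaces are reported in the PENDING phase.
  expr: |||
    gitpod_ws_manager_workspace_phase_total{phase="PENDING", type="REGULAR"} > 20
  |||,
}
```

With `'for': '15m'`, a short spike above the threshold leaves the alert in the pending state without firing; it fires only if the expression stays true for the full 15 minutes.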

Member

@Furisto Furisto left a comment

/hold

Waiting for the related issue to be merged.

@Furisto
Member

Furisto commented May 2, 2022

/unhold

@roboquat roboquat merged commit 5045e85 into main May 2, 2022
@roboquat roboquat deleted the prs/pending-alerts branch May 2, 2022 11:09
@roboquat roboquat added the deployed: workspace (Workspace team change is running in production) and deployed (Change is completely running in production) labels May 4, 2022
Labels
deployed: workspace (Workspace team change is running in production)
deployed (Change is completely running in production)
release-note-none
size/M
team: workspace (Issue belongs to the Workspace team)
Development

Successfully merging this pull request may close these issues.

[observability] Create an alert when prebuilds are stuck in pending state
3 participants