[server] Add SLO for image-build completed successfully #12960

Closed
2 tasks
geropl opened this issue Sep 14, 2022 · 1 comment
Assignees
Labels
component: server operations: observability This issue relates to the observability of Gitpod (metrics, logs, traces) type: improvement Improves an existing feature or existing code

Comments

@geropl
Member

geropl commented Sep 14, 2022

We currently lack insight into how often image builds fail to complete successfully. In particular, we suspect that rollouts of server might cause updates about image builds to be lost.
To prove that, and to be able to hunt this down, we should introduce a metric and an SLO.

  • Create metric
  • Create SLO
@geropl geropl added component: server type: improvement Improves an existing feature or existing code operations: observability This issue relates to the observability of Gitpod (metrics, logs, traces) labels Sep 14, 2022
@geropl geropl changed the title [server] Add SLO for image-build completeld successfully [server] Add SLO for image-build completed successfully Sep 14, 2022
@jldec jldec moved this to Scheduled in 🍎 WebApp Team Oct 2, 2022
@laushinka laushinka self-assigned this Oct 11, 2022
@laushinka laushinka moved this from Scheduled to In Progress in 🍎 WebApp Team Oct 11, 2022
@laushinka laushinka removed their assignment Oct 17, 2022
@laushinka laushinka moved this from In Progress to Scheduled in 🍎 WebApp Team Oct 17, 2022
@easyCZ
Member

easyCZ commented Oct 26, 2022

Just to clarify, the focus is on measuring from the perspective of the server listening for image build status, not from the perspective of image-builder building the image and completing OK?

If that's the case, then the problem we need to solve here can be expressed with the following metrics:

  • Counter of image builds completed, as observed by server
  • Counter of image builds completed, as observed by image manager

The metric we want to measure is the diff between these.
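To make the "diff" concrete, here is a minimal sketch, assuming the two counters are available as plain numbers sampled over the same time window. The names `CompletedCounts`, `lostUpdates`, `observedRatio`, and the 99% target are hypothetical, and "image manager" is taken to mean image-builder; none of this comes from the Gitpod codebase.

```typescript
// Hypothetical snapshot of the two proposed counters over one time window.
interface CompletedCounts {
  completedSeenByServer: number;       // completions server observed
  completedSeenByImageBuilder: number; // completions image-builder reported
}

// Completions that image-builder reported but server never observed.
function lostUpdates(c: CompletedCounts): number {
  return Math.max(0, c.completedSeenByImageBuilder - c.completedSeenByServer);
}

// Fraction of completions that server observed; 1 when nothing completed.
function observedRatio(c: CompletedCounts): number {
  if (c.completedSeenByImageBuilder === 0) {
    return 1;
  }
  return Math.min(1, c.completedSeenByServer / c.completedSeenByImageBuilder);
}

// Assumed SLO target, purely illustrative.
const SLO_TARGET = 0.99;

function meetsSlo(c: CompletedCounts, target: number = SLO_TARGET): boolean {
  return observedRatio(c) >= target;
}
```

Builds still in flight at the window boundaries can skew the diff slightly, so a real SLO would likely evaluate this ratio over a longer window.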

As a bonus, I'd recommend also adding:

  • Counter of image builds started, as observed by server
  • Counter of image builds started, as observed by image manager

These allow us to debug further whether the request never actually reached image builder, or whether we just lost the update.
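As a rough sketch of how the four counters could separate the two failure modes, assuming all values are sampled over the same window: `BuildCounts`, `Diagnosis`, and `diagnose` are hypothetical names for illustration, not existing Gitpod code.

```typescript
// Hypothetical snapshot of all four proposed counters over one time window.
interface BuildCounts {
  startedSeenByServer: number;
  startedSeenByImageBuilder: number;
  completedSeenByServer: number;
  completedSeenByImageBuilder: number;
}

interface Diagnosis {
  // Builds server started that image-builder never saw.
  requestsNotReachingImageBuilder: number;
  // Builds image-builder completed whose update never reached server.
  completionUpdatesLost: number;
}

function diagnose(c: BuildCounts): Diagnosis {
  return {
    requestsNotReachingImageBuilder: Math.max(
      0,
      c.startedSeenByServer - c.startedSeenByImageBuilder,
    ),
    completionUpdatesLost: Math.max(
      0,
      c.completedSeenByImageBuilder - c.completedSeenByServer,
    ),
  };
}
```

For example, with 10 builds started by server, 8 seen started by image-builder, 8 completed by image-builder, and 6 completions seen by server, this reports 2 requests that never arrived and 2 lost completion updates.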

Projects
Status: Done
Development

No branches or pull requests

4 participants