feat: set defaults for ignoredUnrecoverableEvents operator config #1310

Closed
wants to merge 1 commit into from
@@ -118,10 +118,11 @@ type WorkspaceConfig struct {
// "15m", "20s", "1h30m", etc. If not specified, the default value of "5m" is used.
ProgressTimeout string `json:"progressTimeout,omitempty"`
// IgnoredUnrecoverableEvents defines a list of Kubernetes event names that should
// be ignored when deciding to fail a DevWorkspace startup. This option should be used
// if a transient cluster issue is triggering false-positives (for example, if
// the cluster occasionally encounters FailedScheduling events). Events listed
// here will not trigger DevWorkspace failures.
// be ignored when deciding to fail a DevWorkspace startup.
Collaborator

I'm not entirely sure we need to mention the cluster auto-scaler in DWO (or rewrite the docs here). It might be better to mention this in the Che Cluster CRD documentation, since the ignoredUnrecoverableEvents can be configured from the Che Cluster CRD.

Instead, I would suggest:

  • Mentioning "By default, the FailedScheduling event is ignored"
  • Removing the "(for example, if the cluster occasionally encounters FailedScheduling events)" since this example is no longer valid now that the FailedScheduling event is ignored by default

// For example, a FailedScheduling event, that occurs when workspace cannot start
// due to exceeding available resources, should not fail the workspace startup, if there is
// an autoscaler configured on the cluster, and we want to wait until it provisions additional resources.
// FailedScheduling event can also occur as a false-positive, as a result of a transient cluster issue.
Collaborator

I suggest experimenting with kubebuilder annotations for the IgnoredUnrecoverableEvents field.

We should try setting the default array value. I think this would be done with +kubebuilder:default:={"FailedScheduling"}

I believe that should be enough to populate the IgnoredUnrecoverableEvents list in the DWOC. Make sure you re-generate the CRDs in a separate commit by running: make update_devworkspace_api update_devworkspace_crds generate_all

Something to note: This entire PR might be dropped and re-implemented in Che-Operator if we can get the kubebuilder approach working. We'd want Che admins to see that the FailedScheduling event is ignored by default, and there would be no advantage to duplicating this code change in both DWO and Che-Operator (unless users who use DWO in isolation want this feature; however, that is not the current reason why we're resolving #1280).
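For illustration, the kubebuilder suggestion above might look roughly like the sketch below. The struct is a trimmed-down stand-in that shows only the field under discussion, the doc comment wording is an assumption based on the earlier review suggestion, and whether the marker populates the list as hoped is exactly what the comment proposes to test.

```go
package main

import "fmt"

// WorkspaceConfig here is a reduced, illustrative copy of the real type;
// every other field is omitted.
type WorkspaceConfig struct {
	// IgnoredUnrecoverableEvents defines a list of Kubernetes event names that should
	// be ignored when deciding to fail a DevWorkspace startup.
	// By default, the FailedScheduling event is ignored.
	// +kubebuilder:default:={"FailedScheduling"}
	IgnoredUnrecoverableEvents []string `json:"ignoredUnrecoverableEvents,omitempty"`
}

func main() {
	// To the Go compiler the marker is only a comment, so a zero-value
	// struct still carries a nil list. The default would live in the
	// generated CRD schema, not in the Go type.
	cfg := WorkspaceConfig{}
	fmt.Println(cfg.IgnoredUnrecoverableEvents == nil)
}
```

Since the marker only affects the generated CRD schema, the default would be filled in by the API server when the object is stored, which is why the CRDs need to be regenerated after adding it. One thing worth checking while experimenting: schema defaults on a nested field are generally applied only when the parent object is present in the resource.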

IgnoredUnrecoverableEvents []string `json:"ignoredUnrecoverableEvents,omitempty"`
// CleanupOnStop governs how the Operator handles stopped DevWorkspaces. If set to
// true, additional resources associated with a DevWorkspace (e.g. services, deployments,
1 change: 1 addition & 0 deletions pkg/config/defaults.go
@@ -76,6 +76,7 @@ var defaultConfig = &v1alpha1.OperatorConfiguration{
corev1.ResourceMemory: resource.MustParse("64Mi"),
},
},
IgnoredUnrecoverableEvents: []string {"FailedScheduling"},
},
}
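A minimal sketch of how a default like the one in this hunk could take effect, assuming a nil list means "not configured" and an explicitly empty list means "ignore nothing". The helpers resolveIgnoredEvents and isIgnored are hypothetical names for illustration and are not taken from the operator's code.

```go
package main

import "fmt"

// defaultIgnoredEvents mirrors the value added to defaultConfig in this diff.
var defaultIgnoredEvents = []string{"FailedScheduling"}

// resolveIgnoredEvents (hypothetical) falls back to the default only when
// the list was not configured at all. An explicitly empty list is kept,
// so an admin can still opt out of the default.
func resolveIgnoredEvents(configured []string) []string {
	if configured == nil {
		// Return a copy so callers cannot mutate the shared default.
		return append([]string(nil), defaultIgnoredEvents...)
	}
	return configured
}

// isIgnored (hypothetical) reports whether an event reason is in the list.
func isIgnored(reason string, ignored []string) bool {
	for _, name := range ignored {
		if name == reason {
			return true
		}
	}
	return false
}

func main() {
	unset := resolveIgnoredEvents(nil)
	fmt.Println(isIgnored("FailedScheduling", unset)) // prints "true"

	optedOut := resolveIgnoredEvents([]string{})
	fmt.Println(isIgnored("FailedScheduling", optedOut)) // prints "false"
}
```

The nil-versus-empty distinction is a design choice in this sketch: it keeps the new default from silently overriding a configuration that deliberately lists no events.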
