Add metrics for CEL for admission control KEP #112994

DangerOnTheRanger · 2022-10-11T22:56:05Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds metrics, as a part of KEP-3488. An additional PR will be needed to integrate the main KEP implementation with metrics; this PR only introduces the metrics themselves.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3488-cel-admission-control

jpbetz

Curious to see the buckets.

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics_test.go

leilajal · 2022-10-13T16:32:53Z

/triage accepted
thanks @DangerOnTheRanger!

jpbetz · 2022-10-20T19:44:52Z

/lgtm
/approve

logicalhan · 2022-10-20T20:10:15Z

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go

+		Buckets:        []float64{0.001, 0.01, 0.1, 1.0},
+		StabilityLevel: metrics.ALPHA,
+	},
+		[]string{"policy", "policy_binding", "validation_expression", "enforcement_action", "params", "state"},


what's the cardinality on "params"?

It should be equal to the number of parameter resources. My understanding is that this number should be no higher than the cardinality of policy_binding (@jpbetz - pinging you here in case I'm off on this one).

Correct, less than or equal to binding cardinality

Most of these labels seem like they would have concerningly high cardinality. "policy", "policy_binding", and "params" are all names of object instances? So the cardinality is limited only by the number of instances of those types that someone creates in a cluster? What is validation_expression: the string expression, the name of the validation rule that failed, the index of the rule that failed, or something else?

Right, those are all names (validation_expression is the name of the expression being checked). Do you think some of the labels should maybe be removed? I feel like the binding (and maybe the params?) label(s) could be removed without too much impact towards debugging.

My understanding was that metric labels tied to object names were undesirable since they have ~unbounded cardinality.

If we were going to have a name as a label, the binding name identifies the policy/params tuple (at least until singleton policies that are effective without a binding get implemented)

If the label cardinality is truly capped to 100s per cluster, then even if the total cardinality is unbounded, it should not pose a terrible problem. We have far worse issues with the resource label for the apiserver request metrics. It basically just means we're not going to be able to do meaningful aggregation across multiple clusters over these dimensions. If the cardinality for a single cluster is actually unbounded, then I would highly recommend against these labels.

I'd consider this equivalent to the resource label in API server request metrics... every new CRD spawns a new resource value, and every new policy or policy_binding instance (depending on which of these labels was selected) would spawn a new label value

It's probably fine then, moderately discouraged but if it would help debugging then I'd be okay with it.

logicalhan · 2022-10-24T16:32:58Z

/lgtm

(From sig instrumentation)

DangerOnTheRanger · 2022-10-26T20:52:12Z

/retest

logicalhan

/lgtm

cici37 · 2022-10-27T08:27:00Z

/assign @lavalamp

Would you mind taking a look when you have time? Thank you :)

liggitt · 2022-10-27T15:56:22Z

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go

+			Help:           "Validation admission policy check total, labeled by policy and param resource, and further identified by binding, validation expression, enforcement action taken, and state.",
+			StabilityLevel: metrics.ALPHA,
+		},
+		[]string{"policy", "policy_binding", "validation_expression", "enforcement_action", "params", "state"},


It looks like this is incremented on every admission, even ones that are allowed. When we get to policy evaluations that can fail but still permit the request (e.g. audit or warn or fail open), how will that be reflected in this metric?

Right, it's incremented on every admission - for audit and warn support, I think we should be able to extend the enforcement_action label with those values.

liggitt · 2022-10-27T15:58:58Z

sorry to keep asking questions, just trying to understand how these fit with the shape we expect validation admission evaluations to take

liggitt · 2022-10-28T16:21:30Z

/approve

k8s-ci-robot · 2022-10-28T16:21:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DangerOnTheRanger, jpbetz, liggitt, logicalhan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~staging/src/k8s.io/apiserver/OWNERS~~ [liggitt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

DangerOnTheRanger · 2022-10-28T18:40:12Z

/retest

DangerOnTheRanger · 2022-10-28T19:33:49Z

/retest

k8s-ci-robot requested review from apelisse and sttts October 11, 2022 22:57

jpbetz reviewed Oct 12, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go (outdated)

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics_test.go (outdated)

k8s-ci-robot added the triage/accepted label and removed the needs-triage label Oct 13, 2022

DangerOnTheRanger changed the title ~~[WIP] Add metrics for CEL for admission control KEP~~ Add metrics for CEL for admission control KEP Oct 15, 2022

k8s-ci-robot removed the do-not-merge/work-in-progress label Oct 15, 2022

jpbetz reviewed Oct 17, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go (outdated)

jpbetz reviewed Oct 19, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go (outdated)

jpbetz reviewed Oct 19, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go (outdated)

jpbetz reviewed Oct 19, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go

k8s-ci-robot assigned jpbetz Oct 20, 2022

k8s-ci-robot added the lgtm label Oct 20, 2022

logicalhan reviewed Oct 20, 2022

k8s-ci-robot removed the lgtm label Oct 20, 2022

DangerOnTheRanger force-pushed the validation-admission-metrics branch from 42c28db to 0a65adf Compare October 21, 2022 23:09

k8s-ci-robot assigned logicalhan Oct 24, 2022

k8s-ci-robot added the lgtm label Oct 24, 2022

liggitt reviewed Oct 24, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go (outdated)

cici37 mentioned this pull request Oct 25, 2022

[KEP-3488]Implement CEL for Admission Control #113314

Merged

k8s-ci-robot removed the lgtm label Oct 26, 2022

Add metrics for validation admission control.

ac324cb

DangerOnTheRanger force-pushed the validation-admission-metrics branch from 4237698 to ac324cb Compare October 26, 2022 20:55

logicalhan approved these changes Oct 26, 2022

k8s-ci-robot added the lgtm label Oct 26, 2022

k8s-ci-robot assigned lavalamp Oct 27, 2022

liggitt reviewed Oct 27, 2022

staging/src/k8s.io/apiserver/pkg/admission/cel/metrics.go

k8s-ci-robot added the approved label Oct 28, 2022

cici37 mentioned this pull request Oct 28, 2022

CEL in Admission Control Tracking #113440

Closed

9 tasks

k8s-ci-robot merged commit dd3dfab into kubernetes:master Oct 28, 2022

k8s-ci-robot added this to the v1.26 milestone Oct 28, 2022

DangerOnTheRanger mentioned this pull request Oct 31, 2022

Validating admission metrics integration #113475

Merged

jpbetz mentioned this pull request Nov 2, 2022

CEL for Admission Control kubernetes/enhancements#3488

Closed

16 tasks

jpbetz mentioned this pull request Jan 31, 2023

KEP-3488: CEL admission: Add graceful rollout, warning and audit support kubernetes/enhancements#3732

Merged

alexzielenski mentioned this pull request Jul 20, 2023

KEP-3488: Promote ValidatingAdmissionPolicy to Beta #118644

Merged

43 tasks
