diff --git a/README.md b/README.md index 0bbdaca161..c0899d7335 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,7 @@ are deleted they are no longer visible on the `/metrics` endpoint. * [Resource group version compatibility](#resource-group-version-compatibility) * [Container Image](#container-image) * [Metrics Documentation](#metrics-documentation) + * [ECMAScript regular expression support for allow and deny lists](#ecmascript-regular-expression-support-for-allow-and-deny-lists) * [Conflict resolution in label names](#conflict-resolution-in-label-names) * [Kube-state-metrics self metrics](#kube-state-metrics-self-metrics) * [Resource recommendation](#resource-recommendation) @@ -129,6 +130,10 @@ e.g. by standardizing Kubernetes labels using an [Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) that ensures that there are no possible conflicts. +#### ECMAScript regular expression support for allow and deny lists + +Starting from [#2616](https://github.com/kubernetes/kube-state-metrics/pull/2616/files), kube-state-metrics supports ECMAScript's `regexp` for allow and deny lists. This was incorporated as a workaround for the limitations of the `regexp` package in Go, which does not support lookarounds due to their non-linear time complexity. Please note that while lookarounds are now supported for allow and deny lists, regular expressions' evaluation time is capped at a minute to prevent performance issues. + ### Kube-state-metrics self metrics kube-state-metrics exposes its own general process metrics under `--telemetry-host` and `--telemetry-port` (default 8081). diff --git a/README.md.tpl b/README.md.tpl index 4faa67be78..aba28f02aa 100644 --- a/README.md.tpl +++ b/README.md.tpl @@ -41,6 +41,7 @@ are deleted they are no longer visible on the `/metrics` endpoint. * [Resource group version compatibility](#resource-group-version-compatibility) * [Container Image](#container-image) * [Metrics Documentation](#metrics-documentation) + * [ECMAScript regular expression support for allow and deny lists](#ecmascript-regular-expression-support-for-allow-and-deny-lists) * [Conflict resolution in label names](#conflict-resolution-in-label-names) * [Kube-state-metrics self metrics](#kube-state-metrics-self-metrics) * [Resource recommendation](#resource-recommendation) @@ -130,6 +131,10 @@ e.g. by standardizing Kubernetes labels using an [Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) that ensures that there are no possible conflicts. +#### ECMAScript regular expression support for allow and deny lists + +Starting from [#2616](https://github.com/kubernetes/kube-state-metrics/pull/2616/files), kube-state-metrics supports ECMAScript's `regexp` for allow and deny lists. This was incorporated as a workaround for the limitations of the `regexp` package in Go, which does not support lookarounds due to their non-linear time complexity. Please note that while lookarounds are now supported for allow and deny lists, regular expressions' evaluation time is capped at a minute to prevent performance issues. + ### Kube-state-metrics self metrics kube-state-metrics exposes its own general process metrics under `--telemetry-host` and `--telemetry-port` (default 8081). diff --git a/docs/developer/cli-arguments.md b/docs/developer/cli-arguments.md index be402db950..ab092216c3 100644 --- a/docs/developer/cli-arguments.md +++ b/docs/developer/cli-arguments.md @@ -56,9 +56,9 @@ Flags: --log_file string If non-empty, use this log file (no effect when -logtostderr=true) --log_file_max_size uint Defines the maximum size a log file can grow to (no effect when -logtostderr=true). Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800) --logtostderr log to standard error instead of files (default true) - --metric-allowlist string Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive. + --metric-allowlist string Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive. --metric-annotations-allowlist string Comma-separated list of Kubernetes annotations keys that will be used in the resource' labels metric. By default the annotations metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes annotation keys you would like to allow for them (Example: '=namespaces=[kubernetes.io/team,...],pods=[kubernetes.io/team],...)'. A single '*' can be provided per resource instead to allow any annotations, but that has severe performance implications (Example: '=pods=[*]'). - --metric-denylist string Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive. + --metric-denylist string Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive. --metric-labels-allowlist string Comma-separated list of additional Kubernetes label keys that will be used in the resource' labels metric. By default the labels metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes label keys you would like to allow for them (Example: '=namespaces=[k8s-label-1,k8s-label-n,...],pods=[app],...)'. A single '*' can be provided per resource instead to allow any labels, but that has severe performance implications (Example: '=pods=[*]'). Additionally, an asterisk (*) can be provided as a key, which will resolve to all resources, i.e., assuming '--resources=deployments,pods', '=*=[*]' will resolve to '=deployments=[*],pods=[*]'. --metric-opt-in-list string Comma-separated list of metrics which are opt-in and not enabled by default. This is in addition to the metric allow- and denylists --namespaces string Comma-separated list of namespaces to be enabled. Defaults to "" diff --git a/go.mod b/go.mod index 71b477b41d..bcf66527ac 100644 --- a/go.mod +++ b/go.mod @@ -5,6 +5,7 @@ go 1.24.0 require ( github.com/KimMachineGun/automemlimit v0.7.1 github.com/dgryski/go-jump v0.0.0-20211018200510-ba001c3ffce0 + github.com/dlclark/regexp2 v1.11.5 github.com/fsnotify/fsnotify v1.8.0 github.com/go-logr/logr v1.4.2 github.com/gobuffalo/flect v1.0.3 diff --git a/go.sum b/go.sum index e42f9e758b..61bf54d708 100644 --- a/go.sum +++ b/go.sum @@ -188,6 +188,8 @@ github.com/elazarl/goproxy v0.0.0-20230808193330-2592e75ae04a h1:mATvB/9r/3gvcej github.com/elazarl/goproxy v0.0.0-20230808193330-2592e75ae04a/go.mod h1:Ro8st/ElPeALwNFlcTpWmkr6IoMFfkjXAvTHpevnDsM= github.com/elliotchance/orderedmap/v2 v2.2.0 h1:7/2iwO98kYT4XkOjA9mBEIwvi4KpGB4cyHeOFOnj4Vk= github.com/elliotchance/orderedmap/v2 v2.2.0/go.mod h1:85lZyVbpGaGvHvnKa7Qhx7zncAdBIBq6u56Hb1PRU5Q= +github.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ= +github.com/dlclark/regexp2 v1.11.5/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8= github.com/emicklei/go-restful/v3 v3.11.0 h1:rAQeMHw1c7zTmncogyy8VvRZwtkmkZ4FxERmMY4rD+g= github.com/emicklei/go-restful/v3 v3.11.0/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc= github.com/emicklei/proto v1.13.2 h1:z/etSFO3uyXeuEsVPzfl56WNgzcvIr42aQazXaQmFZY= diff --git a/pkg/allowdenylist/allowdenylist.go b/pkg/allowdenylist/allowdenylist.go index bf6a427d53..0dada728ed 100644 --- a/pkg/allowdenylist/allowdenylist.go +++ b/pkg/allowdenylist/allowdenylist.go @@ -18,13 +18,24 @@ package allowdenylist import ( "errors" - "regexp" "strings" + "sync" + "time" + + regexp "github.com/dlclark/regexp2" + "k8s.io/klog/v2" generator "k8s.io/kube-state-metrics/v2/pkg/metric_generator" ) -// AllowDenyList encapsulates the logic needed to filter based on a string. +// Use ECMAScript as the default regexp spec to support lookarounds (#2594). +var ( + once sync.Once + regexpDefaultSpec regexp.RegexOptions = regexp.ECMAScript + regexpDefaultTimeout = time.Minute +) + +// AllowDenyList namespaceencapsulates the logic needed to filter based on a string. type AllowDenyList struct { list map[string]struct{} rList []*regexp.Regexp @@ -34,6 +45,9 @@ type AllowDenyList struct { // New constructs a new AllowDenyList based on a allow- and a // denylist. Only one of them can be not empty. func New(allow, deny map[string]struct{}) (*AllowDenyList, error) { + once.Do(func() { + regexp.DefaultMatchTimeout = regexpDefaultTimeout + }) if len(allow) != 0 && len(deny) != 0 { return nil, errors.New( "allowlist and denylist are both set, they are mutually exclusive, only one of them can be set", @@ -62,7 +76,7 @@ func New(allow, deny map[string]struct{}) (*AllowDenyList, error) { func (l *AllowDenyList) Parse() error { regexes := make([]*regexp.Regexp, 0, len(l.list)) for item := range l.list { - r, err := regexp.Compile(item) + r, err := regexp.Compile(item, regexpDefaultSpec) if err != nil { return err } @@ -99,25 +113,36 @@ func (l *AllowDenyList) Exclude(items []string) { } // IsIncluded returns if the given item is included. -func (l *AllowDenyList) IsIncluded(item string) bool { - var matched bool +func (l *AllowDenyList) IsIncluded(item string) (bool, error) { + var ( + matched bool + err error + ) for _, r := range l.rList { - matched = r.MatchString(item) + matched, err = r.MatchString(item) + if err != nil { + return false, err + } if matched { break } } if l.isAllowList { - return matched + return matched, nil } - return !matched + return !matched, nil } // IsExcluded returns if the given item is excluded. -func (l *AllowDenyList) IsExcluded(item string) bool { - return !l.IsIncluded(item) +func (l *AllowDenyList) IsExcluded(item string) (bool, error) { + isIncluded, err := l.IsIncluded(item) + if err != nil { + return false, err + } + + return !isIncluded, nil } // Status returns the status of the AllowDenyList that can e.g. be passed into @@ -137,7 +162,13 @@ func (l *AllowDenyList) Status() string { // Test returns if the given family generator passes (is included in) the AllowDenyList func (l *AllowDenyList) Test(generator generator.FamilyGenerator) bool { - return l.IsIncluded(generator.Name) + isIncluded, err := l.IsIncluded(generator.Name) + if err != nil { + klog.ErrorS(err, "Error while processing allow-deny entries for generator", "generator", generator.Name) + return false + } + + return isIncluded } func copyList(l map[string]struct{}) map[string]struct{} { diff --git a/pkg/allowdenylist/allowdenylist_test.go b/pkg/allowdenylist/allowdenylist_test.go index 1199dc1521..43d9250dfb 100644 --- a/pkg/allowdenylist/allowdenylist_test.go +++ b/pkg/allowdenylist/allowdenylist_test.go @@ -17,8 +17,12 @@ limitations under the License. package allowdenylist import ( - "regexp" + "fmt" + "strings" "testing" + "time" + + regexp "github.com/dlclark/regexp2" ) func TestNew(t *testing.T) { @@ -76,7 +80,11 @@ func TestInclude(t *testing.T) { t.Fatal("expected Parse() to not fail") } - if !allowlist.IsIncluded("item1") { + isIncluded, err := allowlist.IsIncluded("item1") + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if !isIncluded { t.Fatal("expected included item to be included") } }) @@ -93,7 +101,11 @@ func TestInclude(t *testing.T) { t.Fatalf("expected Parse() to not fail, but got error : %v", err) } - if !denylist.IsIncluded(item1) { + isIncluded, err := denylist.IsIncluded(item1) + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if !isIncluded { t.Fatal("expected included item to be included") } }) @@ -103,13 +115,17 @@ func TestInclude(t *testing.T) { t.Fatal("expected New() to not fail") } - allowlist.Include([]string{"kube_.*_info"}) + allowlist.Include([]string{"kube_(?=secret).*_info"}) err = allowlist.Parse() if err != nil { t.Fatalf("expected Parse() to not fail, but got error : %v", err) } - if !allowlist.IsIncluded("kube_secret_info") { + isIncluded, err := allowlist.IsIncluded("kube_secret_info") + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if !isIncluded { t.Fatal("expected included item to be included") } }) @@ -124,22 +140,38 @@ func TestInclude(t *testing.T) { t.Fatal("expected New() to not fail") } - denylist.Exclude([]string{"kube_node_.*_cores", "kube_pod_.*_bytes"}) + denylist.Exclude([]string{"kube_(?=node.*cores|pod.*bytes)"}) err = denylist.Parse() if err != nil { t.Fatalf("expected Parse() to not fail, but got error : %v", err) } - if denylist.IsExcluded(item1) { + isExcluded, err := denylist.IsExcluded(item1) + if err != nil { + t.Fatal("expected IsExcluded() to not fail") + } + if isExcluded { t.Fatalf("expected included %s to be included", item1) } - if denylist.IsIncluded(item2) { + isIncluded, err := denylist.IsIncluded(item2) + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if isIncluded { t.Fatalf("expected included %s to be excluded", item2) } - if denylist.IsIncluded(item3) { + isIncluded, err = denylist.IsIncluded(item3) + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if isIncluded { t.Fatalf("expected included %s to be excluded", item3) } - if denylist.IsExcluded(item4) { + isExcluded, err = denylist.IsExcluded(item4) + if err != nil { + t.Fatal("expected IsExcluded() to not fail") + } + if isExcluded { t.Fatalf("expected included %s to be included", item4) } }) @@ -159,7 +191,11 @@ func TestExclude(t *testing.T) { t.Fatalf("expected Parse() to not fail, but got error : %v", err) } - if allowlist.IsIncluded(item1) { + isIncluded, err := allowlist.IsIncluded(item1) + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if isIncluded { t.Fatal("expected excluded item to be excluded") } }) @@ -176,7 +212,11 @@ func TestExclude(t *testing.T) { t.Fatalf("expected Parse() to not fail, but got error : %v", err) } - if denylist.IsIncluded(item1) { + isIncluded, err := denylist.IsIncluded(item1) + if err != nil { + t.Fatal("expected IsIncluded() to not fail") + } + if isIncluded { t.Fatal("expected excluded item to be excluded") } }) @@ -224,7 +264,8 @@ func TestStatus(t *testing.T) { allowlist, _ := New(map[string]struct{}{item1: {}, item2: {}}, map[string]struct{}{}) actualStatusString := allowlist.Status() expectedRegexPattern := `^Including the following lists that were on allowlist: (item1|item2), (item2|item1)$` - matched, _ := regexp.MatchString(expectedRegexPattern, actualStatusString) + re := regexp.MustCompile(expectedRegexPattern, regexpDefaultSpec) + matched, _ := re.MatchString(actualStatusString) if !matched { t.Errorf("expected status %q but got %q", expectedRegexPattern, actualStatusString) } @@ -244,9 +285,38 @@ func TestStatus(t *testing.T) { denylist, _ := New(map[string]struct{}{}, map[string]struct{}{item1: {}, item2: {}}) actualStatusString := denylist.Status() expectedRegexPattern := `^Excluding the following lists that were on denylist: (item1|item2), (item2|item1)$` - matched, _ := regexp.MatchString(expectedRegexPattern, actualStatusString) + re := regexp.MustCompile(expectedRegexPattern, regexpDefaultSpec) + matched, _ := re.MatchString(actualStatusString) if !matched { t.Errorf("expected status %q but got %q", expectedRegexPattern, actualStatusString) } }) } + +func TestCatastrophicBacktrackTimeout(t *testing.T) { + r, err := regexp.Compile("(.+)*\\?", 0) + if err != nil { + t.Fatal(err) + } + var exp = "Lorem ipsum dolor sit amet, consectetur adipiscing elit" + exp = strings.Repeat(exp, 2^10) + + timeout := regexpDefaultTimeout + t.Logf("regexp.DefaultMatchTimeout set to: %v", timeout) + buffer := 500 * time.Millisecond + t.Run(fmt.Sprint(timeout), func(t *testing.T) { + r.MatchTimeout = timeout + start := time.Now() + _, err = r.FindStringMatch(exp) + if err != nil && !strings.HasPrefix(err.Error(), "match timeout") { + t.Fatal(err) + } + if err == nil { + t.Fatal("expected catastrophic backtracking error") + } + elapsed := time.Since(start) + if elapsed > timeout+buffer { + t.Fatalf("timeout %v exceeded: %v", timeout, elapsed) + } + }) +} diff --git a/pkg/options/options.go b/pkg/options/options.go index 2b601bb439..95af22b55f 100644 --- a/pkg/options/options.go +++ b/pkg/options/options.go @@ -162,8 +162,8 @@ func (o *Options) AddFlags(cmd *cobra.Command) { o.cmd.Flags().StringVar((*string)(&o.Node), "node", "", "Name of the node that contains the kube-state-metrics pod. Most likely it should be passed via the downward API. This is used for daemonset sharding. Only available for resources (pod metrics) that support spec.nodeName fieldSelector. This is experimental.") o.cmd.Flags().Var(&o.AnnotationsAllowList, "metric-annotations-allowlist", "Comma-separated list of Kubernetes annotations keys that will be used in the resource' labels metric. By default the annotations metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes annotation keys you would like to allow for them (Example: '=namespaces=[kubernetes.io/team,...],pods=[kubernetes.io/team],...)'. A single '*' can be provided per resource instead to allow any annotations, but that has severe performance implications (Example: '=pods=[*]').") o.cmd.Flags().Var(&o.LabelsAllowList, "metric-labels-allowlist", "Comma-separated list of additional Kubernetes label keys that will be used in the resource' labels metric. By default the labels metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes label keys you would like to allow for them (Example: '=namespaces=[k8s-label-1,k8s-label-n,...],pods=[app],...)'. A single '*' can be provided per resource instead to allow any labels, but that has severe performance implications (Example: '=pods=[*]'). Additionally, an asterisk (*) can be provided as a key, which will resolve to all resources, i.e., assuming '--resources=deployments,pods', '=*=[*]' will resolve to '=deployments=[*],pods=[*]'.") - o.cmd.Flags().Var(&o.MetricAllowlist, "metric-allowlist", "Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.") - o.cmd.Flags().Var(&o.MetricDenylist, "metric-denylist", "Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.") + o.cmd.Flags().Var(&o.MetricAllowlist, "metric-allowlist", "Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.") + o.cmd.Flags().Var(&o.MetricDenylist, "metric-denylist", "Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.") o.cmd.Flags().Var(&o.MetricOptInList, "metric-opt-in-list", "Comma-separated list of metrics which are opt-in and not enabled by default. This is in addition to the metric allow- and denylists") o.cmd.Flags().Var(&o.Namespaces, "namespaces", fmt.Sprintf("Comma-separated list of namespaces to be enabled. Defaults to %q", &DefaultNamespaces)) o.cmd.Flags().Var(&o.NamespacesDenylist, "namespaces-denylist", "Comma-separated list of namespaces not to be enabled. If namespaces and namespaces-denylist are both set, only namespaces that are excluded in namespaces-denylist will be used.")