Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use dlclark/regexp2 over standard library's package #2616

Merged
merged 1 commit into from
Mar 19, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ are deleted they are no longer visible on the `/metrics` endpoint.
* [Resource group version compatibility](#resource-group-version-compatibility)
* [Container Image](#container-image)
* [Metrics Documentation](#metrics-documentation)
* [ECMAScript regular expression support for allow and deny lists](#ecmascript-regular-expression-support-for-allow-and-deny-lists)
* [Conflict resolution in label names](#conflict-resolution-in-label-names)
* [Kube-state-metrics self metrics](#kube-state-metrics-self-metrics)
* [Resource recommendation](#resource-recommendation)
Expand Down Expand Up @@ -129,6 +130,10 @@ e.g. by standardizing Kubernetes labels using an
[Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/)
that ensures that there are no possible conflicts.

#### ECMAScript regular expression support for allow and deny lists

Starting from [#2616](https://github.com/kubernetes/kube-state-metrics/pull/2616/files), kube-state-metrics supports ECMAScript's `regexp` for allow and deny lists. This was incorporated as a workaround for the limitations of the `regexp` package in Go, which does not support lookarounds due to their non-linear time complexity. Please note that while lookarounds are now supported for allow and deny lists, regular expressions' evaluation time is capped at a minute to prevent performance issues.

### Kube-state-metrics self metrics

kube-state-metrics exposes its own general process metrics under `--telemetry-host` and `--telemetry-port` (default 8081).
Expand Down
5 changes: 5 additions & 0 deletions README.md.tpl
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ are deleted they are no longer visible on the `/metrics` endpoint.
* [Resource group version compatibility](#resource-group-version-compatibility)
* [Container Image](#container-image)
* [Metrics Documentation](#metrics-documentation)
* [ECMAScript regular expression support for allow and deny lists](#ecmascript-regular-expression-support-for-allow-and-deny-lists)
* [Conflict resolution in label names](#conflict-resolution-in-label-names)
* [Kube-state-metrics self metrics](#kube-state-metrics-self-metrics)
* [Resource recommendation](#resource-recommendation)
Expand Down Expand Up @@ -130,6 +131,10 @@ e.g. by standardizing Kubernetes labels using an
[Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/)
that ensures that there are no possible conflicts.

#### ECMAScript regular expression support for allow and deny lists

Starting from [#2616](https://github.com/kubernetes/kube-state-metrics/pull/2616/files), kube-state-metrics supports ECMAScript's `regexp` for allow and deny lists. This was incorporated as a workaround for the limitations of the `regexp` package in Go, which does not support lookarounds due to their non-linear time complexity. Please note that while lookarounds are now supported for allow and deny lists, regular expressions' evaluation time is capped at a minute to prevent performance issues.

### Kube-state-metrics self metrics

kube-state-metrics exposes its own general process metrics under `--telemetry-host` and `--telemetry-port` (default 8081).
Expand Down
4 changes: 2 additions & 2 deletions docs/developer/cli-arguments.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,9 +56,9 @@ Flags:
--log_file string If non-empty, use this log file (no effect when -logtostderr=true)
--log_file_max_size uint Defines the maximum size a log file can grow to (no effect when -logtostderr=true). Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--logtostderr log to standard error instead of files (default true)
--metric-allowlist string Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.
--metric-allowlist string Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.
--metric-annotations-allowlist string Comma-separated list of Kubernetes annotations keys that will be used in the resource' labels metric. By default the annotations metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes annotation keys you would like to allow for them (Example: '=namespaces=[kubernetes.io/team,...],pods=[kubernetes.io/team],...)'. A single '*' can be provided per resource instead to allow any annotations, but that has severe performance implications (Example: '=pods=[*]').
--metric-denylist string Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.
--metric-denylist string Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.
--metric-labels-allowlist string Comma-separated list of additional Kubernetes label keys that will be used in the resource' labels metric. By default the labels metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes label keys you would like to allow for them (Example: '=namespaces=[k8s-label-1,k8s-label-n,...],pods=[app],...)'. A single '*' can be provided per resource instead to allow any labels, but that has severe performance implications (Example: '=pods=[*]'). Additionally, an asterisk (*) can be provided as a key, which will resolve to all resources, i.e., assuming '--resources=deployments,pods', '=*=[*]' will resolve to '=deployments=[*],pods=[*]'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to extend this to the labels and annotations allowlists in a later PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep this exclusive to the allowdeny lists for now, and expand once we start to see positive results in the near future.

--metric-opt-in-list string Comma-separated list of metrics which are opt-in and not enabled by default. This is in addition to the metric allow- and denylists
--namespaces string Comma-separated list of namespaces to be enabled. Defaults to ""
Expand Down
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ go 1.24.0
require (
github.com/KimMachineGun/automemlimit v0.7.1
github.com/dgryski/go-jump v0.0.0-20211018200510-ba001c3ffce0
github.com/dlclark/regexp2 v1.11.5
github.com/fsnotify/fsnotify v1.8.0
github.com/go-logr/logr v1.4.2
github.com/gobuffalo/flect v1.0.3
Expand Down
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -188,6 +188,8 @@ github.com/elazarl/goproxy v0.0.0-20230808193330-2592e75ae04a h1:mATvB/9r/3gvcej
github.com/elazarl/goproxy v0.0.0-20230808193330-2592e75ae04a/go.mod h1:Ro8st/ElPeALwNFlcTpWmkr6IoMFfkjXAvTHpevnDsM=
github.com/elliotchance/orderedmap/v2 v2.2.0 h1:7/2iwO98kYT4XkOjA9mBEIwvi4KpGB4cyHeOFOnj4Vk=
github.com/elliotchance/orderedmap/v2 v2.2.0/go.mod h1:85lZyVbpGaGvHvnKa7Qhx7zncAdBIBq6u56Hb1PRU5Q=
github.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ=
github.com/dlclark/regexp2 v1.11.5/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
github.com/emicklei/go-restful/v3 v3.11.0 h1:rAQeMHw1c7zTmncogyy8VvRZwtkmkZ4FxERmMY4rD+g=
github.com/emicklei/go-restful/v3 v3.11.0/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc=
github.com/emicklei/proto v1.13.2 h1:z/etSFO3uyXeuEsVPzfl56WNgzcvIr42aQazXaQmFZY=
Expand Down
53 changes: 42 additions & 11 deletions pkg/allowdenylist/allowdenylist.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,13 +18,24 @@ package allowdenylist

import (
"errors"
"regexp"
"strings"
"sync"
"time"

regexp "github.com/dlclark/regexp2"
"k8s.io/klog/v2"

generator "k8s.io/kube-state-metrics/v2/pkg/metric_generator"
)

// AllowDenyList encapsulates the logic needed to filter based on a string.
// Use ECMAScript as the default regexp spec to support lookarounds (#2594).
var (
once sync.Once
regexpDefaultSpec regexp.RegexOptions = regexp.ECMAScript
regexpDefaultTimeout = time.Minute
)

// AllowDenyList namespaceencapsulates the logic needed to filter based on a string.
type AllowDenyList struct {
list map[string]struct{}
rList []*regexp.Regexp
Expand All @@ -34,6 +45,9 @@ type AllowDenyList struct {
// New constructs a new AllowDenyList based on a allow- and a
// denylist. Only one of them can be not empty.
func New(allow, deny map[string]struct{}) (*AllowDenyList, error) {
once.Do(func() {
regexp.DefaultMatchTimeout = regexpDefaultTimeout
})
if len(allow) != 0 && len(deny) != 0 {
return nil, errors.New(
"allowlist and denylist are both set, they are mutually exclusive, only one of them can be set",
Expand Down Expand Up @@ -62,7 +76,7 @@ func New(allow, deny map[string]struct{}) (*AllowDenyList, error) {
func (l *AllowDenyList) Parse() error {
regexes := make([]*regexp.Regexp, 0, len(l.list))
for item := range l.list {
r, err := regexp.Compile(item)
r, err := regexp.Compile(item, regexpDefaultSpec)
if err != nil {
return err
}
Expand Down Expand Up @@ -99,25 +113,36 @@ func (l *AllowDenyList) Exclude(items []string) {
}

// IsIncluded returns if the given item is included.
func (l *AllowDenyList) IsIncluded(item string) bool {
var matched bool
func (l *AllowDenyList) IsIncluded(item string) (bool, error) {
var (
matched bool
err error
)
for _, r := range l.rList {
matched = r.MatchString(item)
matched, err = r.MatchString(item)
if err != nil {
return false, err
}
if matched {
break
}
}

if l.isAllowList {
return matched
return matched, nil
}

return !matched
return !matched, nil
}

// IsExcluded returns if the given item is excluded.
func (l *AllowDenyList) IsExcluded(item string) bool {
return !l.IsIncluded(item)
func (l *AllowDenyList) IsExcluded(item string) (bool, error) {
isIncluded, err := l.IsIncluded(item)
if err != nil {
return false, err
}

return !isIncluded, nil
}

// Status returns the status of the AllowDenyList that can e.g. be passed into
Expand All @@ -137,7 +162,13 @@ func (l *AllowDenyList) Status() string {

// Test returns if the given family generator passes (is included in) the AllowDenyList
func (l *AllowDenyList) Test(generator generator.FamilyGenerator) bool {
return l.IsIncluded(generator.Name)
isIncluded, err := l.IsIncluded(generator.Name)
if err != nil {
klog.ErrorS(err, "Error while processing allow-deny entries for generator", "generator", generator.Name)
return false
}

return isIncluded
}

func copyList(l map[string]struct{}) map[string]struct{} {
Expand Down
98 changes: 84 additions & 14 deletions pkg/allowdenylist/allowdenylist_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,12 @@ limitations under the License.
package allowdenylist

import (
"regexp"
"fmt"
"strings"
"testing"
"time"

regexp "github.com/dlclark/regexp2"
)

func TestNew(t *testing.T) {
Expand Down Expand Up @@ -76,7 +80,11 @@ func TestInclude(t *testing.T) {
t.Fatal("expected Parse() to not fail")
}

if !allowlist.IsIncluded("item1") {
isIncluded, err := allowlist.IsIncluded("item1")
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if !isIncluded {
t.Fatal("expected included item to be included")
}
})
Expand All @@ -93,7 +101,11 @@ func TestInclude(t *testing.T) {
t.Fatalf("expected Parse() to not fail, but got error : %v", err)
}

if !denylist.IsIncluded(item1) {
isIncluded, err := denylist.IsIncluded(item1)
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if !isIncluded {
t.Fatal("expected included item to be included")
}
})
Expand All @@ -103,13 +115,17 @@ func TestInclude(t *testing.T) {
t.Fatal("expected New() to not fail")
}

allowlist.Include([]string{"kube_.*_info"})
allowlist.Include([]string{"kube_(?=secret).*_info"})
err = allowlist.Parse()
if err != nil {
t.Fatalf("expected Parse() to not fail, but got error : %v", err)
}

if !allowlist.IsIncluded("kube_secret_info") {
isIncluded, err := allowlist.IsIncluded("kube_secret_info")
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if !isIncluded {
t.Fatal("expected included item to be included")
}
})
Expand All @@ -124,22 +140,38 @@ func TestInclude(t *testing.T) {
t.Fatal("expected New() to not fail")
}

denylist.Exclude([]string{"kube_node_.*_cores", "kube_pod_.*_bytes"})
denylist.Exclude([]string{"kube_(?=node.*cores|pod.*bytes)"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it compatible with "denylist.Exclude([]string{"kube_node_.cores", "kube_pod._bytes"})"?

Lookaround feature is more complicated. I will prefer using the legacy one unless performance is much better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. The library being introduced here is a superset of go's regexp one, and will only incur expensive perf operations for exponential matches, that go's library doesn't support since it has to guarantee linear-time operations.

Copy link
Member Author

@rexagod rexagod Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also these operations are capped at a minute, so no leaks.

err = denylist.Parse()
if err != nil {
t.Fatalf("expected Parse() to not fail, but got error : %v", err)
}

if denylist.IsExcluded(item1) {
isExcluded, err := denylist.IsExcluded(item1)
if err != nil {
t.Fatal("expected IsExcluded() to not fail")
}
if isExcluded {
t.Fatalf("expected included %s to be included", item1)
}
if denylist.IsIncluded(item2) {
isIncluded, err := denylist.IsIncluded(item2)
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if isIncluded {
t.Fatalf("expected included %s to be excluded", item2)
}
if denylist.IsIncluded(item3) {
isIncluded, err = denylist.IsIncluded(item3)
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if isIncluded {
t.Fatalf("expected included %s to be excluded", item3)
}
if denylist.IsExcluded(item4) {
isExcluded, err = denylist.IsExcluded(item4)
if err != nil {
t.Fatal("expected IsExcluded() to not fail")
}
if isExcluded {
t.Fatalf("expected included %s to be included", item4)
}
})
Expand All @@ -159,7 +191,11 @@ func TestExclude(t *testing.T) {
t.Fatalf("expected Parse() to not fail, but got error : %v", err)
}

if allowlist.IsIncluded(item1) {
isIncluded, err := allowlist.IsIncluded(item1)
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if isIncluded {
t.Fatal("expected excluded item to be excluded")
}
})
Expand All @@ -176,7 +212,11 @@ func TestExclude(t *testing.T) {
t.Fatalf("expected Parse() to not fail, but got error : %v", err)
}

if denylist.IsIncluded(item1) {
isIncluded, err := denylist.IsIncluded(item1)
if err != nil {
t.Fatal("expected IsIncluded() to not fail")
}
if isIncluded {
t.Fatal("expected excluded item to be excluded")
}
})
Expand Down Expand Up @@ -224,7 +264,8 @@ func TestStatus(t *testing.T) {
allowlist, _ := New(map[string]struct{}{item1: {}, item2: {}}, map[string]struct{}{})
actualStatusString := allowlist.Status()
expectedRegexPattern := `^Including the following lists that were on allowlist: (item1|item2), (item2|item1)$`
matched, _ := regexp.MatchString(expectedRegexPattern, actualStatusString)
re := regexp.MustCompile(expectedRegexPattern, regexpDefaultSpec)
matched, _ := re.MatchString(actualStatusString)
if !matched {
t.Errorf("expected status %q but got %q", expectedRegexPattern, actualStatusString)
}
Expand All @@ -244,9 +285,38 @@ func TestStatus(t *testing.T) {
denylist, _ := New(map[string]struct{}{}, map[string]struct{}{item1: {}, item2: {}})
actualStatusString := denylist.Status()
expectedRegexPattern := `^Excluding the following lists that were on denylist: (item1|item2), (item2|item1)$`
matched, _ := regexp.MatchString(expectedRegexPattern, actualStatusString)
re := regexp.MustCompile(expectedRegexPattern, regexpDefaultSpec)
matched, _ := re.MatchString(actualStatusString)
if !matched {
t.Errorf("expected status %q but got %q", expectedRegexPattern, actualStatusString)
}
})
}

func TestCatastrophicBacktrackTimeout(t *testing.T) {
r, err := regexp.Compile("(.+)*\\?", 0)
if err != nil {
t.Fatal(err)
}
var exp = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"
exp = strings.Repeat(exp, 2^10)

timeout := regexpDefaultTimeout
t.Logf("regexp.DefaultMatchTimeout set to: %v", timeout)
buffer := 500 * time.Millisecond
t.Run(fmt.Sprint(timeout), func(t *testing.T) {
r.MatchTimeout = timeout
start := time.Now()
_, err = r.FindStringMatch(exp)
if err != nil && !strings.HasPrefix(err.Error(), "match timeout") {
t.Fatal(err)
}
if err == nil {
t.Fatal("expected catastrophic backtracking error")
}
elapsed := time.Since(start)
if elapsed > timeout+buffer {
t.Fatalf("timeout %v exceeded: %v", timeout, elapsed)
}
})
}
4 changes: 2 additions & 2 deletions pkg/options/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -162,8 +162,8 @@ func (o *Options) AddFlags(cmd *cobra.Command) {
o.cmd.Flags().StringVar((*string)(&o.Node), "node", "", "Name of the node that contains the kube-state-metrics pod. Most likely it should be passed via the downward API. This is used for daemonset sharding. Only available for resources (pod metrics) that support spec.nodeName fieldSelector. This is experimental.")
o.cmd.Flags().Var(&o.AnnotationsAllowList, "metric-annotations-allowlist", "Comma-separated list of Kubernetes annotations keys that will be used in the resource' labels metric. By default the annotations metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes annotation keys you would like to allow for them (Example: '=namespaces=[kubernetes.io/team,...],pods=[kubernetes.io/team],...)'. A single '*' can be provided per resource instead to allow any annotations, but that has severe performance implications (Example: '=pods=[*]').")
o.cmd.Flags().Var(&o.LabelsAllowList, "metric-labels-allowlist", "Comma-separated list of additional Kubernetes label keys that will be used in the resource' labels metric. By default the labels metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes label keys you would like to allow for them (Example: '=namespaces=[k8s-label-1,k8s-label-n,...],pods=[app],...)'. A single '*' can be provided per resource instead to allow any labels, but that has severe performance implications (Example: '=pods=[*]'). Additionally, an asterisk (*) can be provided as a key, which will resolve to all resources, i.e., assuming '--resources=deployments,pods', '=*=[*]' will resolve to '=deployments=[*],pods=[*]'.")
o.cmd.Flags().Var(&o.MetricAllowlist, "metric-allowlist", "Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.")
o.cmd.Flags().Var(&o.MetricDenylist, "metric-denylist", "Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.")
o.cmd.Flags().Var(&o.MetricAllowlist, "metric-allowlist", "Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.")
o.cmd.Flags().Var(&o.MetricDenylist, "metric-denylist", "Comma-separated list of metrics not to be enabled. This list comprises of exact metric names and/or *ECMAScript-based* regex patterns. The allowlist and denylist are mutually exclusive.")
o.cmd.Flags().Var(&o.MetricOptInList, "metric-opt-in-list", "Comma-separated list of metrics which are opt-in and not enabled by default. This is in addition to the metric allow- and denylists")
o.cmd.Flags().Var(&o.Namespaces, "namespaces", fmt.Sprintf("Comma-separated list of namespaces to be enabled. Defaults to %q", &DefaultNamespaces))
o.cmd.Flags().Var(&o.NamespacesDenylist, "namespaces-denylist", "Comma-separated list of namespaces not to be enabled. If namespaces and namespaces-denylist are both set, only namespaces that are excluded in namespaces-denylist will be used.")
Expand Down