Add predicate to MetricReader and MetricProducer #3566

Merged
merged 32 commits into from
Dec 12, 2023
Changes from 8 commits
a4a1eb5
First version
asafm Jun 25, 2023
ccb52f2
remove whitespace
asafm Jun 25, 2023
87a9197
remove added comma
asafm Jun 25, 2023
9767e39
Remove instrument description from predicate parameters as even the v…
asafm Jun 28, 2023
8a30be4
Marking predicate as experimental
asafm Jun 28, 2023
552167e
Changing the proposal to be more specific by specifying the predicate…
asafm Jul 17, 2023
98e1557
Update specification/metrics/sdk.md
asafm Jul 25, 2023
ccab0a9
Added experimental to FilterRules section
asafm Jul 25, 2023
5b9d5ae
Revert "Added experimental to FilterRules section"
asafm Oct 22, 2023
3e40a6c
Revert "Update specification/metrics/sdk.md"
asafm Oct 22, 2023
8837585
Revert "Changing the proposal to be more specific by specifying the p…
asafm Oct 22, 2023
5bcb595
Replaced the single function with an interface of two functions for p…
asafm Oct 22, 2023
236bd7c
Small fixes
asafm Oct 22, 2023
fd84663
Merge remote-tracking branch 'upstream/main' into add-predicate
asafm Oct 22, 2023
7d5cf0c
PR fixes
asafm Oct 25, 2023
d7a0138
Merge remote-tracking branch 'upstream/main' into add-predicate
asafm Oct 25, 2023
dedc53d
MetricFilter is now a parameter to MetricProducer's Produce operation
asafm Nov 9, 2023
b430e0c
Merge branch 'main' into add-predicate
asafm Nov 9, 2023
a82ffdd
Changed all naming in the interface of MetricFilter
asafm Nov 13, 2023
eb76eee
Remove params from headers
asafm Nov 13, 2023
ba242a0
Changed Reject to drop
asafm Nov 14, 2023
1652efe
Removed too much details
asafm Nov 14, 2023
7bb73b7
Updated TOC
asafm Nov 14, 2023
ad9efd3
Merge branch 'main' into add-predicate
asafm Nov 15, 2023
ee2880b
Fixed kind
asafm Nov 16, 2023
3954c9a
Merge branch 'main' into add-predicate
asafm Nov 19, 2023
70be5ca
lint fixes
asafm Nov 22, 2023
04be7c3
Merge branch 'main' into add-predicate
asafm Nov 22, 2023
e587f7e
Add to CHANGELOG.md and spec-compliance-matrix.md
asafm Nov 23, 2023
d4166bd
fixed CHANGELOG.md
asafm Nov 23, 2023
413763a
Merge branch 'main' into add-predicate
asafm Dec 5, 2023
271398e
Merge branch 'main' into add-predicate
asafm Dec 11, 2023
199 changes: 199 additions & 0 deletions specification/metrics/sdk.md
Expand Up @@ -919,6 +919,8 @@ determines the following capabilities:
* Registering [MetricProducer](#metricproducer)(s)
* Collecting metrics from the SDK and any registered
[MetricProducers](#metricproducer) on demand.
* Setting [FilterRules](#filterrules), which determine which metrics are collected
(**Status**: [Experimental](../document-status.md)).
* Handling the [ForceFlush](#forceflush) and [Shutdown](#shutdown) signals from
the SDK.

Expand All @@ -930,6 +932,13 @@ SHOULD provide at least the following:
* The default output `temporality` (optional), a function of instrument kind. If not configured, the Cumulative temporality SHOULD be used.
* The default aggregation cardinality limit to use, a function of instrument kind. If not configured, a default value of 2000 SHOULD be used.

A `MetricReader` SHOULD allow setting [FilterRules](#filterrules), used to filter the data points returned by the
`Collect` operation.
A `MetricReader` SHOULD allow setting the [FilterRules](#filterrules) more than once. After
[FilterRules](#filterrules) are set, they will be used in subsequent `Collect` operations.
A `MetricReader` SHOULD provide the implementation of [FilterRules](#filterrules) to the SDK's and any registered
[MetricProducer](#metricproducer)(s) as a [MetricFilter](#metricfilter) interface.

The [MetricReader.Collect](#collect) method allows general-purpose
`MetricExporter` instances to explicitly initiate collection, commonly
used with pull-based metrics collection. A common sub-class of
Expand Down Expand Up @@ -999,6 +1008,16 @@ If the [MeterProvider](#meterprovider) is an instance of
MeterProvider, but MUST NOT allow multiple [MeterProviders](#meterprovider)
to be registered with the same MetricReader.

#### SetFilterRules(filterRules)

**Status**: [Experimental](../document-status.md)

SetFilterRules allows setting the [FilterRules](#filterrules), which dictate which metric
data points are returned by the `Collect` operation.

This operation can be called multiple times, allowing the rules to be changed at runtime. The most
recently set rules will be used in subsequent `Collect` operations.
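
The runtime toggling described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical and not part of any OpenTelemetry SDK: the `SketchMetricReader` class, the `produce` callable it wraps, and the `drop_prefixes` stand-in used in place of real `FilterRules`. It only shows that rules set at one point take effect on the next `Collect`.

```python
import threading
from typing import Callable, List, Optional


class SketchMetricReader:
    """Hypothetical reader, used only to illustrate SetFilterRules semantics."""

    def __init__(self, produce: Callable[[Optional[dict]], List[str]]):
        self._produce = produce      # stand-in for the MetricProducer(s)
        self._filter_rules: Optional[dict] = None
        self._lock = threading.Lock()

    def set_filter_rules(self, filter_rules: Optional[dict]) -> None:
        # May be called any number of times; the latest value wins.
        with self._lock:
            self._filter_rules = filter_rules

    def collect(self) -> List[str]:
        # Snapshot the rules so one collection sees one consistent rule set.
        with self._lock:
            rules = self._filter_rules
        return self._produce(rules)


def fake_produce(rules: Optional[dict]) -> List[str]:
    """Stand-in producer: 'drop_prefixes' is an invented, simplified rule shape."""
    names = ["namespace_write_size_bytes", "table_write_size_bytes"]
    if rules is None:
        return names
    prefixes = rules["drop_prefixes"]
    return [n for n in names if not any(n.startswith(p) for p in prefixes)]


reader = SketchMetricReader(fake_produce)
everything = reader.collect()
reader.set_filter_rules({"drop_prefixes": ["table_"]})
filtered = reader.collect()
```

Taking a snapshot of the rules under a lock is one possible design choice; the text above does not prescribe how concurrent `SetFilterRules` and `Collect` calls interact.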

#### Collect

Collects the metrics from the SDK and any registered
Expand Down Expand Up @@ -1036,6 +1055,171 @@ implemented as a blocking API or an asynchronous API which notifies the caller
via a callback or an event. [OpenTelemetry SDK](../overview.md#sdk) authors MAY
decide if they want to make the shutdown timeout configurable.

### FilterRules

(**Status**: [Experimental](../document-status.md))

`FilterRules` provides SDK users the ability to specify which instruments and attributes
should be filtered out when [MetricProducer](#metricproducer)(s) (be it the SDK's or
registered ones) traverse instruments and their attributes during their `Produce` operation.

A typical use case is when an application records metrics at detailed granularity (e.g. high cardinality),
and exports only a portion of them when needed, for a certain period of time. The
ability to call the `SetFilterRules` operation on a `MetricReader` multiple times, at runtime, is what
enables toggling the export of those detailed granularity metrics.

`FilterRules` are composed of the following elements:

- `rules`: A list of `FilterRule`(s). Order is meaningful here, as explained below.

`FilterRule` is composed of the following elements:

- An optional rule `name`
- An instrument selection criteria composed of:
  - The `type` of the instrument(s) (optional)
  - The `name` of the Meter (optional)
  - The `unit` of the instrument(s) (optional)
  - The `name` of the instrument(s). The SDK MUST support wildcard character asterisk (`*`)
    matching zero or more characters, and MAY support the question mark (`?`) matching
    exactly one character
  - Notes:
    - The criteria should be treated as additive, which means the instrument has to meet
      _all_ the provided criteria. For example, if the criteria is
      _instrument name == "Foobar"_ and _instrument type is Histogram_, it will be treated
      as _(instrument name == "Foobar") AND (instrument type is Histogram)_
    - If no criteria are provided, the SDK SHOULD treat it as an error. It is
      recommended that the SDK implementations fail fast. Please refer to [Error
      handling in OpenTelemetry](../error-handling.md) for the general guidance.
- An optional attributes selection criteria, which is a set of single attribute
  selection criteria defined as:
  - The `name` of the attribute
  - A `condition` on the attribute's value. An [Attribute](../common/README.md#attribute)'s
    value can be a primitive type or an array of homogeneous primitive types, and as such
    for each type there are different conditions:
    - A `wildcard_match` condition, for type string. The SDK MUST support wildcard character asterisk (`*`)
      matching zero or more characters, and MAY support the question mark (`?`) matching
      exactly one character. This condition can be applied to any primitive type, as each type can be
      transformed into a string.
    - `greater_than` condition, for number types: double floating point or signed 64 bit integer.
    - `greater_or_equal_to` condition, for number types: double floating point or signed 64 bit integer.
    - `less_than` condition, for number types: double floating point or signed 64 bit integer.
    - `less_than_or_equal_to` condition, for number types: double floating point or signed 64 bit integer.
    - `is` condition, for boolean type.
    - Notes:
      - The SDK MUST support the `wildcard_match` condition.
      - If an attribute value is an array of values, a single value match satisfies the condition. For
        example, if the value is `[NY, CA, TX]` and the condition is _country = "NY"_, then the
        condition evaluates to true since the value `NY` satisfies the condition.
      - If the condition cannot be applied to the attribute value type, it is
        treated as no match (false evaluation). For example, if the condition is `is: true` but the
        attribute value is a number, it will be considered a no match.
  - Notes:
    - The criteria should be considered additive, which means the attribute set has to
      meet _all_ the provided single attribute criteria. For example, if the criteria is
      _status >= 400_ and _host == "study*.io"_, it will be treated as
      _(status >= 400) AND (host == "study*.io")_.
    - If no attributes selection criteria is supplied, all attribute sets for that instrument are selected
      for that rule.
- A `filter` decision enumeration which can either be `drop`, which means that all selected (instrument, attributes)
  pairs are skipped during the `MetricProducer`'s iterative `Produce` operation, or `keep`, which means that all
  selected (instrument, attributes) pairs are included and will appear in the `Produce` operation result.
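
The attribute value conditions listed above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a reference implementation: the function names are hypothetical, non-string values are converted with Python's `str()` before `wildcard_match` (the text does not say how the conversion is done), and the optional `?` wildcard is supported.

```python
import operator
import re
from typing import Any

# Numeric conditions named in the list above, mapped to comparison functions.
_NUMERIC = {
    "greater_than": operator.gt,
    "greater_or_equal_to": operator.ge,
    "less_than": operator.lt,
    "less_than_or_equal_to": operator.le,
}


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate `*` (zero or more chars) and `?` (exactly one char) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _matches_one(condition: str, operand: Any, value: Any) -> bool:
    if condition == "wildcard_match":
        # Assumption: str() is the string transformation for non-string values.
        return wildcard_to_regex(operand).fullmatch(str(value)) is not None
    if condition in _NUMERIC:
        # A condition that does not apply to the value type is a no match.
        return _is_number(value) and _NUMERIC[condition](value, operand)
    if condition == "is":
        return isinstance(value, bool) and value == operand
    raise ValueError(f"unknown condition: {condition}")


def condition_matches(condition: str, operand: Any, value: Any) -> bool:
    """For array values, a single matching element satisfies the condition."""
    values = value if isinstance(value, (list, tuple)) else [value]
    return any(_matches_one(condition, operand, v) for v in values)
```

A full attributes selection would combine several such checks with a logical AND, one per attribute name, as the notes above describe.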

The order of the `FilterRule`(s) in `rules` has meaning: it allows setting a default filtering decision for a wide
selection criteria and overriding it for a narrower selection by adding another rule _after_ the "wide" rule.

Here's an example:

This example is for a distributed relational database, which has namespaces, each containing tables. A table
can have 30 different instruments: write latency, written record size, read latency, etc. There can be hundreds of
namespaces and thousands of tables; therefore, there are namespace granularity instruments (`namespace_*`) and
table granularity instruments (`table_*`). For example: `namespace_write_size_bytes` and
`table_write_size_bytes`. Exporting all instruments across all namespaces and tables can be quite expensive,
hence you can use the `FilterRule`(s) to control it:

```
{
  rules: [
    {
      name: "Drop all table-level instruments by default",
      instrument_select: {
        name: "table_*"
      },
      filter: "drop"
    },

    // For a certain namespace (billing), you want to export `write_size_bytes`
    {
      name: "Expose write size bytes for billing",
      instrument_select: {
        name: "table_write_size_bytes"
      },
      attributes_select: {
        "namespace": {
          wildcard_match: "billing"
        }
      },
      filter: "keep"
    },

    // For orders table in ecommerce namespace, we need all instruments
    {
      name: "All instruments for orders table",
      instrument_select: {
        name: "table_*"
      },
      attributes_select: {
        "namespace": {
          wildcard_match: "ecommerce"
        },
        "table": {
          wildcard_match: "orders"
        }
      },
      filter: "keep"
    }
  ]
}
```
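
To make the ordering semantics of this example concrete, here is a small Python sketch that evaluates the three rules against a few (instrument, attributes) pairs. It rests on two assumptions that the text does not state outright: the last matching rule decides the outcome, and a pair matched by no rule is kept. The dictionary layout and the function names are simplified stand-ins, only the `*` wildcard is handled, and only `wildcard_match` conditions are modelled.

```python
import re
from typing import Dict, List


def wildcard(pattern: str, text: str) -> bool:
    """Match `*` as zero or more characters; everything else is literal."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


# Simplified rendering of the three rules from the example above.
RULES: List[dict] = [
    {"instrument": "table_*", "attributes": {}, "filter": "drop"},
    {"instrument": "table_write_size_bytes",
     "attributes": {"namespace": "billing"}, "filter": "keep"},
    {"instrument": "table_*",
     "attributes": {"namespace": "ecommerce", "table": "orders"}, "filter": "keep"},
]


def rule_selects(rule: dict, instrument_name: str, attributes: Dict[str, str]) -> bool:
    if not wildcard(rule["instrument"], instrument_name):
        return False
    # All single attribute criteria must hold (logical AND).
    return all(
        name in attributes and wildcard(pattern, str(attributes[name]))
        for name, pattern in rule["attributes"].items()
    )


def decide(rules: List[dict], instrument_name: str,
           attributes: Dict[str, str], default: str = "keep") -> str:
    decision = default  # assumption: unmatched pairs are kept
    for rule in rules:  # assumption: later matching rules override earlier ones
        if rule_selects(rule, instrument_name, attributes):
            decision = rule["filter"]
    return decision
```

Under these assumptions, `table_read_latency` for the `billing` namespace is dropped by the first rule, while `table_write_size_bytes` for the same namespace is kept because the second rule overrides the first.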


### `MetricFilter`

**Status**: [Experimental](../document-status.md)

A `MetricFilter` is an interface offering an efficient representation of a [FilterRules](#filterrules)
implementation.

It allows a `MetricProducer` to decide whether an (instrument, attributes) pair should be
rejected (filtered out) or allowed. Some instruments may have all of their attribute sets
rejected or allowed, which lets a `MetricProducer` "short circuit" and either skip an instrument
and all of its attribute sets or include them without further checks.

#### MetricFilter operations

##### FilterInstrument

For a given instrument, returns a filter decision that can be one of the following:
* Allow All Attributes - All attribute sets for the given instrument are allowed.
* Reject All Attributes - All attribute sets for the given instrument are rejected. This allows the
`MetricProducer` to skip this instrument while collecting data points.
* Allow Some Attributes - Some attribute sets may be allowed. This means each attribute set must
be checked with the `AllowInstrumentAttributes` operation to determine whether it is allowed or rejected.

The instrument properties are passed as arguments to the function:
* Meter name
* Instrument name
* Instrument type
* Instrument unit

##### AllowInstrumentAttributes

For a given (instrument, attributes) pair, returns a boolean filtering decision, in which true means allowed and
false means rejected (filtered out).

The instrument properties and the attribute set are passed as arguments to the function:
* Meter name
* Instrument name
* Instrument type
* Instrument unit
* Attributes
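
As an illustration only, the two operations could be expressed in Python roughly as below. The method names, the `InstrumentDecision` enum, the `attributes` parameter, and the `ExampleFilter` policy (including the `debug_` prefix) are all assumptions made for this sketch; the specification text describes the operations abstractly and does not define a language binding.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Mapping


class InstrumentDecision(Enum):
    ALLOW_ALL_ATTRIBUTES = "allow_all_attributes"
    REJECT_ALL_ATTRIBUTES = "reject_all_attributes"
    ALLOW_SOME_ATTRIBUTES = "allow_some_attributes"


class MetricFilter(ABC):
    """Hypothetical Python rendering of the MetricFilter interface."""

    @abstractmethod
    def filter_instrument(self, meter_name: str, instrument_name: str,
                          instrument_type: str, unit: str) -> InstrumentDecision:
        """Decide for a whole instrument, possibly deferring to per-attribute checks."""

    @abstractmethod
    def allow_instrument_attributes(self, meter_name: str, instrument_name: str,
                                    instrument_type: str, unit: str,
                                    attributes: Mapping[str, object]) -> bool:
        """Return True if this (instrument, attributes) pair is allowed."""


class ExampleFilter(MetricFilter):
    """Invented policy: reject debug_*, keep table_* only for namespace billing."""

    def filter_instrument(self, meter_name, instrument_name, instrument_type, unit):
        if instrument_name.startswith("debug_"):
            return InstrumentDecision.REJECT_ALL_ATTRIBUTES
        if instrument_name.startswith("table_"):
            return InstrumentDecision.ALLOW_SOME_ATTRIBUTES
        return InstrumentDecision.ALLOW_ALL_ATTRIBUTES

    def allow_instrument_attributes(self, meter_name, instrument_name,
                                    instrument_type, unit, attributes):
        return attributes.get("namespace") == "billing"


example_filter = ExampleFilter()
```

Splitting the decision into an instrument-level call and an attribute-level call is what enables the short circuit: only instruments answered with the "allow some" decision need a per-attribute-set check.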


### Periodic exporting MetricReader

This is an implementation of the `MetricReader` which collects metrics based on
Expand Down Expand Up @@ -1287,6 +1471,19 @@ in-memory state MAY implement the `MetricProducer` interface for convenience.
`AggregationTemporality` of produced metrics. SDK authors MAY provide utility
libraries to facilitate conversion between delta and cumulative temporalities.

----------
**Status**: [Experimental](../document-status.md)

`MetricProducer` implementations SHOULD allow providing them with a [MetricFilter](#metricfilter). It should
be used to decide, during the `Produce` operation, whether to skip an instrument (including all of its attribute sets),
include the instrument entirely (all of its attribute sets), or allow only some attribute sets for an instrument
by checking whether each attribute set should be filtered.

A `MetricProducer` SHOULD allow changing the [MetricFilter](#metricfilter), which will be used in
subsequent `Produce` operations.
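
A minimal Python sketch of how a producer might apply such a filter while iterating is shown below. The `produce` function, the in-memory instrument layout, the string decision constants, and the `TableDrivenFilter` stub are hypothetical; passing the filter as an argument is just one way of "providing" it. The stub counts attribute-level checks to show that they are skipped for instruments that are allowed or rejected as a whole.

```python
from typing import Dict, List, Optional, Tuple

ALLOW_ALL, REJECT_ALL, ALLOW_SOME = "allow_all", "reject_all", "allow_some"


class TableDrivenFilter:
    """Stub filter: fixed per-instrument decisions plus a namespace allow-list."""

    def __init__(self, decisions: Dict[str, str], allowed_namespaces: set):
        self.decisions = decisions
        self.allowed_namespaces = allowed_namespaces
        self.attribute_checks = 0  # counts calls, to make the short circuit visible

    def filter_instrument(self, meter_name, name, kind, unit) -> str:
        return self.decisions.get(name, ALLOW_ALL)

    def allow_instrument_attributes(self, meter_name, name, kind, unit, attributes) -> bool:
        self.attribute_checks += 1
        return attributes.get("namespace") in self.allowed_namespaces


def produce(instruments: List[dict],
            metric_filter: Optional[TableDrivenFilter] = None) -> List[Tuple[str, dict, float]]:
    points = []
    for inst in instruments:
        ident = (inst["meter"], inst["name"], inst["kind"], inst["unit"])
        decision = ALLOW_ALL if metric_filter is None else metric_filter.filter_instrument(*ident)
        if decision == REJECT_ALL:
            continue  # skip the instrument and all of its attribute sets
        for attributes, value in inst["series"]:
            if decision == ALLOW_SOME and not metric_filter.allow_instrument_attributes(*ident, attributes):
                continue  # this attribute set was filtered out
            points.append((inst["name"], attributes, value))
    return points


INSTRUMENTS = [
    {"meter": "db", "name": "table_write_size_bytes", "kind": "counter", "unit": "By",
     "series": [({"namespace": "billing"}, 10), ({"namespace": "ecommerce"}, 20)]},
    {"meter": "db", "name": "table_read_latency", "kind": "histogram", "unit": "ms",
     "series": [({"namespace": "billing"}, 5)]},
    {"meter": "db", "name": "namespace_write_size_bytes", "kind": "counter", "unit": "By",
     "series": [({"namespace": "billing"}, 30)]},
]
```

With a filter that answers "allow some" for `table_write_size_bytes` and "reject all" for `table_read_latency`, only the two attribute sets of the first instrument are checked individually.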

-------

If the batch of [Metric points](./data-model.md#metric-points) returned by
`Produce()` includes a [Resource](../resource/sdk.md), the `MetricProducer` MUST
accept configuration for the [Resource](../resource/sdk.md).
Expand All @@ -1313,6 +1510,8 @@ A `MetricProducer` MUST support the following functions:

`Produce` provides metrics from the MetricProducer to the caller. `Produce`
MUST return a batch of [Metric points](./data-model.md#metric-points).
Implementations SHOULD use the [MetricFilter](#metricfilter) to filter out (instrument, attributes) pairs
that it did not allow.
`Produce` does not have any required parameters, however, [OpenTelemetry
SDK](../overview.md#sdk) authors MAY choose to add parameters (e.g. timeout).
