-
Notifications
You must be signed in to change notification settings - Fork 810
OM text exposition for NH #1087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Arianna Vespri <[email protected]>
Signed-off-by: Arianna Vespri <[email protected]>
Signed-off-by: Arianna Vespri <[email protected]>
Signed-off-by: Arianna Vespri <[email protected]>
…histograms cohabitation Signed-off-by: Arianna Vespri <[email protected]>
Signed-off-by: Arianna Vespri <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work, I have some suggestions to improve type checking and also the extra spaces need removal
@@ -30,24 +30,24 @@ def generate_latest(registry): | |||
escape_metric_name(mname), _escape(metric.documentation))) | |||
output.append(f'# TYPE {escape_metric_name(mname)} {metric.type}\n') | |||
if metric.unit: | |||
output.append(f'# UNIT {escape_metric_name(mname)} {metric.unit}\n') | |||
output.append(f'# UNIT {escape_metric_name(mname)} {metric.unit}\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove extra spaces from end of lines. Also reduces the diff.
if s.labels: | ||
items = sorted(s.labels.items()) | ||
items = s.labels.items() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OM does not require ordering of labels, still I wonder why change this, doesn't seem part of the scope of the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wanted to use the exact same strings I used in the parser tests (there the labels are not sorted). Since I knew that ordering is not required, I removed the sorting, and once I saw that the pre-existing exposition tests don’t fail, I simply thought I’d kill two birds with one stone, i.e. keeping the same test strings and removing the sorting performance overhead (which I admit I haven’t quantified btw). Having said that, I realize this change is out of the scope of this PR, so of course I will revert it, if you wish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the sorting. I'd have to go and make sure that it doesn't cause an issue anywhere not tested. Probably ok, but for example the sorting is important for multiprocess as that is how we match keys.
timestamp, | ||
exemplarstr, | ||
native_histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this must be placed immediately after value,
because the timestamp and exemplars come last
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd actually say we should just store native histogram into value
and keep the format string intact. There is only one value field allowed in a line so having two fields feels wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing native histograms in value
was my first approach in the initial PR for the parser. That approach was discarded because, for value
to support both float
and nativeHistogram
types, changes would have been necessary also in registry.py
and in bridge/graphite.py
. Those changes were deemed undesirable, thus the decision of adding native_histogram
as a separate attribute of the Sample
class. It was added for last (after exemplars
and timestamp
) as I thought that would have made it less likely to break things downstream rather than inserting it before. Please find the original conversation here: #1040 . While it’s my bad for not having noticed that supporting exemplars had turned from a non-goal to a goal by the time I had already started working on the parser, I knew I should have dealt with that at some point, but I thought to do something along the lines of client_golang, where the exemplars for native histograms (if I understand the code correctly) are represented separately from the “classic” exemplars https://github.com/prometheus/client_golang/blob/e729ba11961abb3a50910c324221d06a35bdce4f/prometheus/histogram.go#L1678 by means of a nativeExemplar
type.
I’ll be happy to open a new PR on the parser first in order to support exemplars and address your change requests, but I can’t do it without reaching a consensus on how class Sample
should look like. The way I see it, there are at least 3 possibilities:
-
Class
Sample
stays as it is, while I add an attribute (say,native_exemplars
of someNativeExemplar
type) to classNativeHistogram
. -
I revert class
Sample
to have thevalue
attribute accepting bothfloat
andNativeHistogram
types, (but then I wouldn’t know how to avoid changingregistry.py
andbridge/graphite.py
). -
I move up the
native_histogram
attribute, so that classSample
becomes:
class Sample(NamedTuple):
name: str
labels: Dict[str, str]
value: float
native_histogram: Optional[NativeHistogram] = None
timestamp: Optional[Union[float, Timestamp]] = None
exemplar: Optional[Exemplar] = None
and change things accordingly elsewhere.
Which one shall I opt for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Go we opted for separate sample types for floats , and native histograms (actually there's two native histogram sample types in Go, but I digress) due to performance issues negatively affecting the float sample values.
I don't know how much of a concern it is in Python, I suspect it's not a big issue.
So my vote is:
let's finish this PR with option 1. My comment about the order of printing is still valid.
Let's try to implement option 2 in a separate PR . I wish we could unify the types in Go as well...
I don't understand option 3, my comments are about the printing order , not the order in the tuple which I don't care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking about changing types elsewhere, just in this small bit of code when doing the formatting. E.g. just above this line do:
value = ''
if s.native_histogram:
value = native_histogram
elif s.value is not None
value = floatToGoString(s.value)
timestamp, | ||
exemplarstr, | ||
native_histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this must be placed immediately after value,
because the timestamp and exemplars come last
@@ -94,6 +95,125 @@ def test_histogram(self): | |||
# EOF | |||
""", generate_latest(self.registry)) | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also add test for exemplars (note that native histograms may have multiple exemplars) see example in https://github.com/prometheus/proposals/blob/main/proposals/2024-01-29_native_histograms_text_format.md
# Signal presence for pos/neg spans/deltas | ||
pos = False | ||
neg = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: these just shadow s.native_histogram.pos_spans
and s.native_histogram.neg_spans
so I'd just write that, but no strong opinion
native_histogram = nh_sample_template.format(*args) | ||
|
||
value = '' | ||
if s.value is not None or not s.native_histogram: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be enough to write
if s.value is not None or not s.native_histogram: | |
if not s.native_histogram: |
At least that's what we do in Prometheus. Type checks should prevent having None in the value.
|
||
def test_native_histogram(self): | ||
hfm = HistogramMetricFamily("nh", "nh") | ||
hfm.add_sample("nh", None, None, None, None, NativeHistogram(24, 100, 0, 0.001, 4, (BucketSpan(0, 2), BucketSpan(1, 2)), (BucketSpan(0, 2), BucketSpan(1, 2)), (2, 1, -3, 3), (2, 1, -2, 3))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type annotation of add_sample says labels: Dict[str, str]
, which means that None
is not allowed. I think adding {}
should work. Same comment on the rest of the tests.
The type annotation of add_sample says value: float
, which means that None
is not allowed. Also valid comment for the other tests here.
hfm.add_sample("nh", None, None, None, None, NativeHistogram(24, 100, 0, 0.001, 4, (BucketSpan(0, 2), BucketSpan(1, 2)), (BucketSpan(0, 2), BucketSpan(1, 2)), (2, 1, -3, 3), (2, 1, -2, 3))) | |
hfm.add_sample("nh", None, 0, None, None, NativeHistogram(24, 100, 0, 0.001, 4, (BucketSpan(0, 2), BucketSpan(1, 2)), (BucketSpan(0, 2), BucketSpan(1, 2)), (2, 1, -3, 3), (2, 1, -2, 3))) |
The reason why mypy
CI check doesn't catch these is due to not having type annotation on the def
line.
@@ -94,6 +95,125 @@ def test_histogram(self): | |||
# EOF | |||
""", generate_latest(self.registry)) | |||
|
|||
|
|||
def test_native_histogram(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_native_histogram(self): | |
def test_native_histogram(self) -> None: |
To get type checks. Also for all the other tests here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple more comments to add to krajorama's.
# Append pos deltas if pos spans were added | ||
if pos: | ||
nh_sample_template += ',positive_deltas:[{}]' | ||
args.append(positive_deltas) | ||
|
||
# Append neg deltas if neg spans were added | ||
if neg: | ||
nh_sample_template += ',negative_deltas:[{}]' | ||
args.append(negative_deltas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't these just be included in the already present if statements above and then there would be no need for the pos
and neg
variables?
timestamp, | ||
exemplarstr, | ||
native_histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd actually say we should just store native histogram into value
and keep the format string intact. There is only one value field allowed in a line so having two fields feels wrong.
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@csmarchbanks and @krajorama , thanks a lot for your reviews! Before I proceed addressing your change requests, let me know how you think class |
For the sake of separation of concerns, this PR focuses only on the OM exposition format for native histograms and it does so by mocking native histograms in tests i.e. it uses the helper function
custom_collector
(as it is already done for gauge histograms) instead of theobserve
function. I will carry out the implementation of the "complete" support for native histograms in client_python in another iteration (I'm already on it).Incidentally, in the exposition logic, the
#HELP
line gets appended before the#TYPE
line, whereas in the parser logic is the other way around; also, in the exposition logic the value is always a float in the OM format, whereas in the parser it doesn't seem to be the norm. In this PR I follow the behaviour already existing in the exposition logic. Let me know if I should change anything in this regard.P.S. Of course this way of arranging the tests is just temporary; in the next iteration, they will be modified to use
observe
accordingly.Edit: this PR addresses part of the issue prometheus/OpenMetrics#279