Improvements to Actuator Metrics #2949

Closed
joshlong opened this issue May 14, 2015 · 15 comments

@joshlong
Member

A few use cases have come up recently. Wanted to start a discussion, if nothing else:

@dsyer @spencergibb @philwebb

Could we ...

Also, what's the recommended way to aggregate all these stats / metrics? Is our advice to use a tool like Graphite / OpenTSDB / Statsd and then analyse / visualize things from there? Is it worth improving docs for that?

@wilkinsona
Member

As chance would have it, Statsd and OpenTSDB support were merged yesterday. There's also some new aggregation stuff, along with the ability to export to JMX from where you can graph things using JConsole or JVisualVM. Check out the updated docs for more info.

@dsyer
Member

dsyer commented May 14, 2015

We also already support Servo (in the sense that Servo metrics are
automatically exposed in /metrics). But you probably wouldn't want to use
Servo natively (it's old and singleton ridden).

HdrHistogram looks interesting. If we ever have our own histogram support
(as opposed to delegating to dropw) that would probably be the way to go.

@brenuart
Contributor

A couple of questions on the subject:

(1) What are your plans to allow for the export of those PublicMetrics? I mean, they are not published in any MetricRepository, so they are not available for export (so far). The same applies to any other metric whose value has to be polled (pull), as opposed to those that push their values when an event occurs.

(2) I'm also interested in your views on where/how you see the aggregation of metric values on a time interval to produce rates, averages, min, max or other derived metrics.
I'm actually looking at something like the ideas behind Servo's BasicTimer, for instance. This monitor publishes 5 different metrics (totalTime, count, avg, min, max) for the last time interval. This approach provides a lower resolution than publishing metrics for every event, but seems an acceptable tradeoff when events occur at a high rate.

Where do you see such aggregation in your design? Should it be the responsibility of the MetricsRepository? The RichGauge seems to be a step in that direction... However, to be really useful, the RichGauge should be reset at some point (like switching to a new interval in Servo's approach). What component would be responsible for that? The repository? Should it happen only during an export?

@dsyer
Member

dsyer commented May 19, 2015

What are your plans to allow for the export of those PublicMetrics

As of yesterday you can add a bean of type MetricsEndpointMetricReader (and a MetricWriter of your choice) to get all the public metrics exported. Please try it out and give us some feedback.

I'm also interested in your views on where/how you see the aggregation of metric values on a time interval to produce rates, averages, min, max or other derived metrics

I think "aggregation" is overloaded and possibly not very useful in this context (if you say "aggregation" I think of combining metrics from multiple physical sources into a logical repository). Maybe "derived" metrics is more appropriate?

Some derived metrics can be accessed from global aggregators (e.g. graphite, openTSDB), so that might suffice for some use cases.

Rich gauges are an idea from Spring XD, and I think they find them pretty useful, but don't use our implementation. I would be interested to hear from anyone who wants to use them. They had feature parity with normal metrics in Boot 1.2 but probably do not yet in 1.3.

If you desperately need something, rather than Servo, I would recommend that you use Dropwizard for local derived metrics. We can certainly look into HdrHistogram as Josh suggested if this is a popular feature and people want native support.

@brenuart
Contributor

As of yesterday you can add a bean of type MetricsEndpointMetricReader (and a MetricWriter of your choice) to get all the public metrics exported. Please try it out and give us some feedback.

Cool. I'll give it a try and keep you informed (on this issue, or do you want me to send you feedback in a separate issue?)

Maybe "derived" metrics is more appropriate?

Let me take an example. I'd like to record metrics for the processing of incoming HTTP requests and send them to Graphite for graphing and later analysis/monitoring. At the end of the day, I'd like to "graph" the processing time and the request rate.

I could use the GaugeService and CounterService services provided by Spring Boot. They are both backed by a MetricWriter to which they forward each recorded value. Starting from Spring Boot 1.3.0 (not yet released), an InMemoryMetricRepository is used by default if DropWizard is not on the classpath. I can then periodically export metric values from there to my Graphite instance (probably using the export mechanisms you introduced recently).

This seems to be a good solution. However, the InMemoryMetricRepository keeps only a single value (the latest) of every metric. Suppose two HTTP requests come in: two executionTime values will be recorded and sent to the GaugeService, but only the latest will be available from the InMemoryMetricRepository.
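
To make the concern concrete, here is a minimal plain-Java sketch of a last-value-wins store. `LastValueStore` is a hypothetical stand-in and not Spring Boot's InMemoryMetricRepository; it only assumes the behaviour described above, namely that each metric name maps to a single, latest value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a repository that keeps one value per metric name.
class LastValueStore {
    private final Map<String, Double> values = new ConcurrentHashMap<>();

    // Each submission overwrites whatever was stored under the same name.
    void set(String name, double value) {
        values.put(name, value);
    }

    // Returns the latest value, or null if nothing was recorded under that name.
    Double get(String name) {
        return values.get(name);
    }
}

class LastValueStoreDemo {
    public static void main(String[] args) {
        LastValueStore store = new LastValueStore();
        store.set("http.executionTime", 120.0); // first request
        store.set("http.executionTime", 45.0);  // second request overwrites the first
        System.out.println(store.get("http.executionTime")); // prints 45.0
    }
}
```

A periodic export that reads from such a store would only ever see the 45.0 sample; the 120.0 sample is gone by the time the exporter runs.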

Another option might be to hook the GaugeService to a different MetricWriter that would simply forward every metric directly to Graphite (similar to the MetricChannelMetricWriter or RedisMetricRepository). We could instead use a CompositeMetricWriter or any other solution to forward the recorded value to multiple destinations if, for instance, an InMemoryMetricRepository is needed in addition to the immediate export mechanism...
But then, I'm a bit concerned about the overhead of forwarding every single event to Graphite... Maybe it is not a problem... Any experience with such an approach?
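
As a rough sketch of the forward-everything idea, the snippet below fans each sample out to several destinations, one of which formats lines in Graphite's plaintext format (`<path> <value> <timestamp>`, timestamp in epoch seconds). `SampleWriter`, `GraphiteLineWriter` and `FanOutWriter` are hypothetical names for illustration, not Spring Boot's MetricWriter or CompositeMetricWriter, and the in-memory list merely stands in for a socket to a Graphite instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical minimal writer contract; not Spring Boot's MetricWriter.
interface SampleWriter {
    void write(String name, double value, long epochSeconds);
}

// Formats each sample as a Graphite plaintext line: "<path> <value> <timestamp>".
class GraphiteLineWriter implements SampleWriter {
    // Stands in for a socket connection to Graphite.
    final List<String> lines = new ArrayList<>();

    @Override
    public void write(String name, double value, long epochSeconds) {
        lines.add(String.format(Locale.ROOT, "%s %s %d", name, value, epochSeconds));
    }
}

// Forwards every sample to all delegates, in the order they were given.
class FanOutWriter implements SampleWriter {
    private final List<SampleWriter> delegates;

    FanOutWriter(SampleWriter... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    @Override
    public void write(String name, double value, long epochSeconds) {
        for (SampleWriter delegate : delegates) {
            delegate.write(name, value, epochSeconds);
        }
    }
}
```

With this shape, the cost per event is one formatted line per destination, which is what the overhead question above comes down to.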

An intermediate solution is to somehow buffer the metrics for a short period of time (say 60s) and export a summary when the time interval is over. This is the approach followed by Netflix's Servo library (as far as I understood). Back to my incoming HTTP requests example, I would then record values for 60s, and once the period is over export multiple metrics:

  • totalTime: the sum of the time recorded during the interval
  • count: the number of samples during the interval
  • min/max: the min/max values during the interval

The precision of the exported metrics is now 60s, but it still allows me to graph and aggregate things on the backend. It has the added benefit of less metric data being exported.
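
The buffer-then-summarise idea can be sketched in plain Java as follows. `IntervalSummary` and its `snapshotAndReset` method are hypothetical names; this is neither Servo's BasicTimer nor anything in Spring Boot, just an illustration of accumulating totalTime, count, min and max and starting over when the interval ends.

```java
// Hypothetical interval buffer: accumulates samples, then hands out a summary and resets.
class IntervalSummary {

    // Immutable summary of one finished interval.
    static final class Snapshot {
        final long count;
        final long totalTime;
        final long min;
        final long max;

        Snapshot(long count, long totalTime, long min, long max) {
            this.count = count;
            this.totalTime = totalTime;
            this.min = min;
            this.max = max;
        }

        // Average derived from totalTime and count; 0.0 for an empty interval.
        double average() {
            return count == 0 ? 0.0 : (double) totalTime / count;
        }
    }

    private long count;
    private long totalTime;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    // Records one sample, e.g. the processing time of a request in milliseconds.
    synchronized void record(long millis) {
        count++;
        totalTime += millis;
        min = Math.min(min, millis);
        max = Math.max(max, millis);
    }

    // Returns the summary of the interval that just ended and starts a new one.
    synchronized Snapshot snapshotAndReset() {
        boolean empty = count == 0;
        Snapshot snapshot = new Snapshot(count, totalTime, empty ? 0 : min, empty ? 0 : max);
        count = 0;
        totalTime = 0;
        min = Long.MAX_VALUE;
        max = Long.MIN_VALUE;
        return snapshot;
    }
}
```

A scheduler (for example a `ScheduledExecutorService` firing every 60 seconds) could call `snapshotAndReset()` and export the four values; that call is where the "reset" would happen in this sketch.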

DropWizard could indeed be used instead of Servo. But it doesn't offer any mechanism to reset the metric after the 60s interval...

What is your opinion on this use case?
How does it fit in Spring Boot's metric framework?
Don't hesitate to tell me that I'm addressing the problem in the wrong way... ;)

@dsyer
Member

dsyer commented May 19, 2015

Your incoming HTTP requests example is exactly a RichGauge isn't it? And if you want richer information then Dropwizard has it in the Timer abstraction (histograms and stuff). I'm not sure what the "reset" feature is though, and I'm lazy, so can you explain what it does?

As far as mainlining to graphite goes, AFAIK there would be virtually no overhead (compared to an HTTP request), but doesn't graphite only store data periodically? So you don't get the rich gauge behaviour out of the box. I think you do with OpenTSDB, but I might be wrong. However, buffering and exporting regularly seems like the right solution for everyone anyway. I just don't understand the need for a reset yet.

If you are already using graphite and are happy with it, and want to use rich gauges, I'd like to get you to try it out and make changes, or give us feedback where necessary.

Starting from Spring Boot 1.3.0 (not yet released), an InMemoryMetricRepository is used by default if DropWizard is not on the classpath

Not quite true. An InMemoryMetricRepository is used if Dropwizard is not on the classpath and Java 8 is not available. If it is not used, though, the intention is that the replacements are equivalent (but better). So you always have a MetricReader available, for instance to do exports.

@chaimt

chaimt commented Aug 19, 2015

Any chance of other options from DropWizard, like timers and meters?

@dsyer
Member

dsyer commented Aug 19, 2015

I guess, if someone wants to implement them. We still support Dropwizard metrics if not.

@iNikem

iNikem commented Jan 28, 2017

Here is one more use case for "derived" metrics. I have a service that accepts binary data from the outside world, and I want to measure the bandwidth it consumes over time. I imagine recording the size of every piece of data accepted, along with its timestamp, sending this to e.g. Graphite, and then plotting sums over given time periods, such as 1 minute.

It seems that Spring Boot currently does not support such usage?

@dsyer
Member

dsyer commented Jan 28, 2017

What is there about that which isn't supported? Graphite does the time bucketing; all you have to do is send the raw data.

@iNikem

iNikem commented Jan 29, 2017

As was mentioned before, GaugeService seems to send only the last value submitted to it, not all values. At least I cannot explain the data I see any other way :)

@dsyer
Member

dsyer commented Jan 29, 2017

@iNikem that's more like a counter than a gauge then. Maybe if the existing export mechanism doesn't work for that use case we could find a way to export all the deltas instead of the sums.
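
To illustrate the difference between exporting sums and exporting deltas, here is a small plain-Java sketch. `DeltaTracker` is a hypothetical helper, not part of Spring Boot's export mechanism; it assumes the exporter can read a cumulative total on each export cycle and wants to send only the change since the previous cycle.

```java
// Hypothetical helper: turns a cumulative total into per-export deltas.
class DeltaTracker {
    private long lastExported;

    // Returns the change since the previous call and remembers the new total.
    synchronized long delta(long currentTotal) {
        long delta = currentTotal - lastExported;
        lastExported = currentTotal;
        return delta;
    }
}

class DeltaTrackerDemo {
    public static void main(String[] args) {
        DeltaTracker tracker = new DeltaTracker();
        System.out.println(tracker.delta(100)); // prints 100
        System.out.println(tracker.delta(350)); // prints 250
        System.out.println(tracker.delta(350)); // prints 0
    }
}
```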

@iNikem

iNikem commented Jan 29, 2017

Reading the source, it seems that bypassing CounterService (which can only increment by 1), using CounterBuffers directly, and wrapping it in a BufferMetricReader will do the trick for me.

So the main takeaway from my use case: the documentation could be updated to mention a way to increment counters by arbitrary values :)
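
As an illustration of what "increment by arbitrary values" means for the bandwidth use case, here is a plain-Java sketch built on `java.util.concurrent.atomic.LongAdder`. `ByteCounters` is a hypothetical class, not Spring Boot's CounterBuffers, and the metric name used in the demo is made up.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical named counters that accept arbitrary increments.
class ByteCounters {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Adds the given amount (e.g. the size of a received payload) to the named counter.
    void add(String name, long amount) {
        counters.computeIfAbsent(name, key -> new LongAdder()).add(amount);
    }

    // Returns the running total for the name, or 0 if nothing was recorded.
    long total(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }
}

class ByteCountersDemo {
    public static void main(String[] args) {
        ByteCounters counters = new ByteCounters();
        counters.add("bytes.received", 512);
        counters.add("bytes.received", 2048);
        System.out.println(counters.total("bytes.received")); // prints 2560
    }
}
```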

@wilkinsona
Member

This has been superseded by the planned move to Micrometer-based metrics (#9970)

9 participants