Commit 715b75b

beorn7 committed
2nd round of code reviews after major rework.
1 parent 6efe3d3 commit 715b75b

1 file changed: +33 -29 lines changed

content/docs/practices/histograms.md

Lines changed: 33 additions & 29 deletions
@@ -18,13 +18,7 @@ First of all, check the library support for
 both currently only exists in the Go client library. Many libraries
 support only one of the two types, or they support summaries only in a
 limited fashion (lacking [quantile
-calculation](#quantiles)). [Contributions are welcome](/community/),
-of course. In general, we expect histograms to be more urgently needed
-than summaries. Histograms are also easier to implement in a client
-library, so we recommend to implement histograms first, if in
-doubt. The reason why some libraries offer summaries but not
-histograms (Ruby, the legacy Java client) is that histograms are a
-more recent feature of Prometheus.
+calculation](#quantiles)).

 ## Count and sum of observations

@@ -35,20 +29,20 @@ durations or response sizes. They track the number of observations
 (showing up in Prometheus as a time series with a `_count` suffix) is
 inherently a counter (as described above, it only goes up). The sum of
 observations (showing up as a time series with a `_sum` suffix)
-behaves like a counter, too, as long as all observations are
-positive. Obviously, request durations or response sizes are always
-positive. In principle, however, you can use summaries and histograms
-to observe negative values (e.g. temperatures in centigrade). In that
-case, the sum of observations can go down, so you cannot apply
-`rate()` to it anymore.
+behaves like a counter, too, as long as there are no negative
+observations. Obviously, request durations or response sizes are
+never negative. In principle, however, you can use summaries and
+histograms to observe negative values (e.g. temperatures in
+centigrade). In that case, the sum of observations can go down, so you
+cannot apply `rate()` to it anymore.

 To calculate the average request duration during the last 5 minutes
 from a histogram or summary called `http_request_duration_second`, use
 the following expression:

-rate(http_request_duration_seconds_sum[5m])
-/
-rate(http_request_duration_seconds_count[5m])
+rate(http_request_duration_seconds_sum[5m])
+/
+rate(http_request_duration_seconds_count[5m])

 ## Apdex score

@@ -64,9 +58,9 @@ requests served within 300ms and easily alert if the value drops below
 served in the last 5 minutes. The request durations were collected with
 a histogram called `http_request_duration_seconds`.

-sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
-/
-sum(rate(http_request_duration_seconds_count[5m])) by (job)
+sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
+/
+sum(rate(http_request_duration_seconds_count[5m])) by (job)


 You can calculate the well-known [Apdex
@@ -75,13 +69,13 @@ a bucket with the target request duration as upper bound and another
 bucket with the tolerated request duration (usually 4 times the target
 request duration) as upper bound. Example: The target request duration
 is 300ms. The tolerable request duration is 1.2s. The following
-expression yields the Apdex score over the last 5 minutes:
+expression yields the Apdex score for each job over the last 5 minutes:

 (
-rate(http_request_duration_seconds_bucket{le="0.3"}[5m])
-+
-rate(http_request_duration_seconds_bucket{le="1.2"}[5m])
-) / 2 / rate(http_request_duration_seconds_count[5m])
+sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
++
+sum(rate(http_request_duration_seconds_bucket{le="1.2"}[5m])) by (job)
+) / 2 / sum(rate(http_request_duration_seconds_count[5m])) by (job)

 ## Quantiles

@@ -213,8 +207,18 @@ Two rules of thumb:

 1. If you need to aggregate, choose histograms.

-2. Otherwise, choose a histogram if you need accuracy in the
-dimension of the observed values and you have an idea in which
-ranges of observed values you are interested in. Choose a summary
-if you need accuracy in the dimension of φ, no matter in which
-ranges of observed values the quantile will end up.
+2. Otherwise, choose a histogram if you have an idea of the range
+and distribution of values that will be observed. Choose a
+summary if you need an accurate quantile, no matter what the
+range and distribution of the values is.
+
+
+## What can I do if my client library does not support the metric type I need?
+
+Implement it! [Code contributions are welcome](/community/). In
+general, we expect histograms to be more urgently needed than
+summaries. Histograms are also easier to implement in a client
+library, so we recommend to implement histograms first, if in
+doubt. The reason why some libraries offer summaries but not
+histograms (Ruby, the legacy Java client) is that histograms are a
+more recent feature of Prometheus.
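
The Apdex expression in the diff above adds the `le="0.3"` and `le="1.2"` bucket rates and divides by 2. This works because histogram buckets are cumulative: the tolerated bucket already contains every request counted in the target bucket. A minimal Python sketch of that arithmetic follows. The function names and the sample numbers are hypothetical and only illustrate the formula; they are not part of the commit or of any client library.

```python
import math


def apdex_from_buckets(le_target: float, le_tolerated: float, total: float) -> float:
    """Apdex from cumulative bucket rates, mirroring the PromQL in the diff.

    le_target    -- rate of requests at or below the target duration
    le_tolerated -- rate of requests at or below the tolerated duration
                    (cumulative, so it includes le_target)
    total        -- rate of all requests (the `_count` series)
    """
    if total == 0:
        return math.nan  # no traffic in the window, score is undefined
    return (le_target + le_tolerated) / 2 / total


def apdex_classic(satisfied: float, tolerating: float, total: float) -> float:
    """Textbook form: satisfied requests count fully, tolerating ones half."""
    if total == 0:
        return math.nan
    return (satisfied + tolerating / 2) / total


# Hypothetical rates: 60 req/s within 300ms, 90 req/s within 1.2s, 100 req/s overall.
score = apdex_from_buckets(60.0, 90.0, 100.0)
# The same traffic in non-cumulative terms: 60 satisfied, 90 - 60 = 30 tolerating.
same_score = apdex_classic(60.0, 30.0, 100.0)
```

Both calls describe the same traffic, so the two forms agree. The cumulative form needs no subtraction, which is why the PromQL expression can use the two `_bucket` series directly.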
