@@ -18,13 +18,7 @@ First of all, check the library support for
both currently only exists in the Go client library. Many libraries
support only one of the two types, or they support summaries only in a
limited fashion (lacking [quantile
- calculation](#quantiles)). [Contributions are welcome](/community/),
- of course. In general, we expect histograms to be more urgently needed
- than summaries. Histograms are also easier to implement in a client
- library, so we recommend to implement histograms first, if in
- doubt. The reason why some libraries offer summaries but not
- histograms (Ruby, the legacy Java client) is that histograms are a
- more recent feature of Prometheus.
+ calculation](#quantiles)).

## Count and sum of observations

@@ -35,20 +29,20 @@ durations or response sizes. They track the number of observations
(showing up in Prometheus as a time series with a `_count` suffix) is
inherently a counter (as described above, it only goes up). The sum of
observations (showing up as a time series with a `_sum` suffix)
- behaves like a counter, too, as long as all observations are
- positive. Obviously, request durations or response sizes are always
- positive. In principle, however, you can use summaries and histograms
- to observe negative values (e.g. temperatures in centigrade). In that
- case, the sum of observations can go down, so you cannot apply
- `rate()` to it anymore.
+ behaves like a counter, too, as long as there are no negative
+ observations. Obviously, request durations or response sizes are
+ never negative. In principle, however, you can use summaries and
+ histograms to observe negative values (e.g. temperatures in
+ centigrade). In that case, the sum of observations can go down, so you
+ cannot apply `rate()` to it anymore.

To calculate the average request duration during the last 5 minutes
from a histogram or summary called `http_request_duration_seconds`, use
the following expression:

-     rate(http_request_duration_seconds_sum[5m])
-     /
-     rate(http_request_duration_seconds_count[5m])
+       rate(http_request_duration_seconds_sum[5m])
+     /
+       rate(http_request_duration_seconds_count[5m])

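The `_sum` and `_count` series used in this expression come straight out of the client library once the metric is instrumented. Below is a minimal sketch of such instrumentation with the Go client library (`github.com/prometheus/client_golang`); the bucket boundaries, handler, and port are illustrative assumptions, not values prescribed by this page:

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestDuration is a histogram of request latencies. Scraping /metrics
// exposes it as http_request_duration_seconds_bucket{le="..."} series plus
// the http_request_duration_seconds_sum and _count series used above.
var requestDuration = promauto.NewHistogram(prometheus.HistogramOpts{
	Name: "http_request_duration_seconds",
	Help: "HTTP request latencies in seconds.",
	// Assumed bucket boundaries: 0.3 and 1.2 match the SLO and Apdex
	// examples on this page, the rest are made up for illustration.
	Buckets: []float64{0.05, 0.1, 0.2, 0.3, 0.6, 1.2, 2.5, 5},
})

func handler(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	// ... handle the request ...
	w.Write([]byte("ok"))
	// A single Observe call updates every bucket as well as _sum and _count.
	requestDuration.Observe(time.Since(start).Seconds())
}

func main() {
	http.HandleFunc("/", handler)
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```

The same averaging expression works unchanged if `http_request_duration_seconds` is a summary instead, since summaries expose the same `_sum` and `_count` series.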
## Apdex score

@@ -64,9 +58,9 @@ requests served within 300ms and easily alert if the value drops below
served in the last 5 minutes. The request durations were collected with
a histogram called `http_request_duration_seconds`.

-     sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
-     /
-     sum(rate(http_request_duration_seconds_count[5m])) by (job)
+       sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
+     /
+       sum(rate(http_request_duration_seconds_count[5m])) by (job)


You can calculate the well-known [Apdex
@@ -75,13 +69,13 @@ a bucket with the target request duration as upper bound and another
bucket with the tolerated request duration (usually 4 times the target
request duration) as upper bound. Example: The target request duration
is 300ms. The tolerable request duration is 1.2s. The following
- expression yields the Apdex score over the last 5 minutes:
+ expression yields the Apdex score for each job over the last 5 minutes:

    (
-       rate(http_request_duration_seconds_bucket{le="0.3"}[5m])
-     +
-       rate(http_request_duration_seconds_bucket{le="1.2"}[5m])
-     ) / 2 / rate(http_request_duration_seconds_count[5m])
+       sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
+     +
+       sum(rate(http_request_duration_seconds_bucket{le="1.2"}[5m])) by (job)
+     ) / 2 / sum(rate(http_request_duration_seconds_count[5m])) by (job)

## Quantiles

@@ -213,8 +207,18 @@ Two rules of thumb:

1. If you need to aggregate, choose histograms.

- 2. Otherwise, choose a histogram if you need accuracy in the
-    dimension of the observed values and you have an idea in which
-    ranges of observed values you are interested in. Choose a summary
-    if you need accuracy in the dimension of φ, no matter in which
-    ranges of observed values the quantile will end up.
+ 2. Otherwise, choose a histogram if you have an idea of the range
+    and distribution of values that will be observed. Choose a
+    summary if you need an accurate quantile, no matter what the
+    range and distribution of the values is.
+
+
+ ## What can I do if my client library does not support the metric type I need?
+
+ Implement it! [Code contributions are welcome](/community/). In
+ general, we expect histograms to be more urgently needed than
+ summaries. Histograms are also easier to implement in a client
+ library, so we recommend implementing histograms first, if in
+ doubt. The reason why some libraries offer summaries but not
+ histograms (Ruby, the legacy Java client) is that histograms are a
+ more recent feature of Prometheus.
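At the instrumentation level, the difference behind the second rule of thumb above is mainly how the two types are configured: a histogram takes explicit bucket boundaries chosen up front, while a summary takes the φ-quantiles (and allowed error per quantile) it should compute. A rough sketch with the Go client library, which already supports both; all concrete boundary and objective values here are assumptions for illustration:

```go
package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

func main() {
	// Histogram: bucket boundaries must be chosen up front, which is why it
	// helps to know the expected range and distribution of the observations.
	// These boundaries are assumptions for illustration only.
	histogram := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "http_request_duration_seconds",
		Help:    "HTTP request latencies in seconds.",
		Buckets: []float64{0.05, 0.1, 0.2, 0.3, 0.6, 1.2, 2.5, 5},
	})

	// Summary: instead of buckets, you declare the φ-quantiles to compute and
	// the allowed error per quantile; the range of observed values does not
	// matter for the accuracy of the reported quantiles.
	summary := prometheus.NewSummary(prometheus.SummaryOpts{
		// Hypothetical metric name, chosen only to avoid clashing with the
		// histogram above.
		Name:       "http_request_duration_seconds_summary",
		Help:       "HTTP request latencies in seconds.",
		Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
	})

	prometheus.MustRegister(histogram, summary)

	// Both types are fed the same way, one observation at a time.
	histogram.Observe(0.27)
	summary.Observe(0.27)
}
```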