Skip to content

Upgrade Prometheus and Opentelemetry dependencies#6422

Merged
fab-10 merged 9 commits intobesu-eth:mainfrom
fab-10:upgrade-prometheus
Jan 30, 2024
Merged

Upgrade Prometheus and Opentelemetry dependencies#6422
fab-10 merged 9 commits intobesu-eth:mainfrom
fab-10:upgrade-prometheus

Conversation

@fab-10
Copy link
Copy Markdown
Contributor

@fab-10 fab-10 commented Jan 17, 2024

PR description

Upgrade Prometheus and Opentelemetry dependencies, and adjust the code accordingly.

Prometheus simpleclient upgrade introduces some changed, to follow the OpenMetrics specification, like adding the _total suffix to every counters that haven't it, and adding a _created suffixed metric to some types, while the latter is not an issue, the _total suffix is a breaking change, and requires an updated version of our official Grafana dashboard, that has been already published here, and is compatible with both metrics names.
Using the metrics by other means require a change to the tooling, for this a breaking change section has been added.

Fixed Issue(s)

@github-actions
Copy link
Copy Markdown

github-actions bot commented Jan 17, 2024

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@fab-10 fab-10 force-pushed the upgrade-prometheus branch 3 times, most recently from 365f5e9 to a818871 Compare January 19, 2024 14:40
@fab-10 fab-10 changed the title Upgrade prometheus Upgrade Prometheus and Opentelemetry dependencies Jan 19, 2024
@fab-10 fab-10 force-pushed the upgrade-prometheus branch from fd2aab2 to 66bdc6b Compare January 19, 2024 15:46
@fab-10 fab-10 marked this pull request as ready for review January 19, 2024 15:47
Copy link
Copy Markdown
Contributor

@pinges pinges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 Question and minor improvement suggestions

final String unPrefixedName =
metricName.startsWith(prefix) ? metricName.substring(prefix.length()) : metricName;
return totalSuffixedCounters.contains(unPrefixedName + "_total")
? unPrefixedName + "_total"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this not end up with two "_total" suffixes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this could be a little misleading, but internally Prometheus stores the counter name without the _total suffix, and if a name already has the suffix, it is removed, so when converting back from Prometheus to Besu names, if we remember that the name already had the suffix, we need to readd it.

@fab-10 fab-10 force-pushed the upgrade-prometheus branch from e8470b9 to 3af8312 Compare January 22, 2024 10:25
fab-10 and others added 8 commits January 23, 2024 09:28
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
[skip-ci]

Co-authored-by: Stefan Pingel <16143240+pinges@users.noreply.github.com>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
@fab-10 fab-10 force-pushed the upgrade-prometheus branch from 3af8312 to 8936877 Compare January 23, 2024 08:28
@fab-10 fab-10 requested a review from pinges January 23, 2024 09:33
@fab-10 fab-10 enabled auto-merge (squash) January 23, 2024 09:33
Copy link
Copy Markdown
Contributor

@pinges pinges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

## 24.1.2-SNAPSHOT

### Breaking Changes
- Following the OpenMetrics convention, the updated Prometheus client adds the `_total` suffix to every metrics of type counter, with the effect that some existing metrics have been renamed to have this suffix. If you are using the official Besu Grafana dashboard [(available here)](https://grafana.com/grafana/dashboards/16455-besu-full/), just update it to the latest revision, that accepts the old and the new name of the affected metrics. If you have a custom dashboards or use the metrics in other ways, then you need to manually update it to support the new naming.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Following the OpenMetrics convention, the updated Prometheus client adds the `_total` suffix to every metrics of type counter, with the effect that some existing metrics have been renamed to have this suffix. If you are using the official Besu Grafana dashboard [(available here)](https://grafana.com/grafana/dashboards/16455-besu-full/), just update it to the latest revision, that accepts the old and the new name of the affected metrics. If you have a custom dashboards or use the metrics in other ways, then you need to manually update it to support the new naming.
- Following the OpenMetrics convention, the updated Prometheus client adds the `_total` suffix to every metrics of type counter, with the effect that some existing metrics have been renamed to have this suffix. If you are using the official Besu Grafana dashboard [(available here)](https://grafana.com/grafana/dashboards/16455-besu-full/), just update it to the latest revision, that accepts the old and the new name of the affected metrics. If you have a custom dashboard or use the metrics in other ways, then you need to manually update it to support the new naming.

CHANGELOG.md Outdated
- `--Xfilter-on-enr-fork-id` has been removed. To disable the feature use `--filter-on-enr-fork-id=false`.
- The time that can be spent selecting transactions during block creation is not capped at 5 seconds for PoS and PoW networks, and for PoA networks, at 75% of the block period specified in the genesis, this to prevent possible DoS in case a single transaction is taking too long to execute, and to have a stable block production rate, but it could be a breaking change if an existing network used to have transactions that takes more time to executed that the newly introduced limit, if it is mandatory for these network to keep processing these long processing transaction, then the default value of `block-txs-selection-max-time` or `poa-block-txs-selection-max-time` needs to be tuned accordingly.
- `--Xfilter-on-enr-fork-id` has been removed. To disable the feature use `--filter-on-enr-fork-id=false`.
- The time that can be spent selecting transactions during block creation is not capped at 5 seconds for PoS and PoW networks, and for PoA networks, at 75% of the block period specified in the genesis. This is to prevent possible DoS attacks in case a single transaction is taking too long to execute, and to have a stable block production rate. This could be a breaking change if an existing network needs to accept transactions that take more time to executed than the newly introduced limit. If it is mandatory for these networks to keep processing these long processing transaction, then the default value of `block-txs-selection-max-time` or `poa-block-txs-selection-max-time` needs to be tuned accordingly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The time that can be spent selecting transactions during block creation is not capped at 5 seconds for PoS and PoW networks, and for PoA networks, at 75% of the block period specified in the genesis. This is to prevent possible DoS attacks in case a single transaction is taking too long to execute, and to have a stable block production rate. This could be a breaking change if an existing network needs to accept transactions that take more time to executed than the newly introduced limit. If it is mandatory for these networks to keep processing these long processing transaction, then the default value of `block-txs-selection-max-time` or `poa-block-txs-selection-max-time` needs to be tuned accordingly.
- The time that can be spent selecting transactions during block creation is not capped at 5 seconds for PoS and PoW networks, and for PoA networks, at 75% of the block period specified in the genesis. This is to prevent possible DoS attacks in case a single transaction is taking too long to execute, and to have a stable block production rate. This could be a breaking change if an existing network needs to accept transactions that take more time to execute than the newly introduced limit. If it is mandatory for these networks to keep processing these long processing transaction, then the default value of `block-txs-selection-max-time` or `poa-block-txs-selection-max-time` needs to be tuned accordingly.

@macfarla
Copy link
Copy Markdown
Contributor

macfarla commented Jan 29, 2024

FYI @non-fungible-nelson these debug level logs still occurring even with this upgrade ->

dev-elc-besu-teku-goerli-sally-6422-uprev-otel:~$ grep B3PropagatorExtractorMultipleHeaders /var/log/besu/besu.log
{"@timestamp":"2024-01-29T04:24:26,403","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:26,404","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:26,847","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:26,862","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:27,286","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:27,300","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}
{"@timestamp":"2024-01-29T04:24:27,676","level":"DEBUG","thread":"vert.x-eventloop-thread-1","class":"B3PropagatorExtractorMultipleHeaders","message":"Invalid TraceId in B3 header: null'. Returning INVALID span context.","throwable":""}

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@fab-10 fab-10 merged commit a563cf2 into besu-eth:main Jan 30, 2024
@fab-10 fab-10 deleted the upgrade-prometheus branch January 30, 2024 10:35
jflo pushed a commit to jflo/besu that referenced this pull request Jan 30, 2024
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Co-authored-by: Stefan Pingel <16143240+pinges@users.noreply.github.com>
Signed-off-by: Justin Florentine <justin+github@florentine.us>
@fab-10 fab-10 restored the upgrade-prometheus branch January 31, 2024 13:02
@fab-10 fab-10 deleted the upgrade-prometheus branch January 31, 2024 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants