net/metrics: Add metrics for inbound/outbound traffic #10846

Merged
lexnv merged 5 commits into master from lexnv/req-resp-metrics
Jan 23, 2026

@lexnv
Contributor

@lexnv lexnv commented Jan 20, 2026

This PR adds new metrics for the inbound and outbound traffic of individual request-response protocols.

  • The PR is motivated by "Kusama OOM for rpc & boot node" (#10765), which shows a significant number of downloaded bytes (4-5 MiB/s). This is suspicious for a fully synced validator that is within 1-2 blocks of the tip of the chain.
  • It suggests that a protocol is consuming too much bandwidth internally, leading to network inefficiencies, wasted CPU and, in the case of that issue, OOM kills.
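
To illustrate the idea of per-protocol traffic accounting, here is a minimal sketch using only the standard library. It is not the code from this PR: the real change registers its counters with Prometheus, whereas `TrafficMetrics`, `Direction`, `inc_bytes`, and `total` are hypothetical names chosen for this example, and the protocol strings in the usage note are made up.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Direction of the traffic relative to the local node.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Direction {
    Inbound,
    Outbound,
}

/// Hypothetical stand-in for a labelled Prometheus counter:
/// one monotonically increasing byte count per (protocol, direction).
#[derive(Default)]
pub struct TrafficMetrics {
    bytes: Mutex<HashMap<(String, Direction), u64>>,
}

impl TrafficMetrics {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add `bytes` to the counter of the given protocol and direction.
    pub fn inc_bytes(&self, protocol: &str, direction: Direction, bytes: usize) {
        let mut map = self.bytes.lock().expect("metrics mutex poisoned");
        let counter = map.entry((protocol.to_owned(), direction)).or_insert(0);
        // Saturate instead of wrapping so the counter never goes backwards.
        *counter = counter.saturating_add(bytes as u64);
    }

    /// Read the current total; protocols that were never seen report 0.
    pub fn total(&self, protocol: &str, direction: Direction) -> u64 {
        let map = self.bytes.lock().expect("metrics mutex poisoned");
        map.get(&(protocol.to_owned(), direction)).copied().unwrap_or(0)
    }
}
```

With counters keyed this way, calling `inc_bytes("/example/sync/1", Direction::Inbound, payload.len())` on every received request makes it possible to see which protocol accounts for an unexpectedly high download rate, rather than only the node-wide total.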

cc @paritytech/sdk-node

lexnv added 3 commits January 20, 2026 13:45
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv self-assigned this Jan 20, 2026
@lexnv lexnv requested a review from a team January 20, 2026 13:55
@lexnv lexnv mentioned this pull request Jan 20, 2026
2 tasks
@lexnv lexnv added the T0-node This PR/Issue is related to the topic “node”. label Jan 20, 2026
}

/// Register inbound bytes (request payload received from peer) to Prometheus
pub fn register_inbound_request_bytes(&self, bytes: usize) {
Contributor

naming nit:

Suggested change
- pub fn register_inbound_request_bytes(&self, bytes: usize) {
+ pub fn inc_inbound_request_bytes(&self, bytes: usize) {

@lexnv
Contributor Author

lexnv commented Jan 23, 2026

/cmd prdoc --audience node_dev --bump patch

@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/21281884388
Failed job name: test-linux-stable

@lexnv lexnv enabled auto-merge January 23, 2026 10:58
@lexnv lexnv added this pull request to the merge queue Jan 23, 2026
Merged via the queue into master with commit 5a27459 Jan 23, 2026
236 of 239 checks passed
@lexnv lexnv deleted the lexnv/req-resp-metrics branch January 23, 2026 12:16
github-merge-queue bot pushed a commit that referenced this pull request Jan 30, 2026
…st (#10886)

Under load, 5 seconds might not be enough for the CI to capture all
events generated by the net backend.

In this PR, the timeout is increased to 60s after which a hard panic
occurs.

Discovered during:
- #10846

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

Labels

T0-node This PR/Issue is related to the topic “node”.

Projects

Status: Done

6 participants