zombienet test for measuring block propagation time #8956

Open: sistemd wants to merge 12 commits into base: master

sistemd
Contributor

@sistemd sistemd commented Jun 23, 2025

We need a way to measure the impact of #65. This PR implements a zombienet test (ignored by default) which creates a sparsely connected network and measures block propagation time. To get a better average, the test runs 20 times, then logs the average and median block propagation times.

Here are some example results. There is still some spread between the measurements, but it's probably good enough to show a measurable improvement after #65 is implemented.

[INFO  lib::propagation_time] Average propagation time: 3.7399299124999996 seconds
[INFO  lib::propagation_time] Median propagation time: 4.153729524999999 seconds

[INFO  lib::propagation_time] Average propagation time: 3.0715945434444443 seconds
[INFO  lib::propagation_time] Median propagation time: 4.087850897 seconds

[INFO  lib::propagation_time] Average propagation time: 3.784516305428571 seconds
[INFO  lib::propagation_time] Median propagation time: 4.139099133 seconds

[INFO  lib::propagation_time] Average propagation time: 3.5878196885555553 seconds
[INFO  lib::propagation_time] Median propagation time: 3.432116791 seconds

[INFO  lib::propagation_time] Average propagation time: 4.551509407875001 seconds
[INFO  lib::propagation_time] Median propagation time: 4.9109004385 seconds

[INFO  lib::propagation_time] Average propagation time: 3.2545806564444444 seconds
[INFO  lib::propagation_time] Median propagation time: 3.516505925 seconds

[INFO  lib::propagation_time] Average propagation time: 4.010925618 seconds
[INFO  lib::propagation_time] Median propagation time: 4.239232554 seconds

[INFO  lib::propagation_time] Average propagation time: 3.772498191714285 seconds
[INFO  lib::propagation_time] Median propagation time: 4.134265307 seconds

[INFO  lib::propagation_time] Average propagation time: 4.389500488875 seconds
[INFO  lib::propagation_time] Median propagation time: 4.5417552285 seconds

[INFO  lib::propagation_time] Average propagation time: 4.219312816142857 seconds
[INFO  lib::propagation_time] Median propagation time: 4.938675248 seconds

Closes #8895.

@sistemd sistemd added T10-tests This PR/Issue is related to tests. R0-no-crate-publish-required The change does not require any crates to be re-published. labels Jun 24, 2025
@skunert skunert self-requested a review June 25, 2025 09:25
@michalkucharczyk
Contributor

michalkucharczyk commented Jun 26, 2025

dq: should we put some transactions into the blocks? This will affect the import time (since transactions must be re-executed before gossiping), so the gains from the improvement should be much larger with full blocks. Does that make sense?

} else {
    propagation_times[propagation_times.len() / 2]
};
log::info!("Median propagation time: {median} seconds");
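
The fragment above is the tail of the median computation. As a rough, self-contained sketch of how the average and median could be derived from a list of propagation times: the helper names `mean` and `median` are hypothetical and not taken from the PR, and the sample is the first distribution logged later in this thread.

```rust
/// Arithmetic mean of the samples; `None` for an empty slice.
/// (Hypothetical helper, not the PR's code.)
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Median of the samples; `None` for an empty slice.
/// For an even count, the two middle values are averaged.
/// (Hypothetical helper, not the PR's code.)
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    // Sort a copy so the caller's data is left untouched.
    let mut sorted = samples.to_vec();
    sorted.sort_by(f64::total_cmp);
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

fn main() {
    // Propagation times in seconds, copied from a log line in this thread.
    let times = [
        2.263069223, 2.334497576, 2.375355276, 2.394275715, 3.076662812,
        3.101237717, 3.193503226, 3.778434994, 3.818934525, 3.890964703,
    ];
    if let (Some(avg), Some(med)) = (mean(&times), median(&times)) {
        println!("Average propagation time: {avg} seconds");
        println!("Median propagation time: {med} seconds");
    }
}
```

Returning `Option` avoids a division by zero or an out-of-bounds index if a run produces no measurements.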
Contributor

@michalkucharczyk michalkucharczyk Jun 26, 2025

We could also look at p90 / p99. Maybe we'll get some improvements there. It would also be interesting to see the distribution.
This could be used for computing them.

Contributor Author

@sistemd sistemd Jun 27, 2025

Printing the distribution and percentiles: bd21c5b

Here are some outputs:

[INFO  lib::propagation_time] Propagation times distribution: [2.263069223, 2.334497576, 2.375355276, 2.394275715, 3.076662812, 3.101237717, 3.193503226, 3.778434994, 3.818934525, 3.890964703]
[INFO  lib::propagation_time] Average propagation time: 3.0226935767 seconds
[INFO  lib::propagation_time] Median propagation time: 3.0889502645 seconds
[INFO  lib::propagation_time] 90th percentile propagation time: 3.864553637733333 seconds
[INFO  lib::propagation_time] 99th percentile propagation time: 3.890964703 seconds

[INFO  lib::propagation_time] Propagation times distribution: [0.82404183, 1.513064236, 2.345535011, 2.356429218, 2.405158283, 2.451870854, 3.063113397, 3.843502431, 3.909441943, 4.698152424]
[INFO  lib::propagation_time] Average propagation time: 2.7410309627 seconds
[INFO  lib::propagation_time] Median propagation time: 2.4285145685 seconds
[INFO  lib::propagation_time] 90th percentile propagation time: 4.408958580966668 seconds
[INFO  lib::propagation_time] 99th percentile propagation time: 4.698152424 seconds

[INFO  lib::propagation_time] Propagation times distribution: [1.43154837, 1.5166815919999999, 3.016164291, 3.094109591, 3.18914449, 3.268004095, 3.780973459, 3.940997147, 4.676256722, 4.692760257]
[INFO  lib::propagation_time] Average propagation time: 3.260664001400001 seconds
[INFO  lib::propagation_time] Median propagation time: 3.2285742925000003 seconds
[INFO  lib::propagation_time] 90th percentile propagation time: 4.686708960833333 seconds
[INFO  lib::propagation_time] 99th percentile propagation time: 4.692760257 seconds

[INFO  lib::propagation_time] Propagation times distribution: [0.731311724, 1.492740515, 1.607959545, 2.265557216, 2.984051086, 3.135189638, 3.215782294, 3.7648819209999997, 3.819687167, 3.92278444]
[INFO  lib::propagation_time] Average propagation time: 2.6939945546000006 seconds
[INFO  lib::propagation_time] Median propagation time: 3.059620362 seconds
[INFO  lib::propagation_time] 90th percentile propagation time: 3.884982106566667 seconds
[INFO  lib::propagation_time] 99th percentile propagation time: 3.92278444 seconds

It's still not as good as I would hope. I upped the test count from 10 to 20; we'll see if that improves the situation. (It certainly makes the test 2x slower 😄)
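
For readers who want to reproduce the percentile lines above, here is a minimal sketch of one possible estimator. It is an assumption, not the PR's code: the thread does not show which method or crate is used. The sketch uses the rank rule `h = (n + 1/3) * p + 1/3` (often called "type 8" in the Hyndman and Fan classification) with linear interpolation between neighbouring samples. The function name `quantile` is hypothetical.

```rust
/// Estimate the `p`-quantile (0.0..=1.0) of an ascending-sorted slice.
/// Uses the fractional 1-based rank h = (n + 1/3) * p + 1/3 and linear
/// interpolation. This rule is an assumption; the PR may use another one.
fn quantile(sorted: &[f64], p: f64) -> Option<f64> {
    if sorted.is_empty() || !(0.0..=1.0).contains(&p) {
        return None;
    }
    let n = sorted.len();
    let h = (n as f64 + 1.0 / 3.0) * p + 1.0 / 3.0;
    // Ranks outside [1, n] are clamped to the smallest / largest sample.
    if h <= 1.0 {
        return Some(sorted[0]);
    }
    if h >= n as f64 {
        return Some(sorted[n - 1]);
    }
    let lower = h.floor() as usize; // 1-based rank, in 1..n
    let frac = h - h.floor();
    // Interpolate between the samples at ranks `lower` and `lower + 1`.
    Some(sorted[lower - 1] + frac * (sorted[lower] - sorted[lower - 1]))
}

fn main() {
    // First distribution logged above (already sorted, in seconds).
    let times = [
        2.263069223, 2.334497576, 2.375355276, 2.394275715, 3.076662812,
        3.101237717, 3.193503226, 3.778434994, 3.818934525, 3.890964703,
    ];
    for (label, p) in [("90th", 0.90), ("99th", 0.99)] {
        if let Some(q) = quantile(&times, p) {
            println!("{label} percentile propagation time: {q} seconds");
        }
    }
}
```

On the first distribution above, this rule gives about 3.8646 seconds for p90 and the largest sample for p99, which agrees with the logged values. That agreement does not prove the PR uses this estimator. With only 10 or 20 samples, p99 collapses to the maximum, so it adds little beyond the distribution itself.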

@sistemd sistemd force-pushed the sistemd/sparsely-connected-peers-propagation-time-zombienet-test branch from 59c136f to 2659875 Compare June 26, 2025 17:25
@sistemd
Contributor Author

sistemd commented Jun 27, 2025

dq: should we put some transactions into the blocks? This will affect the import time (since transactions must be re-executed before gossiping), so the gains from the improvement should be much larger with full blocks. Does that make sense?

I think it's a really good idea: c8f4678.

@sistemd sistemd requested a review from michalkucharczyk June 27, 2025 22:40
@sistemd sistemd force-pushed the sistemd/sparsely-connected-peers-propagation-time-zombienet-test branch from 3d3a39d to 2f761a8 Compare July 3, 2025 13:41
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/16052071208
Failed job name: cargo-clippy

@sistemd sistemd force-pushed the sistemd/sparsely-connected-peers-propagation-time-zombienet-test branch from 0a64f85 to ff3fd3d Compare July 3, 2025 14:13
@sistemd
Contributor Author

sistemd commented Jul 3, 2025

Using the glutton pallet: ff3fd3d.

@michalkucharczyk
Contributor

I think it's a really good idea: c8f4678.

Did it actually increase timings? Would you share some updated output?

Successfully merging this pull request may close these issues.

Figure out an approach for collecting block propagation data