Skip to content

Peering - disconnect peers that have multiple discovery ports#7089

Merged
pinges merged 11 commits intobesu-eth:mainfrom
pinges:InvestigatePeeringIssues
May 12, 2024
Merged

Peering - disconnect peers that have multiple discovery ports#7089
pinges merged 11 commits intobesu-eth:mainfrom
pinges:InvestigatePeeringIssues

Conversation

@pinges
Copy link
Copy Markdown
Contributor

@pinges pinges commented May 10, 2024

PR description

In discovery we seemed to see a lot of "misconfigured" nodes that share the same IP address and TCP port, but use different discovery ports. This PR checks for these nodes and removes them from the peer table.

Fixes #6805

pinges and others added 8 commits May 8, 2024 13:58
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
@macfarla
Copy link
Copy Markdown
Contributor

very cool find @pinges - are there more tests you wanted to add?

@macfarla
Copy link
Copy Markdown
Contributor

11 holesky nodes - 10 of them get to 100% peers within 2h
Screenshot 2024-05-10 at 2 43 23 PM

the 11th one eventually gets to 100% peers (although it does take ~7h)
Screenshot 2024-05-10 at 2 44 21 PM

Current behaviour on holesky is > 90% of nodes get stuck at 0 peers indefinitely (require setting static peers to sync)

See #6805

@macfarla
Copy link
Copy Markdown
Contributor

16 mainnet nodes - all of them get to 100% peers within 1h.
Screenshot 2024-05-10 at 2 48 38 PM

Current behaviour on mainnet is a small percentage - maybe 5% - of nodes get stuck on 0 peers and require a restart.
example - restarted at 23:00 which is when it starts to get peer connections.
Screenshot 2024-05-10 at 2 51 09 PM

@macfarla macfarla changed the title Investigate peering issues Peering - disconnect peers that have multiple discovery ports May 10, 2024
Copy link
Copy Markdown
Contributor

@macfarla macfarla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a changelog entry

Copy link
Copy Markdown
Contributor

@macfarla macfarla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just needs a changelog entry

private final LHMWithMaxSize<String, Integer> ipAddressCheckMap =
new LHMWithMaxSize<>(DEFAULT_BUCKET_SIZE * N_BUCKETS);
private final CircularFifoQueue<String> invalidIPs =
new CircularFifoQueue<>(DEFAULT_BUCKET_SIZE * N_BUCKETS);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious - why these specific collection types?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CircularFifo is a simple queue with a maximum size.
LHMWithMaxSize is a private class, it is a LinkedHashMap that I have extended to have a maximum size.
Lukas complained about the name of the class and I will rename it :-)

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
private final Map<Bytes, Integer> distanceCache;
private BloomFilter<Bytes> idBloom;
private int evictionCnt = 0;
private final LHMWithMaxSize<String, Integer> ipAddressCheckMap =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rename LHMWithMaxSize as LinkedHashMapWithMaxSize to be clearer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do :-)

pinges added 2 commits May 12, 2024 18:54
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
@pinges pinges merged commit 5fa1750 into besu-eth:main May 12, 2024
jflo pushed a commit to jflo/besu that referenced this pull request May 28, 2024
…me IP and TCP port with different discovery ports (besu-eth#7089)

Find and remove peers from the peer table that share the same IP and TCP port with different discovery ports
---------

Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Justin Florentine <justin+github@florentine.us>
matthew1001 pushed a commit to kaleido-io/besu that referenced this pull request Jun 7, 2024
…me IP and TCP port with different discovery ports (besu-eth#7089)


Find and remove peers from the peer table that share the same IP and TCP port with different discovery ports
---------

Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
jflo pushed a commit to jflo/besu that referenced this pull request Jun 10, 2024
…me IP and TCP port with different discovery ports (besu-eth#7089)

Find and remove peers from the peer table that share the same IP and TCP port with different discovery ports
---------

Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Justin Florentine <justin+github@florentine.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Low Peer Numbers

3 participants