Sync: Gracefully handle blocks from an unknown fork #11085

Merged: bkchr merged 6 commits into master from bkchr-sync-fork-fix on Feb 20, 2026

Conversation

bkchr (Member) commented Feb 17, 2026

Node A may connect to node B while both are at the same best block (20). Shortly afterwards, node B announces a block 21 that belongs to a completely different fork (one that started at, e.g., block 15). Right now this leads to node A downloading block 21 and then failing to import it because it doesn't have the parent block.

This pull request solves this situation by putting the peer into ancestry search when it detects a fork that is "unknown".
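
The behaviour described above can be sketched roughly as follows. This is a simplified model, not the code from the pull request: `Peer`, `Announce`, `PeerState`, and `on_best_block_announce` are hypothetical stand-ins, hashes and block numbers are plain `u64` values, and the `current` field of `AncestorSearch` is an assumption based on the truncated diff further down.

```rust
// Simplified stand-ins for the real sync types; every name here is an assumption.
type Hash = u64;
type Number = u64;

#[derive(Debug, Clone, PartialEq, Eq)]
enum PeerState {
    Available,
    AncestorSearch { current: Number },
}

#[derive(Debug)]
struct Peer {
    best_hash: Hash,
    common_number: Number,
    state: PeerState,
}

struct Announce {
    hash: Hash,
    parent_hash: Hash,
    number: Number,
}

/// Handles a "best block" announcement from a peer (hypothetical helper).
fn on_best_block_announce(
    peer: &mut Peer,
    announce: &Announce,
    known: bool,
    known_parent: bool,
    best_queued_number: Number,
    finalized_number: Number,
) {
    // Evaluate the check against the previously announced best hash,
    // before that hash is overwritten below.
    let continues_known_fork =
        known || known_parent || announce.parent_hash == peer.best_hash;

    peer.best_hash = announce.hash;

    if !continues_known_fork {
        // Unknown fork: lower the common number to at most the finalized
        // block and search for a common ancestor with this peer.
        peer.common_number = peer.common_number.min(finalized_number);
        peer.state = PeerState::AncestorSearch {
            current: announce.number.min(best_queued_number),
        };
    }
}
```

With a peer at best hash `20` and an announced block 21 whose parent is neither known locally nor equal to the peer's last best hash, this model moves the peer into `AncestorSearch` instead of leaving it available for a direct download of block 21.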

bkchr added the T0-node label (This PR/Issue is related to the topic “node”.) Feb 17, 2026
bkchr (Member Author) commented Feb 17, 2026

/cmd prdoc --audience node_dev --bump patch

if !continues_known_fork {
let current = number.min(best_queued_number);
peer.common_number = peer.common_number.min(self.client.info().finalized_number);
peer.state = PeerSyncState::AncestorSearch {

Contributor

Could we get this peer stuck in an AncestorSearch? What would be the worst case if peer B is maliciously advertising a block 21? Could it force us to go back to genesis (e.g., if the block is 1M and on a malicious fork)?

Member Author

The node cannot go back to block 21, especially if this block is below the last finalized block. Ancestry search is always a state held with a single peer, not with all peers together. So while we are doing an ancestry search with B, we can still import blocks from other peers.
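
The per-peer nature of the search can be illustrated with a small model. This is a sketch under assumptions, not the real sync code: `PeerState` and `peers_usable_for_download` are hypothetical names, and peers are identified by plain strings.

```rust
use std::collections::BTreeMap;

// Hypothetical, reduced per-peer state; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PeerState {
    Available,
    AncestorSearch,
}

/// Returns the peers that can still serve block requests.
/// A peer that is busy with an ancestor search is skipped, all others are unaffected.
fn peers_usable_for_download<'a>(peers: &BTreeMap<&'a str, PeerState>) -> Vec<&'a str> {
    peers
        .iter()
        .filter(|(_, state)| **state == PeerState::Available)
        .map(|(id, _)| *id)
        .collect()
}
```

If peer B is in `AncestorSearch` while peers C and D are `Available`, the function returns C and D, which matches the point that one misbehaving peer only occupies its own slot.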

peer.update_common_number(number.saturating_sub(One::one()));
}

// If this announced block isn't following any known fork, we have do start an ancestor

Contributor

nit: / we have do start / we have to start

Comment on lines +519 to +524
// The node is continuing a known fork if either the block itself is known, the parent is
// known or the block references the previously announced `best_hash`.
let continues_known_fork =
known || known_parent || announce.header.parent_hash() == &peer.best_hash;

let best_queued_number = self.best_queued_number;

Contributor

nit: these can be moved inside the if is_best branch

@@ -521,11 +534,29 @@ where
// is either one further ahead or it's the one they just announced, if we know about it.
if is_best {
if known && self.best_queued_number >= number {

Contributor

nit: here and below could use best_queued_number or remove the let best_queued_number?

let mut branch1 = None;
for i in 0..2 {
let at = if i == 0 { BlockId::Number(10) } else { BlockId::Hash(branch1.unwrap()) };
branch1 = net.peer(0).push_blocks_at(at, 1, i == 0).pop();

Contributor

dq: why not true instead of i == 0 (i.e., only the first block has a tx)?

Member Author

The test block builder failed. I did not want to dig deeper xD

 ) -> Result<ImportResult, Self::Error> {
-    self.inner.check_block(block).await
+    let result = self.inner.check_block(block).await;
+    if !matches!(result, Ok(ImportResult::Imported(_) | ImportResult::AlreadyInChain)) {

Contributor

Nit: Should KnownBad also be tolerated here?
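
To make the question concrete, here is a minimal sketch of the two variants of the check. It uses a local, payload-free `ImportResult` enum as a stand-in for the real one, and both function names are hypothetical. What the wrapper does when the check fails is not visible in the diff above, so only the predicate is modelled.

```rust
// Simplified mirror of an import result; payloads are omitted (assumption).
#[derive(Debug)]
enum ImportResult {
    Imported,
    AlreadyInChain,
    KnownBad,
    UnknownParent,
}

/// The check as it appears in the diff: only successful outcomes pass.
fn accepted_as_written(result: &Result<ImportResult, String>) -> bool {
    matches!(
        result,
        Ok(ImportResult::Imported | ImportResult::AlreadyInChain)
    )
}

/// The variant the reviewer asks about: `KnownBad` is tolerated as well.
fn accepted_with_known_bad(result: &Result<ImportResult, String>) -> bool {
    matches!(
        result,
        Ok(ImportResult::Imported | ImportResult::AlreadyInChain | ImportResult::KnownBad)
    )
}
```

The only difference between the two is the outcome for `Ok(ImportResult::KnownBad)`; errors and `UnknownParent` are rejected by both.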

// The node is continuing a known fork if either the block itself is known, the parent is
// known or the block references the previously announced `best_hash`.
let continues_known_fork =
known || known_parent || announce.header.parent_hash() == &peer.best_hash;

Contributor

Can we run into a race condition here? For example, a previous block triggered an ancestor search and peer.best_hash is already set. Then the next block gets announced and this condition would be true.

Member Author

When the peer is in ancestry search mode, this method aborts early (check above).
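
A minimal sketch of that guard, assuming a reduced `PeerState` and a hypothetical `fork_check` helper (neither is the real API): while the peer is in ancestor search, the fork check is never evaluated.

```rust
// Hypothetical, reduced per-peer state.
#[derive(Debug, PartialEq, Eq)]
enum PeerState {
    Available,
    AncestorSearch,
}

/// Returns `None` when the announcement is skipped because the peer is
/// already in ancestor search, otherwise the result of the fork check.
fn fork_check(
    state: &PeerState,
    known: bool,
    known_parent: bool,
    parent_hash: u64,
    peer_best_hash: u64,
) -> Option<bool> {
    if *state == PeerState::AncestorSearch {
        // Early abort: a follow-up announcement cannot be misread as
        // continuing a known fork while the search is still running.
        return None;
    }
    Some(known || known_parent || parent_hash == peer_best_hash)
}
```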


// The node is continuing a known fork if either the block itself is known, the
// parent is known or the block references the previously announced `best_hash`.
let continues_known_fork =

Contributor

How about moving this inside

if is_best {
    let continues_known_fork =
        known || known_parent || announce.header.parent_hash() == &peer.best_hash;
    (...)
}

where it is only used?

Member Author

This doesn't work, because by then peer.best_hash may already have been updated.
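
The ordering issue can be shown with a small sketch. `Peer`, `check_before_update`, and `check_after_update` are hypothetical names, and hashes are plain `u64` values; the sketch only models the comparison against `best_hash`.

```rust
type Hash = u64;

struct Peer {
    best_hash: Hash,
}

/// Compares against the previously announced best hash, then updates it.
fn check_before_update(peer: &mut Peer, hash: Hash, parent_hash: Hash) -> bool {
    let continues = parent_hash == peer.best_hash;
    peer.best_hash = hash;
    continues
}

/// Updates first, so the comparison sees the hash of the block that was
/// just announced instead of the previous best hash.
fn check_after_update(peer: &mut Peer, hash: Hash, parent_hash: Hash) -> bool {
    peer.best_hash = hash;
    parent_hash == peer.best_hash
}
```

For a peer whose last best hash is `20` and an announcement of block `21` with parent `20`, the first function reports that the announcement continues the known fork, while the second does not, because the comparison then runs against `21`.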

bkchr added this pull request to the merge queue Feb 20, 2026
Merged via the queue into master with commit 3f9ad6a Feb 20, 2026
246 of 251 checks passed
bkchr deleted the bkchr-sync-fork-fix branch February 20, 2026 23:31
lexnv added a commit that referenced this pull request Mar 6, 2026