Skip to content

Shutdown while awaiting lockin #1313

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

rustyrussell
Copy link
Contributor

No description provided.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Signed-off-by: Rusty Russell <[email protected]>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fixes: ElementsProject#1308
Signed-off-by: Rusty Russell <[email protected]>
Copy link
Contributor

@ZmnSCPxj ZmnSCPxj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK, mostly trusting Rusty superior knowledge of state machine though.

@ZmnSCPxj
Copy link
Contributor

ZmnSCPxj commented Apr 3, 2018

ACK ed767c8

Does this address #1308? Yes

Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commits should probably be reordered (test needs to move down) otherwise we broke the bisectability of master.

FWIW, I don't think this addresses #1308 since that fails on the transition after channeld_shutting_down:

/* This sets channel->owner, closes down channeld. */
peer_start_closingd(channel, &cs, gossip_index, fds[0], fds[1], false);
channel_set_state(channel, CHANNELD_SHUTTING_DOWN, CLOSINGD_SIGEXCHANGE);

@@ -77,9 +77,9 @@ static void peer_got_shutdown(struct channel *channel, const u8 *msg)
}

/* If we weren't already shutting down, we are now */
if (channel->state == CHANNELD_NORMAL)
if (channel->state != CHANNELD_SHUTTING_DOWN)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might a bit too permissive now? We probably also don't want to accept in state sigexchange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're in channeld (which we are), we have to be in AWAITING_LOCKIN or NORMAL or SHUTTING_DOWN. So this is actually safe.

@ZmnSCPxj
Copy link
Contributor

ZmnSCPxj commented Apr 3, 2018

I think the PR makes the state machine transition to CHANNELD_SHUTTING_DOWN from CHANNELD_AWAITING_LOCKIN, so that the code you quoted becomes valid.

@cdecker
Copy link
Member

cdecker commented Apr 3, 2018

Oh I see, we're forcing the channel state into shutting down, then tell channeld to die, and then start closingd directly afterwards. That was a level of indirection too many for me 😉

IMHO there still is a tiny probability that there is a channel state transition between receiving the shutdown and channeld dying (channel confirmation exactly at the wrong moment). So I'd prefer widening the condition or refusing to transition to a prior state after shutdown was reached.

@rustyrussell
Copy link
Contributor Author

It's channeld which tells us that it received shutdown, so it can't die at the same time?

The order is:

  1. channeld tells us it got the remote shutdown, we save the scriptpubkey for closing and update state to CHANNELD_SHUTTING_DOWN if not already done.
  2. When channeld has cleared all the HTLCs, it tells us shutdown is complete, we update state to CLOSINGD_SIGEXCHANGE and start closingd (killing channeld in the process).

Note that if we reconnect, the peer can re-send shutdown so we need to handle the case where we're already in CHANNELD_SHUTTING_DOWN.

We could eliminate the CHANNELD_SHUTTING_DOWN state and use the existence of the remote closing scriptpubkey as a flag, but as shutdown can be extended, I think having the state is clearer to users.

@rustyrussell
Copy link
Contributor Author

Oh, and on test order: I've taken to leaving the test-then-fix order in my pushes, using unittest.expectedFail() and removing it in the fix. That proves that I actually tested it before and after :)

@cdecker cdecker merged commit 1f9ad06 into ElementsProject:master Apr 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants