NRG: Improve membership changes checks & leader is caught-up for forwarded proposals#7809
NRG: Improve membership changes checks & leader is caught-up for forwarded proposals#7809neilalexander merged 5 commits intomainfrom
Conversation
server/raft.go
Outdated
| return | ||
| } | ||
| prop := n.prop | ||
| n.membChanging = true |
There was a problem hiding this comment.
Setting the membChanging here is unnecessary. It may actually be harmful.
Setting membChanging to true is done in sendMembershipChange when EntryRemovePeer entries are picked up from the proposal queue and sent.
There was a problem hiding this comment.
Setting
membChangingto true is done insendMembershipChangewhenEntryRemovePeerentries are picked up from the proposal queue and sent.
This doesn't seem to be the case currently? Although, I agree it may be unsafe to mark membChanging before actually sending it.
Have updated the PR to only mark membChanging when sending (or when receiving on a follower), and only unset when applying it as a commit. This also means sendMembershipChange may receive multiple membership change proposals from the queue, but it now knows to ignore it if one was already sent previously.
4c04d98
There was a problem hiding this comment.
Yes, my comment was wrong. My worry was exactly that: pushing to n.prop does not mean that the message was sent, therefore a leader could set membChanging to true, and immediately become a follower. In that case a proposal to remove a peer could be dropped, without resetting membChanging to false. Your latest commit seems to fix that commit 4c04d98
There was a problem hiding this comment.
You could consider squashing the latest commit, together with the first.
There was a problem hiding this comment.
Think I'd rather keep them separate, since otherwise there's a bunch of conflicting changes to resolve.
4c04d98 to
c6326ed
Compare
sciascid
left a comment
There was a problem hiding this comment.
I think these changes could lead to a scenario where membChanging is set, while no membership change is in progress. For instance, if a leader manages to log a EntryPeerRemove, but its message is lost, and the leader changes. Upon leader change, the uncommitted entry may need to be delete from the log of the old leader. At this point, the old leader has membChanging set, but no membership change is in progress.
Should leadership go back to this node, new membership changes would be rejected...
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
c6326ed to
2102261
Compare
|
Have updated the PR to reset |
2102261 to
3cae01c
Compare
sciascid
left a comment
There was a problem hiding this comment.
Nice patch set!
Optional: consider renaming membChanging. The name makes me think that this is a boolean. But now it is an index. I'm thinking membChangeIndex... or something along those lines.
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
3cae01c to
f8f86a2
Compare
This PR improves a couple areas in our Raft logic:
logContainsUncommittedMembershipChangeafter we've won the leader election, we rely on the scan we do during startup of the Raft node and setn.membChangingif we process such an append entry. If the scan oflogContainsUncommittedMembershipChangetakes near the minimum election timeout, we could continuously cycle leader changes without making progress.n.commitbased on what's in its log, and not be immediately overwhelmed by loads of forwarded proposals (for example due to inactive thresholds deleting consumers). This should help limit the growth of the log if many forwarded proposals are retried.Signed-off-by: Maurice van Veen github@mauricevanveen.com