
[FIXED] Prevent stale consumer state losing consumers#7905

Merged
neilalexander merged 2 commits into main from wq/consumer-snap-err on Mar 5, 2026

Conversation

@wallyqs
Member

@wallyqs wallyqs commented Mar 4, 2026

When processClusterCreateConsumer applies initial state from a consumer assignment that carries stale state, setStoreState calls store.Update, which can return an error if the consumer already has newer state loaded from disk. Previously this error propagated up and caused the consumer to be stopped and its Raft node deleted, so the consumer disappeared after a server restart. The server logged the following before that happened:

... error on store update from snapshot entry: old update ignored

This change adds an ErrStoreOldUpdate error, which setStoreState now treats as a non-error because the consumer store already holds newer, correct state.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
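
The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the nats-server code: `memStore`, `consumerState`, the `Delivered` field, and `errNilState` are hypothetical stand-ins, and the text of `ErrStoreOldUpdate` is assumed from the log line quoted above. Only the pattern is taken from the PR: the stale-update sentinel is swallowed, and every other error still propagates.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStoreOldUpdate is named in this PR; the message text here is an
// assumption based on the quoted log line.
var ErrStoreOldUpdate = errors.New("old update ignored")

// errNilState is a hypothetical error used to show that other failures
// are still reported.
var errNilState = errors.New("nil state")

// consumerState is a hypothetical, trimmed-down stand-in for consumer state.
type consumerState struct {
	Delivered uint64
}

// memStore is a hypothetical in-memory store that rejects stale updates.
type memStore struct {
	state consumerState
}

// Update refuses any state older than what the store already holds.
func (s *memStore) Update(st *consumerState) error {
	if st == nil {
		return errNilState
	}
	if st.Delivered < s.state.Delivered {
		return ErrStoreOldUpdate
	}
	s.state = *st
	return nil
}

// setStoreState treats ErrStoreOldUpdate as a non-error: the store already
// has newer state, so there is nothing to fix and nothing to tear down.
func setStoreState(s *memStore, st *consumerState) error {
	if err := s.Update(st); err != nil && err != ErrStoreOldUpdate {
		return err
	}
	return nil
}

func main() {
	store := &memStore{state: consumerState{Delivered: 10}}
	// Stale state from the assignment: ignored, newer state is kept.
	err := setStoreState(store, &consumerState{Delivered: 5})
	fmt.Println(err, store.state.Delivered) // prints "<nil> 10"
}
```

With this shape, the caller only sees errors that indicate a real problem, so a stale assignment no longer triggers the stop-and-delete path.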

When `processClusterCreateConsumer` applies initial state from a
snapshot during recovery, `setStoreState` calls `store.Update` which can
return `old update ignored` if the consumer already has newer state.
Previously this error propagated up and caused the consumer
to be stopped and its Raft node deleted, making it disappear.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
@wallyqs wallyqs force-pushed the wq/consumer-snap-err branch from c349505 to a709932 March 4, 2026 19:41
@wallyqs wallyqs marked this pull request as ready for review March 4, 2026 19:43
@wallyqs wallyqs requested a review from a team as a code owner March 4, 2026 19:43
@wallyqs
Member Author

wallyqs commented Mar 4, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 👍


@wallyqs wallyqs changed the title from "Prevent stale consumer state losing consumers on restart" to "Prevent stale consumer state losing consumers during async snapshot" Mar 4, 2026
@MauriceVanVeen MauriceVanVeen changed the title from "Prevent stale consumer state losing consumers during async snapshot" to "[FIXED] Prevent stale consumer state losing consumers" Mar 5, 2026
Member

@MauriceVanVeen MauriceVanVeen left a comment


LGTM

@MauriceVanVeen
Member

Could you also add a similar ErrStoreOldUpdate check in jetstream_cluster.go? When replaying entries from a snapshot while there's more recent state on disk, we would otherwise log an error even though it isn't actually an issue.

if err = o.store.Update(state); err != nil && err != ErrStoreOldUpdate {

main...wq/async-snap-consumer-err#diff-5cb252c37caef12e7027803018861c82724b120ddb62cfedc2f36addf57f6970R6440

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Member

@neilalexander neilalexander left a comment


LGTM

@neilalexander neilalexander merged commit 1ce56f7 into main Mar 5, 2026
68 of 70 checks passed
@neilalexander neilalexander deleted the wq/consumer-snap-err branch March 5, 2026 10:16
neilalexander added a commit that referenced this pull request Mar 5, 2026
Related to #7905, the meta
snapshots should never contain `ca.State` as this is only used to signal
updating to specific state in the apply path, not during recovery from a
snapshot.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
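
One way to picture the idea in the commit message above is to clear the state field on a copy of each assignment before it goes into a snapshot. The sketch below is illustrative only and makes no claim about how the referenced commit does it: `consumerAssignment`, `consumerState`, and `forSnapshot` are hypothetical names, with `State` standing in for `ca.State`.

```go
package main

import "fmt"

// consumerState is a hypothetical stand-in for consumer state.
type consumerState struct {
	Delivered uint64
}

// consumerAssignment is a hypothetical, trimmed-down assignment record.
// State plays the role of ca.State: a signal used only in the apply path.
type consumerAssignment struct {
	Name  string
	State *consumerState
}

// forSnapshot returns a shallow copy with State cleared, so that a
// snapshot built from it never carries apply-path state into recovery.
// The original assignment is left untouched.
func forSnapshot(ca *consumerAssignment) *consumerAssignment {
	cp := *ca
	cp.State = nil
	return &cp
}

func main() {
	ca := &consumerAssignment{Name: "C1", State: &consumerState{Delivered: 5}}
	snap := forSnapshot(ca)
	fmt.Println(snap.State == nil, ca.State != nil) // prints "true true"
}
```

Copying rather than mutating keeps the in-memory assignment usable by the apply path while the snapshot stays free of stale state.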
zenozaga pushed a commit to zenozaga/nats-server that referenced this pull request Mar 6, 2026
Related to nats-io#7905, the meta
snapshots should never contain `ca.State` as this is only used to signal
updating to specific state in the apply path, not during recovery from a
snapshot.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Randy stiven Valentin <zenozaga@gmail.com>