
Fix infinite request loops in cached stores with retry-aware error handling #21920

Merged
davelopez merged 7 commits into galaxyproject:dev from dannon:fix/keyed-cache-retry-handling
Feb 25, 2026

Member

@dannon dannon commented Feb 25, 2026

Fixes #21886. Builds on the stable-branch fix in #21881, which addressed the useKeyedCache infinite loop, and extends the same error-tracking and retry pattern to the rest of the frontend caching infrastructure on dev.

The core problem is the same across all three caching patterns: a computed getter triggers a fetch when data is missing, the fetch fails, but nobody records the failure, so the next render re-triggers the fetch forever. #21881 solved this for useKeyedCache stores by adding loadingErrors tracking and a retry gate. This PR brings that fix forward to dev with additional improvements, and applies the same pattern to the two stores that roll their own caching.

For useKeyedCache stores, this PR adds ApiError and rethrowSimpleWithStatus so fetch handlers can preserve HTTP status codes from the API response. All 12 cache-backed fetch handlers across 7 files are updated to use this, which enables the retry logic to distinguish transient server errors (429, 5xx) from permanent ones (403, 404).
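As a rough illustration of what preserving the status code means, here is a minimal sketch. The `FailedCall` shape, the `err_msg` field, and both signatures are assumptions made for the example; the actual `ApiError` and `rethrowSimpleWithStatus` in simple-error.ts may look different.

```typescript
// Hypothetical sketch: not the actual simple-error.ts implementation.
class ApiError extends Error {
    readonly status?: number;

    constructor(message: string, status?: number) {
        super(message);
        this.name = "ApiError";
        this.status = status;
        // Keeps instanceof checks working if the code is compiled to an ES5 target.
        Object.setPrototypeOf(this, ApiError.prototype);
    }
}

// Assumed shape of a failed API call: an error payload plus the HTTP response.
interface FailedCall {
    error: { err_msg?: string } | string;
    response?: { status: number };
}

// Throws an ApiError that carries the HTTP status alongside the message,
// so callers can later tell a 404 apart from a 503.
function rethrowSimpleWithStatus(failed: FailedCall): never {
    const message =
        typeof failed.error === "string" ? failed.error : (failed.error.err_msg ?? "Request failed");
    throw new ApiError(message, failed.response?.status);
}
```

A handler that catches this error can read `status` directly instead of parsing it out of the message text.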

The historyStore has its own fetch-on-miss pattern where getHistoryById calls loadHistoryById for missing histories. Failures were thrown via rethrowSimple without being tracked, causing the same infinite loop. Now getHistoryByIdFromServer returns a GalaxyApiResult<T> discriminated union instead of throwing, errors are tracked in historyLoadErrors, and getHistoryById gates on them with the same retry semantics.

The collectionElementsStore had partial protection — getCollectionById and getDetailedCollectionById checked loadingCollectionElementsErrors before fetching — but errors were raw Error objects without HTTP status, so transient failures were treated the same as permanent ones. Now fetchCollectionDetails returns GalaxyApiResult<HDCADetailed>, errors carry status codes, and both getters use retry logic.

The retry helpers and GalaxyApiResult<T> type are extracted into simple-error.ts as shared infrastructure for all three patterns.
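For readers unfamiliar with the pattern, a result type of this kind might be sketched as follows. The exact definition of `GalaxyApiResult<T>` is an assumption here (only the `{ data, error }` shape is taken from this PR), `HDCADetailed` is trimmed to two illustrative fields, and the stand-in fetcher is synchronous and in-memory purely to keep the example runnable.

```typescript
// Minimal stand-in for the ApiError class; the real one may differ.
class ApiError extends Error {
    readonly status?: number;

    constructor(message: string, status?: number) {
        super(message);
        this.name = "ApiError";
        this.status = status;
        Object.setPrototypeOf(this, ApiError.prototype);
    }
}

// Assumed definition: exactly one of data or error is non-null.
type GalaxyApiResult<T> = { data: T; error: null } | { data: null; error: ApiError };

// Trimmed, hypothetical version of the collection model.
interface HDCADetailed {
    id: string;
    element_count: number;
}

// Hypothetical stand-in for fetchCollectionDetails: no HTTP, no async.
function fetchCollectionDetails(id: string): GalaxyApiResult<HDCADetailed> {
    if (id === "missing") {
        return { data: null, error: new ApiError("Collection not found", 404) };
    }
    return { data: { id, element_count: 2 }, error: null };
}

// The compiler forces callers to check error before touching data.
function summarize(result: GalaxyApiResult<HDCADetailed>): string {
    if (result.error !== null) {
        return `failed with status ${result.error.status}`;
    }
    return `${result.data.id} has ${result.data.element_count} elements`;
}
```

Because the failure is part of the return type, a caller cannot forget to handle it the way it can forget a `try`/`catch`.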

When a useKeyedCache fetch handler fails, getItemById previously saw
undefined in storedItems and re-triggered the fetch in an infinite loop.
Now the cache checks loadingErrors before initiating a new fetch:
non-retryable errors (plain Error, or a 4xx ApiError other than 429)
permanently block further attempts, while retryable errors (429, 5xx)
allow up to 3 retries before giving up.
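The gate described above could be sketched roughly like this. `createCache`, `canFetch`, and the `LoadFailure` bookkeeping are hypothetical names, the contents of `RETRYABLE_STATUSES` are an assumed subset of "429, 5xx", and the fetch is synchronous only to keep the sketch short; the real keyedCache.ts is built on Vue reactivity and async handlers.

```typescript
// Assumed values: the source only states "429, 5xx" and "up to 3 retries".
const RETRYABLE_STATUSES = new Set<number>([429, 500, 502, 503, 504]);
const MAX_RETRIES = 3;

class ApiError extends Error {
    readonly status?: number;

    constructor(message: string, status?: number) {
        super(message);
        this.name = "ApiError";
        this.status = status;
        Object.setPrototypeOf(this, ApiError.prototype);
    }
}

function isRetryableApiError(error: Error): boolean {
    return error instanceof ApiError && error.status !== undefined && RETRYABLE_STATUSES.has(error.status);
}

// Hypothetical bookkeeping: the last error plus how many retries were spent.
interface LoadFailure {
    error: Error;
    retries: number;
}

function createCache<T>(fetchItem: (id: string) => T) {
    const storedItems = new Map<string, T>();
    const loadingErrors = new Map<string, LoadFailure>();

    // A fetch is allowed if nothing failed yet, or if the failure is
    // transient and the retry budget is not exhausted.
    function canFetch(id: string): boolean {
        const failure = loadingErrors.get(id);
        if (!failure) {
            return true;
        }
        return isRetryableApiError(failure.error) && failure.retries < MAX_RETRIES;
    }

    function getItemById(id: string): T | undefined {
        if (!storedItems.has(id) && canFetch(id)) {
            try {
                storedItems.set(id, fetchItem(id));
                loadingErrors.delete(id);
            } catch (e) {
                const previous = loadingErrors.get(id);
                loadingErrors.set(id, {
                    error: e instanceof Error ? e : new Error(String(e)),
                    retries: previous ? previous.retries + 1 : 0,
                });
            }
        }
        return storedItems.get(id);
    }

    return { getItemById, loadingErrors };
}
```

In this sketch a handler that always fails with 503 is called four times in total (the initial attempt plus three retries), while one that fails with 404 or a plain Error is called once, no matter how often the getter is re-evaluated.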

Adds ApiError class and rethrowSimpleWithStatus helper to simple-error.ts
so fetch handlers can preserve HTTP status codes from API responses.

Switches the 12 cache-backed fetch handlers across 6 store/api files
from rethrowSimple to rethrowSimpleWithStatus so that the keyedCache
retry logic can distinguish retryable server errors from permanent
client errors. The cancelWorkflowScheduling handler (a DELETE, not a
cache handler) keeps using rethrowSimple.

When a dataset fetch fails permanently (e.g. 404 or 403), the view now
displays an error alert instead of spinning forever waiting for data that
will never arrive.
@dannon dannon force-pushed the fix/keyed-cache-retry-handling branch from 3c4ce5e to 557666e on February 25, 2026 00:29
…rror

Move RETRYABLE_STATUSES, MAX_RETRIES, and isRetryableApiError from
keyedCache.ts into simple-error.ts so they can be reused by other stores
that need the same retry logic. Also add GalaxyApiResult<T> discriminated
union type which makes API errors visible in function signatures instead
of relying on thrown exceptions.

getHistoryById triggers loadHistoryById when a history is missing, but if
the fetch fails the error was thrown via rethrowSimple without being
tracked. The computed would re-trigger the fetch on every render, causing
an infinite request loop. Now getHistoryByIdFromServer returns a
GalaxyApiResult instead of throwing, loadHistoryById tracks errors in
historyLoadErrors, and getHistoryById gates on existing errors with retry
logic for transient HTTP status codes (429, 5xx).

fetchCollectionDetails now returns GalaxyApiResult instead of throwing,
which lets collectionElementsStore track errors as ApiError with HTTP
status codes. getCollectionById and getDetailedCollectionById now gate on
existing errors with retry logic for transient statuses (429, 5xx),
matching the pattern from keyedCache and historyStore. Updated all
callers of fetchCollectionDetails and loadHistoryById to handle the new
return types.
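
For the stores that roll their own caching, the result-based variant of the gate might look roughly like the sketch below, using the history store as the example. `createHistoryStore`, `HistorySummary`, the `isRetryable` helper, and the shape of the `historyLoadErrors` entries are hypothetical, and the server call is synchronous and injected only so the example runs on its own; the real store is async and reactive.

```typescript
class ApiError extends Error {
    readonly status?: number;

    constructor(message: string, status?: number) {
        super(message);
        this.name = "ApiError";
        this.status = status;
        Object.setPrototypeOf(this, ApiError.prototype);
    }
}

// Assumed definition of the shared result type.
type GalaxyApiResult<T> = { data: T; error: null } | { data: null; error: ApiError };

const MAX_RETRIES = 3;

// Assumed interpretation of "transient": 429 or any 5xx status.
function isRetryable(error: ApiError): boolean {
    const status = error.status;
    return status !== undefined && (status === 429 || (status >= 500 && status <= 599));
}

interface HistorySummary {
    id: string;
    name: string;
}

function createHistoryStore(getHistoryByIdFromServer: (id: string) => GalaxyApiResult<HistorySummary>) {
    const storedHistories = new Map<string, HistorySummary>();
    const historyLoadErrors = new Map<string, { error: ApiError; retries: number }>();

    // No try/catch needed: the failure arrives as a value and is recorded.
    function loadHistoryById(id: string): void {
        const result = getHistoryByIdFromServer(id);
        if (result.error !== null) {
            const previous = historyLoadErrors.get(id);
            historyLoadErrors.set(id, {
                error: result.error,
                retries: previous ? previous.retries + 1 : 0,
            });
            return;
        }
        storedHistories.set(id, result.data);
        historyLoadErrors.delete(id);
    }

    // Fetch on miss, but only if no blocking error has been recorded.
    function getHistoryById(id: string): HistorySummary | null {
        const known = storedHistories.get(id);
        if (known) {
            return known;
        }
        const failure = historyLoadErrors.get(id);
        if (!failure || (isRetryable(failure.error) && failure.retries < MAX_RETRIES)) {
            loadHistoryById(id);
        }
        return storedHistories.get(id) ?? null;
    }

    return { getHistoryById, historyLoadErrors };
}
```

A history that fails transiently and then loads clears its entry in `historyLoadErrors`, so a later miss starts with a fresh retry budget.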
@dannon dannon changed the title from "Consistent retry handling for useKeyedCache stores" to "Fix infinite request loops in cached stores with retry-aware error handling" Feb 25, 2026
getHistoryByIdFromServer now returns GalaxyApiResult so the mock needs
to wrap the history in { data, error } shape.
Contributor

@davelopez davelopez left a comment

Very nice! Thank you!

@davelopez davelopez merged commit 84b7011 into galaxyproject:dev Feb 25, 2026
28 checks passed
@github-project-automation github-project-automation Bot moved this from Needs Review to Done in Galaxy Dev - weeklies Feb 25, 2026
@github-actions

This PR was merged without a "kind/" label, please correct.

@itisAliRH itisAliRH deleted the fix/keyed-cache-retry-handling branch March 9, 2026 10:48
Development

Successfully merging this pull request may close these issues.

Stores using useKeyedCache should use consistent retry handling

3 participants