Skip to content

Conversation

@SkArchon
Copy link
Contributor

@SkArchon SkArchon commented Dec 17, 2025

This PR does the following listed

  • Makes error accessible to custom modules, while Error() will return nil since they run too early in our hooks before SetError() is called in our router, users can now wrap a custom response body which will be invoked much later after SetError() is called in the router

  • We remove the defer propagateSubgraphErrors, which means that the new behaviour is that in case of an early return due to an error, the final error set to to errors is not guaranteed to be the ones returned from the subgraph (previously this was guaranteed).

Summary by CodeRabbit

  • New Features

    • Adds automatic X-Has-Error response header and exposes an error accessor on the request context.
  • Bug Fixes

    • Ensures upstream/subgraph errors are tracked and surfaced in request handling so they influence response headers and final reporting.
  • Tests

    • Adds end-to-end tests covering auth failures, upstream errors, successful paths, and header behavior.

✏️ Tip: You can customize this high-level summary in your review settings.

Checklist

@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

Walkthrough

Adds RequestContext.Error(), records subgraph errors into request context during GraphQL handling (via explicit tracking), adds a ContextErrorModule that wraps ResponseWriters to set an X-Has-Error header when context errors exist, and adds tests covering auth, subgraph failure, baseline, and success scenarios.

Changes

Cohort / File(s) Summary
Context Error API
router/core/context.go
Add Error() error to the RequestContext interface and implement it on requestContext to expose request-scoped errors.
GraphQL error tracking
router/core/graphql_handler.go
Remove deferred propagateSubgraphErrors(...); when SubgraphErrors exist, explicitly call trackFinalResponseError(resolveCtx.Context(), errs) (in addition to existing header/cache handling) to record errors on the request context.
ContextErrorModule implementation
router-tests/modules/context-error/module.go
New ContextErrorModule that wraps http.ResponseWriter with headerCapturingWriter to detect RequestContext.Error(), set an X-Has-Error header, and implement RouterOnRequest, Module(), and an interface guard.
Tests for ContextErrorModule
router-tests/modules/context_error_field_test.go
New test TestContextErrorModule with subtests for auth failure, subgraph failure, baseline, and success; asserts response status, body contents, and X-Has-Error header behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review response-writer wrapper: correctness of WriteHeader/Write/Flush ordering, streaming safety, and header injection timing.
  • Verify RequestContext.Error() usage and that error tracking (trackFinalResponseError) mirrors intended semantics and concurrency expectations.
  • Validate tests accurately simulate subgraph/auth failures and assert header presence/absence at the proper lifecycle point.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: making router errors accessible to custom modules through a new Error() method on RequestContext.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Dec 17, 2025

Router-nonroot image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-2c1759652a93627878b08a28cf415ad7b2caa1a1-nonroot

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
router/core/context.go (1)

478-481: Consider documenting the Error() method semantics.

The implementation is straightforward, but adding a comment would clarify:

  • When this returns nil (e.g., no error has occurred yet, or before SetError is called)
  • Thread-safety guarantees (appears safe for single-threaded request handling)

Example documentation:

+// Error returns the error associated with this request context, or nil if no error has been set.
+// This method is safe to call from the request handler goroutine.
 func (c *requestContext) Error() error {
 	return c.error
 }
router/core/graphql_handler.go (1)

193-196: Core fix looks good; consider clarifying the comment.

The explicit SetError(errs) call before writing the response is the key change that enables custom modules to access errors. This correctly addresses the timing issue described in the PR.

The comment could be slightly clearer about why we're calling SetError here in addition to the defer:

-// This is recorded in the defer above, but this is used to make sure the error is accessible to
-// any response writer wrappers set in custom modules that need it when WriteTo is called
+// Propagate error to request context before writing response so custom modules
+// can access it via response writer wrappers. The defer above handles other error propagation.
 reqCtx.SetError(errs)
router-tests/modules/context-error/module.go (1)

19-26: Consider implementing additional optional http interfaces.

The headerCapturingWriter only implements http.ResponseWriter and http.Flusher. If the underlying writer supports optional interfaces like http.Hijacker (for WebSocket upgrades), http.Pusher (for HTTP/2 Server Push), or http.CloseNotifier, those capabilities will be lost when wrapping.

For a test module, this may be acceptable, but consider using a helper that preserves these interfaces if this pattern is used in production modules.

Example pattern for preserving interfaces:

// Check if underlying writer supports additional interfaces
type hijacker interface {
    http.ResponseWriter
    http.Hijacker
}

// Then conditionally implement based on what the underlying writer supports

Alternatively, consider using a library like httpsnoop that handles this automatically.

Also applies to: 63-76

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f70cae2 and 9ab859a.

📒 Files selected for processing (4)
  • router-tests/modules/context-error/module.go (1 hunks)
  • router-tests/modules/context_error_field_test.go (1 hunks)
  • router/core/context.go (2 hunks)
  • router/core/graphql_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/core/graphql_handler.go
  • router-tests/modules/context-error/module.go
  • router-tests/modules/context_error_field_test.go
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/core/graphql_handler.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/graphql_handler.go
📚 Learning: 2025-08-26T17:19:28.178Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2167
File: router/internal/expr/retry_context.go:3-10
Timestamp: 2025-08-26T17:19:28.178Z
Learning: In the Cosmo router codebase, prefer using netErr.Timeout() checks over explicit context.DeadlineExceeded checks for timeout detection in retry logic, as Go's networking stack typically wraps context deadline exceeded errors in proper net.Error implementations.

Applied to files:

  • router-tests/modules/context-error/module.go
🧬 Code graph analysis (2)
router-tests/modules/context-error/module.go (4)
router/core/modules.go (3)
  • RegisterModule (56-73)
  • Module (52-54)
  • ModuleInfo (44-50)
router/core/context.go (1)
  • RequestContext (61-144)
router/pkg/pubsub/datasource/error.go (1)
  • Error (3-6)
router/internal/expr/expr.go (1)
  • Request (66-75)
router-tests/modules/context_error_field_test.go (3)
router-tests/utils.go (1)
  • ConfigureAuth (45-57)
router-tests/modules/context-error/module.go (1)
  • ContextErrorModule (15-17)
router/core/router.go (2)
  • WithModulesConfig (1708-1712)
  • WithCustomModules (1815-1819)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: build-router
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./events)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: build_test
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
router/core/context.go (1)

142-144: LGTM! Clean API addition for error propagation.

The new Error() error method on RequestContext provides a clean way for custom modules to access request errors. This aligns well with the PR's goal of making errors accessible to response-writer wrappers.

router-tests/modules/context_error_field_test.go (3)

62-98: LGTM! Subgraph error test validates the core fix.

This test correctly validates that subgraph errors are captured in the request context and exposed via the X-Has-Error header. The middleware that returns 500 with GraphQL errors simulates a real subgraph failure scenario.


100-124: LGTM! Good negative test case.

This test ensures that the X-Has-Error header is not present when requests succeed, validating that error detection only activates when actual errors occur.


20-60: SetError is correctly invoked for authentication failures.

The authentication error flow calls SetError in handleAuthenticationFailure() (graphql_prehandler.go:1142) during the prehandler phase. The test's custom module (context-error) wraps the response writer and checks ctx.Error() to detect when an error has been set, then sets the X-Has-Error header accordingly. This flow works as expected for authentication failures.

router-tests/modules/context-error/module.go (3)

28-54: LGTM! Clean error detection logic.

The checkAndSetError method efficiently checks for errors only once (using the hasError flag), and both WriteHeader and Write correctly invoke it before writing. The headerWritten flag properly prevents duplicate header writes.


56-61: Good streaming support.

Implementing http.Flusher is important for streaming responses and subscriptions. The type assertion and conditional call handle cases where the underlying writer doesn't support flushing.


78-88: Module registration looks correct.

The module is properly registered with a unique ID, priority 1 (runs early), and a constructor function. The interface guard at lines 91-93 ensures compile-time verification of the RouterOnRequestHandler implementation.

@codecov
Copy link

codecov bot commented Dec 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.75%. Comparing base (caaf2bd) to head (45ad2c6).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2420       +/-   ##
===========================================
- Coverage   90.55%   60.75%   -29.81%     
===========================================
  Files          16      229      +213     
  Lines        3271    23818    +20547     
  Branches      871        0      -871     
===========================================
+ Hits         2962    14470    +11508     
- Misses        309     8101     +7792     
- Partials        0     1247     +1247     
Files with missing lines Coverage Δ
router/core/context.go 69.61% <100.00%> (ø)
router/core/graphql_handler.go 69.14% <100.00%> (ø)

... and 243 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@Noroth Noroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but please add docstrings to exported functions

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9ab859a and fc789a0.

📒 Files selected for processing (2)
  • router-tests/modules/context_error_field_test.go (1 hunks)
  • router/core/context.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • router/core/context.go
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router-tests/modules/context_error_field_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: integration_test (./events)
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
router-tests/modules/context_error_field_test.go (3)

20-60: LGTM!

This test correctly validates that authentication failures are captured in the context and surfaced via the X-Has-Error header. The setup with access controller and invalid token is well-structured.


62-98: LGTM!

Good coverage for the subgraph failure scenario. The middleware correctly simulates an upstream error, and the assertions verify both the error propagation in the response body and the X-Has-Error header.


116-140: LGTM!

Good coverage for the success path, verifying that the X-Has-Error header is not set when no errors occur.

@SkArchon SkArchon force-pushed the milinda/make-error-accessible branch from fc789a0 to 9e46fa0 Compare December 17, 2025 11:39
@SkArchon SkArchon force-pushed the milinda/make-error-accessible branch from 9e46fa0 to 0b692a1 Compare December 17, 2025 12:55
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router/core/graphql_handler.go (1)

211-211: Critical: Deferred error propagation still present, contradicting PR objectives.

The defer propagateSubgraphErrors(resolveCtx) call remains in the subscription case, but the PR description states "The change removes the previous deferred call to propagateSubgraphErrors." This is inconsistent with:

  1. The PR's core objective of making errors accessible to custom modules before they execute
  2. The fix applied at line 189 for the synchronous case
  3. StarpTech's previous feedback requesting removal of the defer

The deferred call means custom modules wrapping the response writer for subscriptions may still observe nil errors if they access Error() before the defer executes.

Based on learnings, as per past review feedback from StarpTech.

Apply this approach:

-		defer propagateSubgraphErrors(resolveCtx)
 		resolveCtx, writer, ok = GetSubscriptionResponseWriter(resolveCtx, r, w, h.apolloSubscriptionMultipartPrintBoundary)

Then add explicit trackFinalResponseError calls for subgraph errors at the appropriate points in the subscription flow, similar to line 189 in the sync case. This ensures errors are immediately available to custom modules.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b692a1 and 45ad2c6.

📒 Files selected for processing (1)
  • router/core/graphql_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/core/graphql_handler.go
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/core/graphql_handler.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/graphql_handler.go
🧬 Code graph analysis (1)
router/core/graphql_handler.go (1)
router/internal/expr/expr.go (1)
  • Context (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./events)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
router/core/graphql_handler.go (1)

188-191: LGTM: Explicit subgraph error tracking improves error accessibility.

The explicit call to trackFinalResponseError ensures that subgraph errors are immediately recorded in the request context, making them accessible to custom modules via RequestContext.Error(). This addresses the core issue where deferred error propagation caused custom modules to observe nil errors.

Copy link
Contributor

@StarpTech StarpTech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SkArchon SkArchon merged commit 8980fa8 into main Dec 18, 2025
32 checks passed
@SkArchon SkArchon deleted the milinda/make-error-accessible branch December 18, 2025 09:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants