DurableAgent known limitations: sandbox CPU billing, abort signal, double billing, verify loop context #1800

shtefcs · 2026-04-16T02:03:47Z

Context

We are building a production multi-agent AI orchestration platform using DurableAgent from @workflow/ai. The migration from a monolithic "use step" approach to DurableAgent (per-LLM-call + per-tool step isolation) is working but has several architectural limitations that affect correctness and billing accuracy.

Related: #1737 (development mode performance), #1315 (linear step overhead growth), #1160 (queue delay)

Environment

workflow: 4.2.1
@workflow/ai: 4.1.1
@workflow/core: 4.2.1
Next.js: 16.1.6
AI SDK: 6.x

Limitation 1: No AbortSignal in V8 Sandbox (Credit Circuit Breaker)

The "use workflow" function runs in a V8 sandbox where AbortController does not exist. When a user's credit balance crosses the overdraft limit during onStepFinish, we cannot abort the DurableAgent mid-execution. We use a flag checked in prepareStep instead, which means the agent completes its current LLM call before stopping, potentially overspending by one response.

The original non-durable orchestrator uses AbortController.abort() to kill streamText immediately. DurableAgent accepts abortSignal in its stream() options, but creating an AbortController at workflow level throws ReferenceError.

Question: Is there a way to trigger DurableAgent's abort from within onStepFinish or prepareStep? Or could the V8 sandbox expose AbortController?
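
For readers following along, the flag-based workaround described above can be modeled roughly as follows. This is a minimal, self-contained sketch and not the DurableAgent API: `CreditGuard`, `StepUsage`, `PrepareStepResult`, and `simulateRun` are hypothetical names, and only the `toolChoice: "none"` idea comes from the summary table further down.

```typescript
// Hypothetical model of a credit circuit breaker that cannot abort mid-call.
// None of these types come from @workflow/ai; they only illustrate the flag pattern.

interface StepUsage {
  creditsUsed: number;
}

interface PrepareStepResult {
  // Once the breaker trips, ask the model to stop calling tools.
  toolChoice?: "none";
}

class CreditGuard {
  private spent = 0;
  private tripped = false;
  private readonly overdraftLimit: number;

  constructor(overdraftLimit: number) {
    this.overdraftLimit = overdraftLimit;
  }

  // Call from an onStepFinish-style callback: record usage and trip the flag.
  onStepFinish(usage: StepUsage): void {
    this.spent += usage.creditsUsed;
    if (this.spent > this.overdraftLimit) {
      this.tripped = true;
    }
  }

  // Call from a prepareStep-style callback: the flag only takes effect here,
  // which is why one more LLM call still happens after the limit is crossed.
  prepareStep(): PrepareStepResult {
    return this.tripped ? { toolChoice: "none" } : {};
  }

  isTripped(): boolean {
    return this.tripped;
  }

  getTotalSpent(): number {
    return this.spent;
  }
}

// Simulated agent loop: each entry in stepCosts is the cost of one LLM call.
// The loop ends after the first call that ran with toolChoice "none".
function simulateRun(guard: CreditGuard, stepCosts: number[]): number {
  let stepsRun = 0;
  for (const cost of stepCosts) {
    const prep = guard.prepareStep();
    stepsRun++;
    guard.onStepFinish({ creditsUsed: cost });
    if (prep.toolChoice === "none") break; // final, tool-free response
  }
  return stepsRun;
}
```

With a limit of 10 and calls costing 6 credits each, `simulateRun` stops after three calls with 18 credits spent: the second call crosses the limit, and the third is the extra response that an immediate abort would have avoided.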

Limitation 2: Double Billing Risk on Lambda Crash + Retry

onStepFinish is a side-effect callback, not a workflow step. If a Lambda crashes after onStepFinish fires (credits deducted) but before the step result is persisted, the SDK retries the step. The retried step's onStepFinish fires again, deducting credits a second time.

The creditsAlreadyDeducted counter resets per-agent (necessary for chained agents), so it cannot detect the duplicate.

Question: Does DurableAgent suppress onStepFinish for cached/replayed steps, or does it fire every time, even on replay? If the SDK could pass an "isReplay" flag to onStepFinish, we could skip billing on replays.
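
Independent of what the SDK does on replay, one common way to make a deduction safe under retries is to key it on a stable identifier. The sketch below assumes such an identifier is available per step, which this post does not confirm; `CreditLedger` and `stepKey` are hypothetical names, and the in-memory `Set` stands in for durable storage (for example a unique constraint in a database), since an in-memory set would not survive a Lambda crash.

```typescript
// Hypothetical idempotent ledger: a deduction is applied at most once per key.
// The Set stands in for durable storage; it is NOT crash-safe as written.

class CreditLedger {
  private readonly applied = new Set<string>();
  private balance: number;

  constructor(initialBalance: number) {
    this.balance = initialBalance;
  }

  // Returns true if the deduction was applied, false if this key was seen before.
  deductOnce(stepKey: string, amount: number): boolean {
    if (this.applied.has(stepKey)) {
      return false; // retry of an already-billed step: skip
    }
    this.applied.add(stepKey);
    this.balance -= amount;
    return true;
  }

  getBalance(): number {
    return this.balance;
  }
}
```

Because the key rather than a per-agent counter decides whether to bill, resetting `creditsAlreadyDeducted` between chained agents would no longer matter for duplicate detection.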

Limitation 3: Verify Loop Context (Coding Agent Fix Cycles)

After the coding agent completes, a verify loop checks the sandbox for errors and runs fix cycles using streamText. The fix cycle needs tool access (sandboxBash, sandboxWriteFile, etc.) to fix errors.

The problem: runVerifyLoopStep is a "use step" function that calls streamText with tools from buildDurableTools(config.toolSchemas). Those tools are themselves "use step" functions. This creates nested steps (a step calling sub-steps), which is not officially supported.

Additionally, the fix cycle uses config.messages (pre-agent messages) instead of result.messages (post-agent with all tool call results), so the fix agent cannot see what was already built.

Question: Is there a supported pattern for running a secondary agent loop after the primary DurableAgent completes? Could the verify loop be restructured as a separate workflow-level agent call instead of a nested step?
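
To make the intended shape concrete, here is a rough sketch of a verify loop that runs at the top level and feeds the post-agent messages into each fix cycle. Everything here is a hypothetical stand-in: `Message`, `RunAgent`, `CheckSandbox`, and `runWithVerifyLoop` are not AI SDK or workflow types, and the functions are synchronous only to keep the example self-contained (real agent calls would be async).

```typescript
// Hypothetical shapes; not the AI SDK's message or agent types.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

// One agent run: takes the conversation so far, returns the full conversation afterwards.
type RunAgent = (messages: Message[]) => Message[];

// Inspects the sandbox and returns a list of error descriptions (empty if clean).
type CheckSandbox = () => string[];

interface VerifyResult {
  messages: Message[];
  cycles: number;
  remainingErrors: string[];
}

function runWithVerifyLoop(
  initial: Message[],
  primary: RunAgent,
  fixer: RunAgent,
  check: CheckSandbox,
  maxCycles: number
): VerifyResult {
  // Keep the primary agent's output so the fixer can see what was built.
  let messages = primary(initial);
  let cycles = 0;
  let errors = check();

  while (errors.length > 0 && cycles < maxCycles) {
    const fixPrompt: Message = {
      role: "user",
      content: "Fix these errors:\n" + errors.join("\n"),
    };
    // The fixer receives the accumulated history, not the pre-agent messages.
    messages = fixer([...messages, fixPrompt]);
    cycles++;
    errors = check();
  }

  return { messages, cycles, remainingErrors: errors };
}
```

The point of the sketch is the data flow: the fix cycle starts from the equivalent of result.messages rather than config.messages, and the loop itself sits outside any single step.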

Limitation 4: Sandbox CPU Billing (Upstash Box)

Each sandbox tool wrapper reconnects to the Upstash Box via reconnectUpstashBox(boxId). The reconnected adapter creates a fresh activeCpuMs counter starting at 0. The original orchestrator maintains a single adapter across the entire session, accumulating CPU time accurately.

The Upstash Box client SDK does not expose server-side CPU metrics. The only way to track CPU is the adapter's local counter, which resets on each reconnection.

This means sandbox compute goes unbilled in durable mode. In production, this is a revenue leak.

Question: This is primarily an Upstash Box SDK limitation, but is there a way to maintain persistent state (like a CPU counter) across workflow steps without serializing the adapter itself?

Summary

| Limitation | Impact | Workaround Available? |
| --- | --- | --- |
| No AbortSignal in V8 sandbox | One extra LLM call before credit circuit breaker stops agent | Flag + prepareStep toolChoice:"none" |
| Double billing on retry | Potential 2x charge for crashed steps | None currently |
| Nested steps in verify loop | Not officially supported, fix agent lacks context | Works in practice but fragile |
| Sandbox CPU billing | Compute goes unbilled | None without server-side metrics |

We are happy to contribute fixes or test proposed solutions. These are the last blockers before production deployment.

pranaygp (Collaborator) · 2026-04-17T18:28:38Z

Thanks for opening this. Just a note that we've been working with the AI SDK team on migrating DurableAgent to WorkflowAgent in v7 (https://ai-sdk.dev/v7/docs/agents/workflow-agent). It's the same API, but with even better coverage and deeper integration into the AI SDK.

That said, on each of the four limitations:

1. AbortSignal is in the works and will ship in the workflow v5 beta branch soon: feat: serializable AbortController/AbortSignal #1301
2. Does using the step ID for idempotency prevent this for you?
3. Yes. That's just a pattern in your code, I believe. You can trigger another durable agent back in the workflow when you're done; it doesn't have to be inside a step. Tools themselves also don't have to be steps, and tools can even start other nested/child workflows (https://workflow-sdk.dev/docs/foundations/common-patterns#workflow-composition)
4. Yeah. Keep the counter in the workflow (let totalValue = 0;) and don't make the tool calls steps. Instead, they can call reconnectUpstashBox, which becomes a step. Return the billing time from each function, and have the tool call add it to the totalValue variable in the workflow, so the workflow accumulates the total as the tools run.
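
A rough sketch of the accumulation pattern from the last point, under stated assumptions: `SandboxCallResult`, `SandboxStep`, `createSandboxTool`, and `exampleWorkflow` are hypothetical names, the box ID is a placeholder, and the functions are synchronous only so the example stays self-contained. In a real workflow the sandbox call would be an async step that reconnects via reconnectUpstashBox and reports the CPU time for that call alone.

```typescript
// Hypothetical result of one sandbox call: output plus CPU time for that call only.
interface SandboxCallResult {
  output: string;
  activeCpuMs: number;
}

// Stand-in for a step that reconnects to the box, runs a command,
// and returns the freshly created adapter's counter before it is discarded.
type SandboxStep = (boxId: string, command: string) => SandboxCallResult;

// The tool itself is a plain function (not a step): it calls the step
// and reports the per-call CPU time to whoever owns the running total.
function createSandboxTool(
  boxId: string,
  runInSandbox: SandboxStep,
  onCpu: (ms: number) => void
): (command: string) => string {
  return (command: string): string => {
    const result = runInSandbox(boxId, command);
    onCpu(result.activeCpuMs);
    return result.output;
  };
}

// Workflow-level function: the counter lives here, outside any step.
function exampleWorkflow(runInSandbox: SandboxStep, commands: string[]): number {
  let totalCpuMs = 0;
  const bash = createSandboxTool("box-placeholder", runInSandbox, (ms) => {
    totalCpuMs += ms;
  });
  for (const command of commands) {
    bash(command);
  }
  return totalCpuMs;
}
```

The per-reconnect counter still starts at 0 each time, but since every call returns its own reading and the workflow sums them, nothing is lost when the adapter is discarded.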
