
fix: remove EdgeConnect finalizer only after tenant cleanup succeeds #6534

Open
knQzx wants to merge 1 commit into Dynatrace:main from knQzx:fix/edgeconnect-finalizer-order



@knQzx knQzx commented May 1, 2026

description

reconcileEdgeConnectDeletion clears the finalizer and persists that change BEFORE running tenant cleanup (buildEdgeConnectClient, getEdgeConnectByName, deleteConnectionSetting, DeleteEdgeConnect). if any of those steps fails, the k8s object has already lost its finalizer and gets garbage collected, and the tenant resource is orphaned forever

root cause

the ordering has been there since 8fa3f094d ("Add edgeconnect provisioner mode"). no test for the failure path, no comment, no rationale

fix

  • run tenant cleanup first, only controllerutil.RemoveFinalizer after success
  • pull tenant cleanup into a cleanupTenantEdgeConnect helper
  • switch from Finalizers = nil to controllerutil.RemoveFinalizer so third-party finalizers don't get wiped
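
a minimal sketch of the ordering described in the bullets above, using only the standard library. it is a simplified model, not the operator's actual code: `object`, `removeFinalizer`, `reconcileDeletion`, the `persist` callback and the `finalizerName` value are hypothetical stand-ins (the real code works on the EdgeConnect CR and calls controllerutil.RemoveFinalizer), and `cleanup` plays the role of cleanupTenantEdgeConnect

```go
package main

import "errors"

// finalizerName is a placeholder; the operator's real finalizer string is not shown in this PR.
const finalizerName = "example.com/edgeconnect-cleanup"

// object stands in for the EdgeConnect CR; only the finalizer list matters here.
type object struct {
	Finalizers []string
}

// errTenantUnreachable is a sample transient failure for callers of this sketch.
var errTenantUnreachable = errors.New("tenant unreachable")

// removeFinalizer drops one named entry and reports whether the list changed.
// Other finalizers are left alone, unlike assigning Finalizers = nil.
func removeFinalizer(o *object, name string) bool {
	kept := o.Finalizers[:0]
	for _, f := range o.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	changed := len(kept) != len(o.Finalizers)
	o.Finalizers = kept
	return changed
}

// reconcileDeletion runs tenant cleanup first and touches the finalizer only on success.
// cleanup stands in for cleanupTenantEdgeConnect, persist for the API update call.
func reconcileDeletion(o *object, cleanup func() error, persist func(*object) error) error {
	if err := cleanup(); err != nil {
		return err // finalizer stays in place, so the next reconcile retries
	}
	if !removeFinalizer(o, finalizerName) {
		return nil // nothing to persist
	}
	return persist(o)
}
```

with a failing `cleanup` the object keeps both its own and any third-party finalizer. with a succeeding one, only `finalizerName` is removed before `persist` runs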

how can this be tested?

existing edgeconnect tests pass

two new regression tests in controller_test.go:

  • finalizer is kept when DeleteEdgeConnect fails
  • finalizer is kept when buildEdgeConnectClient fails

both fail on main (object gets GC'd, finalizer gone) and pass on this branch
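
the shape of those two regression checks can be sketched with a fake tenant that fails at a chosen step. this is a stdlib-only model, not the code in controller_test.go: `fakeTenant`, `deleteWithCleanupFirst` and `deleteWithFinalizerFirst` are hypothetical names, and the step order is assumed from the list in the description

```go
package main

import "fmt"

// fakeTenant fails at the step named in failAt; an empty string means every step succeeds.
type fakeTenant struct {
	failAt string
	calls  []string // steps attempted, in order
}

// cleanup walks the tenant-side steps and stops at the first error.
// The step names and their order are taken from the PR description (assumed order).
func (f *fakeTenant) cleanup() error {
	steps := []string{
		"buildEdgeConnectClient",
		"getEdgeConnectByName",
		"deleteConnectionSetting",
		"DeleteEdgeConnect",
	}
	for _, name := range steps {
		f.calls = append(f.calls, name)
		if f.failAt == name {
			return fmt.Errorf("%s failed", name)
		}
	}
	return nil
}

// deleteWithCleanupFirst models the ordering on this branch.
// The first result reports whether the finalizer is still present afterwards.
func deleteWithCleanupFirst(t *fakeTenant) (bool, error) {
	if err := t.cleanup(); err != nil {
		return true, err // cleanup failed: finalizer kept, deletion retried
	}
	return false, nil // cleanup succeeded: finalizer removed
}

// deleteWithFinalizerFirst models the previous ordering:
// the finalizer is already gone by the time cleanup runs.
func deleteWithFinalizerFirst(t *fakeTenant) (bool, error) {
	return false, t.cleanup()
}
```

injecting a failure at buildEdgeConnectClient or DeleteEdgeConnect leaves the finalizer in place with the first function and loses it with the second, which is the difference the two new tests pin down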

note for operators

if the OAuth secret or caCertsRef configmap is deleted before the EdgeConnect CR, buildEdgeConnectClient fails and deletion is retried indefinitely. recovery: recreate the secret/configmap, or remove the finalizer manually with kubectl patch
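
for the manual recovery path, one possible patch body is a JSON merge patch that nulls out metadata.finalizers. the sketch below only builds that string. `finalizerWipePatch` is a hypothetical helper, and because the operator's finalizer name is not shown in this PR, it clears the whole list (third-party finalizers included) instead of removing a single entry

```go
package main

import "encoding/json"

// finalizerWipePatch builds a JSON merge patch that sets metadata.finalizers to null.
// Applying it removes every finalizer on the object, so the tenant-side
// EdgeConnect has to be cleaned up by hand afterwards.
func finalizerWipePatch() (string, error) {
	patch := map[string]any{
		"metadata": map[string]any{"finalizers": nil},
	}
	b, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

the resulting string is the kind of payload kubectl patch accepts with `--type=merge -p`. resource kind, name and namespace depend on the cluster and are left out here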

reconcileEdgeConnectDeletion previously cleared the finalizer before
talking to the tenant. if buildEdgeConnectClient, getEdgeConnectByName
or DeleteEdgeConnect failed (oauth secret rotated, tenant unreachable,
etc.) the k8s object was already gc-able while the tenant resource
stayed orphaned forever.

now we run tenant cleanup first and only call RemoveFinalizer after it
succeeds, so a transient failure leaves the finalizer in place and the
controller retries on the next reconcile.
@knQzx knQzx requested a review from a team as a code owner May 1, 2026 11:30