OCPBUGS-60684: feat: improve conflict resolution for HostedCluster and NodePool updates #103

jparrill · 2025-08-20T13:26:21Z

What this PR does / why we need it

- Increase retry attempts from 5 to 10 for better conflict handling
- Reduce initial retry delay from 1s to 500ms for faster resolution
- Add 30 second cap to maximum wait time to prevent excessive delays
- Improve logging messages for better debugging and monitoring
- Add success confirmation logs for completed operations
- Maintain consistent retry logic across both UpdateHostedCluster and UpdateNodepools functions

This addresses the 'object has been modified' conflicts that occur during backup/restore operations when multiple processes modify HyperShift resources concurrently. The enhanced retry mechanism provides more robust handling of race conditions and controller conflicts.

Which issue(s) this PR fixes

Fixes #OCPBUGS-60684

- Increase retry attempts from 5 to 10 for better conflict handling
- Reduce initial retry delay from 1s to 500ms for faster resolution
- Add 30 second cap to maximum wait time to prevent excessive delays
- Improve logging messages for better debugging and monitoring
- Add success confirmation logs for completed operations
- Maintain consistent retry logic across both UpdateHostedCluster and UpdateNodepools functions

This addresses the 'object has been modified' conflicts that occur during backup/restore operations when multiple processes modify HyperShift resources concurrently. The enhanced retry mechanism provides more robust handling of race conditions and controller conflicts.

Signed-off-by: Juan Manuel Parrilla Madrid <[email protected]>

openshift-ci-robot · 2025-08-20T13:26:27Z

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

jparrill · 2025-08-20T13:28:40Z

/jira refresh

openshift-ci-robot · 2025-08-20T13:28:44Z

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

kaovilai · 2025-08-20T13:29:18Z

pkg/common/utils.go


 	for _, hc := range hostedClusters.Items {
-		// Create a retry loop with exponential backoff
+		// Create a retry loop with improved backoff for better conflict resolution


This sounds like an AI updated comment :)

I usually ask for a review before committing the code xD

openshift-ci · 2025-08-20T13:29:53Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jparrill, kaovilai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jparrill,kaovilai]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-08-20T13:44:54Z

@jparrill: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

jparrill · 2025-08-20T13:57:32Z

/jira refresh

openshift-ci-robot · 2025-08-20T13:57:36Z

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

jparrill · 2025-08-20T13:58:12Z

/jira refresh

openshift-ci-robot · 2025-08-20T13:58:20Z

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.20.0) matches configured target version for branch (4.20.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

jparrill · 2025-08-20T13:58:40Z

/jira backport oadp-1.5

openshift-ci-robot · 2025-08-20T13:58:44Z

@jparrill: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick oadp-1.5

openshift-cherrypick-robot · 2025-08-20T13:58:48Z

@openshift-ci-robot: once the present PR merges, I will cherry-pick it on top of oadp-1.5 in a new PR and assign it to you.

openshift-ci-robot · 2025-08-20T14:02:55Z

@jparrill: Jira Issue OCPBUGS-60684: All pull requests linked via external trackers have merged:

openshift/hypershift-oadp-plugin#103

Jira Issue OCPBUGS-60684 has been moved to the MODIFIED state.

openshift-cherrypick-robot · 2025-08-20T14:03:38Z

@openshift-ci-robot: new pull request created: #104

openshift-ci-robot added the jira/valid-reference (indicates that this PR references a valid Jira ticket of any type) and jira/invalid-bug (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) labels on Aug 20, 2025

openshift-ci bot requested review from bryan-cox and celebdor August 20, 2025 13:27

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Aug 20, 2025

kaovilai approved these changes Aug 20, 2025

openshift-ci bot assigned kaovilai Aug 20, 2025

openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) on Aug 20, 2025

openshift-ci-robot added the jira/valid-bug label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) and removed the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Aug 20, 2025

openshift-merge-bot bot merged commit f8f37e2 into openshift:main Aug 20, 2025
7 checks passed

openshift-cherrypick-robot mentioned this pull request Aug 20, 2025

[oadp-1.5] : feat: improve conflict resolution for HostedCluster and NodePool updates #104

Merged
