Conversation

jparrill
Contributor

What this PR does / why we need it

  • Increase retry attempts from 5 to 10 for better conflict handling
  • Reduce initial retry delay from 1s to 500ms for faster resolution
  • Add 30 second cap to maximum wait time to prevent excessive delays
  • Improve logging messages for better debugging and monitoring
  • Add success confirmation logs for completed operations
  • Maintain consistent retry logic across both UpdateHostedCluster and UpdateNodepools functions

This addresses the 'object has been modified' conflicts that occur during backup/restore operations when multiple processes modify HyperShift resources concurrently. The enhanced retry mechanism provides more robust handling of race conditions and controller conflicts.
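
For readers who want to see the shape of the retry logic described above, here is a minimal stdlib-only sketch, not the code from this PR. The values 10 attempts, 500ms initial delay, and 30s cap come from the description. Everything else is an assumption made for illustration: the doubling factor, and the names `errConflict`, `backoffDelay`, `updateWithRetry`, and `noSleep`. Real code would detect conflicts with the Kubernetes API error helpers rather than a sentinel error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict is a hypothetical stand-in for the API server's
// "object has been modified" conflict error.
var errConflict = errors.New("the object has been modified")

const (
	maxRetries   = 10                     // raised from 5
	initialDelay = 500 * time.Millisecond // lowered from 1s
	maxDelay     = 30 * time.Second       // cap on any single wait
)

// backoffDelay returns the wait after failed attempt number `attempt`
// (0-based): initialDelay doubled once per attempt, capped at maxDelay.
// The doubling factor is an assumption, not taken from the PR.
func backoffDelay(attempt int) time.Duration {
	d := initialDelay
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// updateWithRetry calls update until it succeeds, returns a non-conflict
// error, or maxRetries attempts are used up. sleep is injected so the
// sketch can run without real waiting.
func updateWithRetry(update func() error, sleep func(time.Duration)) error {
	var err error
	for attempt := 0; attempt < maxRetries; attempt++ {
		err = update()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err // not a conflict, so retrying will not help
		}
		if attempt < maxRetries-1 {
			sleep(backoffDelay(attempt))
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxRetries, err)
}

// noSleep skips the wait so the example finishes instantly.
func noSleep(time.Duration) {}

func main() {
	calls := 0
	err := updateWithRetry(func() error {
		calls++
		if calls < 3 {
			return errConflict // first two attempts hit a conflict
		}
		return nil
	}, noSleep)
	fmt.Println(calls, err)
}
```

Under the doubling assumption the waits run 500ms, 1s, 2s, 4s, 8s, 16s and then stay at 30s, so ten attempts wait at most 121.5s in total. In a real update function, each attempt should re-read the object before modifying it, otherwise the retry would resend the same stale resource version.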

Which issue(s) this PR fixes

Fixes #OCPBUGS-60684

Signed-off-by: Juan Manuel Parrilla Madrid <[email protected]>
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 20, 2025
@openshift-ci-robot

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from bryan-cox and celebdor August 20, 2025 13:27
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 20, 2025
@jparrill
Contributor Author

/jira refresh

@openshift-ci-robot

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.


  for _, hc := range hostedClusters.Items {
- 	// Create a retry loop with exponential backoff
+ 	// Create a retry loop with improved backoff for better conflict resolution
Member

This sounds like an AI updated comment :)

Contributor Author

I usually ask for a review before committing the code xD

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 20, 2025

openshift-ci bot commented Aug 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jparrill, kaovilai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented Aug 20, 2025

@jparrill: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jparrill
Contributor Author

/jira refresh

@openshift-ci-robot

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@jparrill
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 20, 2025
@openshift-ci-robot

@jparrill: This pull request references Jira Issue OCPBUGS-60684, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

@jparrill
Contributor Author

/jira backport oadp-1.5

@openshift-ci-robot

@jparrill: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick oadp-1.5

@openshift-cherrypick-robot

@openshift-ci-robot: once the present PR merges, I will cherry-pick it on top of oadp-1.5 in a new PR and assign it to you.

@openshift-merge-bot openshift-merge-bot bot merged commit f8f37e2 into openshift:main Aug 20, 2025
7 checks passed
@openshift-ci-robot

@jparrill: Jira Issue OCPBUGS-60684: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-60684 has been moved to the MODIFIED state.

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #104

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

4 participants