Skip to content

Conversation

@Epsilon314
Copy link
Contributor

Pull Request Description

Ensure RoleSet will always be requeued for retry if it is not Ready after sync.

Related Issues

Resolves: #[Insert issue number(s)]

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines (Expand for Details)

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

Signed-off-by: zhuyiqing.wiz <[email protected]>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @Epsilon314, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug in the RoleSet controller where a RoleSet object might not be properly requeued for reconciliation if its status remains unchanged but it is not yet in a ready state. The change ensures that any RoleSet that is not ready after a sync operation will be explicitly requeued for a retry after a short delay, preventing it from getting stuck in an unready state.

Highlights

  • Bug Fix: Implemented a fix to ensure that RoleSet objects that are not in a Ready state after a reconciliation cycle are explicitly requeued for further processing, preventing them from getting stuck.
  • Reconciliation Logic: Modified the Reconcile function in roleset_controller.go to check the Ready status of a RoleSet even when its status has not semantically changed. If the RoleSet is not ready, it triggers a retry after a short delay.
  • New Constant: Introduced a new constant, DefaultRetryDelay, set to 1 second, to define the interval for requeuing non-ready RoleSet objects.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a fix to ensure that a RoleSet that is not in a Ready state is requeued for reconciliation, preventing it from getting stuck. The logic is sound and directly addresses the problem described.

My review includes a high-severity comment about the lack of automated tests for this new, important behavior, and a medium-severity suggestion to improve the clarity of a log message. Adding tests is crucial for maintaining the stability of the controller.

Comment on lines +152 to 156
if !orchestrationctrl.IsRoleSetReady(roleSet) {
klog.Infof("roleset %s/%s not ready, reconcile after %v seconds", roleSet.Namespace, roleSet.Name, DefaultRetryDelay)
return ctrl.Result{RequeueAfter: DefaultRetryDelay}, nil
}
return ctrl.Result{}, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This change correctly addresses the issue of non-ready RoleSets not being retried. However, this new logic is not covered by automated tests. The test suite in roleset_controller_test.go appears to be entirely skipped.

To prevent regressions and ensure the long-term correctness of this fix, please add a unit test that covers this specific scenario. The test should validate that when a RoleSet is not ready and its status is unchanged, the reconciler returns a result that requeues the request after DefaultRetryDelay.

}
if apiequality.Semantic.DeepEqual(&roleSet.Status, status) {
if !orchestrationctrl.IsRoleSetReady(roleSet) {
klog.Infof("roleset %s/%s not ready, reconcile after %v seconds", roleSet.Namespace, roleSet.Name, DefaultRetryDelay)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current log message format will result in a slightly awkward output like ... reconcile after 1s seconds because %v on a time.Duration type includes the unit suffix (e.g., 's').

For better readability, I suggest removing the hardcoded "seconds" from the log string, as the unit is already part of the formatted variable.

Suggested change
klog.Infof("roleset %s/%s not ready, reconcile after %v seconds", roleSet.Namespace, roleSet.Name, DefaultRetryDelay)
klog.Infof("roleset %s/%s not ready, reconcile after %v", roleSet.Namespace, roleSet.Name, DefaultRetryDelay)

@Jeffwan Jeffwan merged commit 7740ee5 into vllm-project:main Aug 5, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants