[Bug fix] Router - handle cooldown_time = 0 for deployments #12108
Pull Request Overview
A bug fix to ensure per-deployment `cooldown_time = 0` bypasses default cooldown behavior and dynamic values are correctly propagated.
- Add `time_to_cooldown` parameter and early exit when it's zero
- Propagate dynamic cooldown times through handlers and cache
- Update tests to cover zero, near-zero, `None`, and positive cooldown logic
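The early exit named in the first bullet can be sketched as below. This is a minimal illustration rather than the code from this PR: `should_run_cooldown` is a hypothetical helper, and the sketch assumes that only an exact zero disables cooldown (how near-zero values are handled is not spelled out on this page).

```python
from typing import Optional


def should_run_cooldown(time_to_cooldown: Optional[float]) -> bool:
    """Hypothetical helper: decide whether cooldown logic should run at all."""
    # None means "not configured": fall through to the router's default behavior.
    if time_to_cooldown is None:
        return True
    # An explicit 0 or 0.0 disables cooldown for this deployment (assumed rule).
    if time_to_cooldown == 0:
        return False
    # Any other value keeps cooldown enabled.
    return True
```

A caller would check this first and return early, before touching the cooldown cache or firing callbacks.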
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
tests/local_testing/test_router_cooldowns.py | Added tests to verify zero, near-zero, None, and positive cooldown behavior |
litellm/router_utils/cooldown_handlers.py | Introduced time_to_cooldown, zero-value early exit, and propagation to cache and callbacks |
litellm/router_utils/cooldown_callbacks.py | Updated callback signature to accept Optional[float] cooldown_time |
litellm/router_utils/cooldown_cache.py | Improved resolution between dynamic cooldown and default cooldown |
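
One way the resolution between a dynamic and a default cooldown can go wrong is a truthiness check, under which `0` silently falls back to the default. Whether that was the exact cause here is an assumption, not something this page confirms. The sketch below uses a hypothetical `resolve_cooldown_time` function to show the explicit `None` check that avoids it.

```python
from typing import Optional


def resolve_cooldown_time(dynamic: Optional[float], default: float) -> float:
    """Hypothetical resolver: prefer the per-deployment value when one is set."""
    # `dynamic or default` would be wrong: 0 is falsy and would become the default.
    return default if dynamic is None else dynamic
```

With this rule, `resolve_cooldown_time(0, 5.0)` keeps the zero, while an unset (`None`) value still picks up the default.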
Comments suppressed due to low confidence (3)
litellm/router_utils/cooldown_handlers.py:140
- This check is redundant because an earlier check already handles the case where deployment is None; consider removing the duplicate.
if deployment is None:
litellm/router_utils/cooldown_handlers.py:101
- [nitpick] Consider renaming the parameter `time_to_cooldown` to `cooldown_time` to align with the terminology used elsewhere in the codebase.
time_to_cooldown: Optional[float] = None,
tests/local_testing/test_router_cooldowns.py:164
- Add a test to verify that when a positive `cooldown_time` is provided in `litellm_params`, the cache is called with that value rather than the default.
# Also verify the deployment is not in cooldown
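
The suggested test could look roughly like the sketch below. Everything in it is a stand-in: `add_to_cooldown`, `DEFAULT_COOLDOWN`, and the mocked `set_cooldown` method are hypothetical names used for illustration and are not LiteLLM's actual handler or cache API.

```python
from typing import Optional
from unittest.mock import MagicMock

DEFAULT_COOLDOWN = 5.0  # assumed default, for illustration only


def add_to_cooldown(cache, deployment_id: str, litellm_params: dict) -> None:
    """Hypothetical stand-in for the handler: pick a cooldown and write it to the cache."""
    cooldown_time: Optional[float] = litellm_params.get("cooldown_time")
    if cooldown_time is None:
        cooldown_time = DEFAULT_COOLDOWN  # nothing configured: use the default
    if cooldown_time == 0:
        return  # cooldown disabled for this deployment
    cache.set_cooldown(deployment_id, cooldown_time=cooldown_time)


def test_positive_cooldown_time_reaches_cache() -> None:
    cache = MagicMock()
    add_to_cooldown(cache, "deployment-1", {"cooldown_time": 30.0})
    # The per-deployment value, not the default, must reach the cache.
    cache.set_cooldown.assert_called_once_with("deployment-1", cooldown_time=30.0)
```

Using a `MagicMock` for the cache keeps the test focused on which value is passed, independent of how the cache stores it.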
Co-authored-by: Copilot <[email protected]>
[Bug fix] Router - handle cooldown_time = 0 for deployments
If a deployment's `cooldown_time` is set to 0 or 0.00, don't run cooldown logic for that specific deployment.
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- `tests/litellm/` directory: adding at least 1 test is a hard requirement (see details)
- `make test-unit`
Type
🐛 Bug Fix
✅ Test
Changes