Releases: BerriAI/litellm
v1.73.6.rc.1
What's Changed
- [⚡️ Python SDK Import] - 2 second faster import times by @ishaan-jaff in #12135
- 🧹 Refactor init.py to use a model registry by @ishaan-jaff in #12138
- Revert "🧹 Refactor init.py to use a model registry" by @ishaan-jaff in #12141
- [⚡️ Python SDK import] - reduce python sdk import time by .3s by @ishaan-jaff in #12140
- `/v1/messages` - Remove hardcoded model name on streaming + Tags - enable setting custom header tags by @krrishdholakia in #12131
- UI QA Fixes - prevent team model reset on model add + return team-only models on /v2/model/info + render team member budget correctly by @krrishdholakia in #12144
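For the two Python SDK import-time items above (#12135, #12140), CPython's built-in import profiler is a quick way to verify the speedup locally. A measurement sketch, not part of the release itself:

```shell
# Rough wall-clock check of a bare SDK import
time python -c "import litellm"

# Per-module breakdown via CPython's import profiler (Python 3.7+); output goes to stderr
python -X importtime -c "import litellm" 2> importtime.log
sort -t'|' -k2 -n importtime.log | tail -20   # largest cumulative import times last
```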
Full Changelog: v1.73.6-nightly...v1.73.6.rc.1
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.6.rc.1
```
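Once the container is up, a request to the OpenAI-compatible endpoint is a quick sanity check. A minimal sketch; the `sk-1234` key and `gpt-3.5-turbo` model are placeholders you'd configure via `LITELLM_MASTER_KEY` and your model setup (UI or config.yaml):

```shell
curl http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer sk-1234" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "ping"}]
    }'
```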
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 207.64485716560858 | 6.320692412920188 | 0.0 | 1890 | 0 | 167.53384199989796 | 1605.5699650000292 |
Aggregated | Passed ✅ | 190.0 | 207.64485716560858 | 6.320692412920188 | 0.0 | 1890 | 0 | 167.53384199989796 | 1605.5699650000292 |
v1.73.6-nightly
Full Changelog: v1.73.6.rc-draft...v1.73.6-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.6-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 210.86720644249243 | 6.277304583930227 | 0.0 | 1878 | 0 | 166.350910999995 | 6235.332546000023 |
Aggregated | Passed ✅ | 190.0 | 210.86720644249243 | 6.277304583930227 | 0.0 | 1878 | 0 | 166.350910999995 | 6235.332546000023 |
What's Changed
- fix(proxy): Fix test_mock_create_audio_file by adding managed_files hook by @colesmcintosh in #12072
- Enhance CircleCI integration in LLM translation testing workflow by @colesmcintosh in #12041
- Inkeep searchbar and chat added to the Docs by @NANDINI-star in #12030
- [Fix] Redis - Add better debugging to see what variables are set by @ishaan-jaff in #12073
- Fix today selector date mutation bug in dashboard components by @colesmcintosh in #12042
- Responses API - Add reasoning content support for non-OpenAI providers by @ryan-castner in #12055
- Raise clearer error on anthropic unified route + add new `new_key` param for regenerating key by @krrishdholakia in #12087
- Refactor: bedrock passthrough fixes - migrate to Passthrough SDK by @krrishdholakia in #12089
- Fix Azure-OpenAI Vision API Compliance by @davis-featherstone in #12075
- [Bug Fix] Bedrock Guardrails - Ensure PII Masking is applied on response streaming or non streaming content when using post call by @ishaan-jaff in #12086
- fix(docs): Remove unused dotenv dependency from docusaurus config by @colesmcintosh in #12102
- [Fix] MCP - Ensure internal users can access /mcp and /mcp/ routes by @ishaan-jaff in #12106
- fix: handle provider_config type error in passthrough error handler by @colesmcintosh in #12101
- Add o3 and o4-mini deep research models by @krrishdholakia in #12109
- [Bug Fix] Anthropic - Token Usage Null Handling in calculate_usage by @Gum-Joe in #12068
- fix: change cost calculation logs from INFO to DEBUG level by @colesmcintosh in #12112
- fix: set logger levels based on LITELLM_LOG environment variable by @colesmcintosh in #12111
- [Feat] Add Bridge from generateContent <> /chat/completions by @ishaan-jaff in #12081
- [Docs] - Show how to use fallbacks with audio transcriptions endpoints by @ishaan-jaff in #12115
- [Bug Fix] Fix handling str, bool types for `mock_testing_fallbacks` on router using /audio endpoints by @ishaan-jaff in #12117
- Adding Feature: Palo Alto Networks Prisma AIRS Guardrail by @jroberts2600 in #12116
- [Bug Fix] Exception mapping for context window exceeded - should catch anthropic exceptions by @ishaan-jaff in #12113
- docs(GEMINI.md): add development guidelines and architecture overview by @colesmcintosh in #12035
- [Bug fix] Router - handle cooldown_time = 0 for deployments by @ishaan-jaff in #12108
- [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM by @ishaan-jaff in #12119
- Revert "fix: set logger levels based on LITELLM_LOG environment variable" by @ishaan-jaff in #12122
- Fix Braintrust integration: Adds model to metadata to calculate cost and corrects docs by @ohmeow in #12022
- [Fix] Change Message init type annotation to support other roles by @amarrella in #11942
- Add "Get Code" Feature by @NANDINI-star in #11629
- Bedrock Passthrough cost tracking (`/invoke` + `/converse` routes - streaming + non-streaming) by @krrishdholakia in #12123
- feat: add local LLM translation testing with artifact generation by @colesmcintosh in #12120
- [Feat] introduce new environment variable NO_REDOC to opt-out Redoc by @zhangyoufu in #12092
- Fix user-team association issues in LiteLLM proxy by @colesmcintosh in #12082
- feat: enhance redaction functionality for EmbeddingResponse by @bougou in #12088
- De-duplicate models in team settings dropdown by @NANDINI-star in #12074
- Add Azure OpenAI assistant features cost tracking by @colesmcintosh in #12045
- Remove duplicated entry in logs on key cache update by @Mte90 in #12032
- Update model_prices_and_context_window.json by @codeugar in #11972
- Litellm batch api background cost calc by @krrishdholakia in #12125
- Selecting 'test connect' resets the public model name when selecting an azure model by @NANDINI-star in #11713
- [Bug Fix] Invite links email should contain the correct invite id by @ishaan-jaff in #12130
- fix example config.yaml in claude code tutorial by @glgh in #12133
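Following up on the `NO_REDOC` item above (#12092): opting out of the Redoc docs page should only require setting the new environment variable on the container. A sketch, assuming the flag accepts a truthy value:

```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -e NO_REDOC=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.6-nightly
```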
New Contributors
- @ryan-castner made their first contribution in #12055
- @davis-featherstone made their first contribution in #12075
- @Gum-Joe made their first contribution in #12068
- @jroberts2600 made their first contribution in #12116
- @ohmeow made their first contribution in #12022
- @amarrella made their first contribution in #11942
- @zhangyoufu made their first contribution in #12092
- @bougou made their first contribution in #12088
- @codeugar made their first contribution in #11972
- @glgh made their first contribution in #12133
Full Changelog: v1.73.2-nightly...v1.73.6-nightly
v1.73.6.rc-draft
What's Changed
- Fix SambaNova 'created' field validation error - handle float timestamps by @neubig in #11971
- Docs - Add Recommended Machine Specifications by @ishaan-jaff in #11980
- fix: make response api support Azure Authentication method by @hsuyuming in #11941
- feat: add Last Success column to health check table by @colesmcintosh in #11903
- Add GitHub Actions workflow for LLM translation testing artifacts by @colesmcintosh in #11780
- Fix markdown table not rendering properly by @mukesh-dream11 in #11969
- [Fix] - Check HTTP_PROXY vars in networking requests by @ishaan-jaff in #11947
- Proxy UI MCP Auth passthrough by @wagnerjt in #11968
- fix unrecognised parameter reasoning_effort by @Shankyg in #11838
- Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space by @cbjuan in #11854
- [Bug Fix] Perplexity - LiteLLM doesn't support 'web_search_options' for Perplexity's Sonar Pro model by @ishaan-jaff in #11983
- feat: implement Perplexity citation tokens and search queries cost calculation by @colesmcintosh in #11938
- [Feat] Enterprise - Allow dynamically disabling callbacks in request headers by @ishaan-jaff in #11985
- Add Mistral 3.2 24B to model mapping by @colesmcintosh in #11926
- [Feat] Add List Callbacks API Endpoint by @ishaan-jaff in #11987
- fix: fix test_get_azure_ad_token_with_oidc_token testcase issue by @hsuyuming in #11988
- [Bug Fix] Bedrock Guardrail - Don't raise exception on intervene action by @ishaan-jaff in #11875
- VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
- Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
- Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
- #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
- Add Azure o3-pro Pricing by @marty-sullivan in #11990
- [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
- [Fix] Magistral small system prompt diverges too much from the official recommendation by @ishaan-jaff in #12007
- Refactor unpack_defs to use iterative approach instead of recursion by @colesmcintosh in #12017
- [Feat] Add OpenAI Search Vector Store Operation by @ishaan-jaff in #12018
- [Feat] OpenAI/Azure OpenAI - Add support for creating vector stores on LiteLLM by @ishaan-jaff in #12021
- docs(CLAUDE.md): add development guidance and architecture overview for Claude Code by @colesmcintosh in #12011
- Teams - Support default key expiry + UI - support enforcing access for members of specific SSO Group by @krrishdholakia in #12023
- Anthropic `/v1/messages` - Custom LLM Server support + Azure Responses api via chat completion support by @krrishdholakia in #12016
- Update mistral 'supports_response_schema' field + Fix ollama embedding by @krrishdholakia in #12024
- [Fix] Router - cooldown time, allow using dynamic cooldown time for a specific deployment by @ishaan-jaff in #12037
- Usage Page: Aggregate the data across all pages by @NANDINI-star in #12033
- [Feat] Add initial endpoints for using Gemini SDK (gemini-cli) with LiteLLM by @ishaan-jaff in #12040
- Add Elasticsearch Logging Tutorial by @colesmcintosh in #11761
- [Feat] Add Support for calling Gemini/Vertex models in their native format by @ishaan-jaff in #12046
- [Feat] Add gemini-cli support - call VertexAI models through LiteLLM Native gemini routes by @ishaan-jaff in #12053
- Managed Files + Batches - filter deployments to only those where file was written + save all model file id mappings in DB (prev just 1st one) by @krrishdholakia in #12048
- Filter team-only models from routing logic for non-team calls + Support List Batches with target model name specified by @krrishdholakia in #12049
- [Feat] gemini-cli integration - Add Logging + Cost tracking for stream + non-stream Vertex / Google AI Studio routes by @ishaan-jaff in #12058
- Fix Elasticsearch tutorial image rendering by @colesmcintosh in #12050
- [Fix] Allow using HTTP_ Proxy settings with trust_env by @ishaan-jaff in #12066
- fix(proxy): Fix test_mock_create_audio_file by adding managed_files hook by @colesmcintosh in #12072
- Enhance CircleCI integration in LLM translation testing workflow by @colesmcintosh in #12041
- Inkeep searchbar and chat added to the Docs by @NANDINI-star in #12030
- [Fix] Redis - Add better debugging to see what variables are set by @ishaan-jaff in #12073
- Fix today selector date mutation bug in dashboard components by @colesmcintosh in #12042
- Responses API - Add reasoning content support for non-OpenAI providers by @ryan-castner in #12055
- Raise clearer error on anthropic unified route + add new `new_key` param for regenerating key by @krrishdholakia in #12087
- Refactor: bedrock passthrough fixes - migrate to Passthrough SDK by @krrishdholakia in #12089
- Fix Azure-OpenAI Vision API Compliance by @davis-featherstone in #12075
- [Bug Fix] Bedrock Guardrails - Ensure PII Masking is applied on response streaming or non streaming content when using post call by @ishaan-jaff in #12086
- fix(docs): Remove unused dotenv dependency from docusaurus config by @colesmcintosh in #12102
- [Fix] MCP - Ensure internal users can access /mcp and /mcp/ routes by @ishaan-jaff in #12106
- fix: handle provider_config type error in passthrough error handler by @colesmcintosh in #12101
- Add o3 and o4-mini deep research models by @krrishdholakia in #12109
- [Bug Fix] Anthropic - Token Usage Null Handling in calculate_usage by @Gum-Joe in #12068
- fix: change cost calculation logs from INFO to DEBUG level by @colesmcintosh in #12112
- fix: set logger levels based on LITELLM_LOG environment variable by @colesmcintosh in #12111
- [Feat] Add Bridge from generateContent <> /chat/completions by @ishaan-jaff in #12081
- [Docs] - Show how to use fallbacks with audio transcriptions endpoints by @ishaan-jaff in #12115
- [Bug Fix] Fix handling str, bool types for `mock_testing_fallbacks` on router using /audio endpoints by @ishaan-jaff in #12117
- Adding Feature: Palo Alto Networks Prisma AIRS Guardrail by @jroberts2600 in #12116
- [Bug Fix] Exception mapping for context window exceeded - should catch anthropic exceptions by @ishaan-jaff in #12113
- docs(GEMINI.md): add development guidelines and architecture overview by @colesmcintosh in #12035
- [Bug fix] Router - handle cooldown_time = 0 for deployments by @ishaan-jaff in #12108
- [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM by @ishaan-jaff in #12119
- Revert "fix: set logger levels based on LITELLM_LOG environment variable" by @ishaan-jaff in #12122
- Fix Braintrust integration: Adds model to metadata to calculate cost and corrects docs by @ohmeow in #12022
- [Fix] Change Message init type annotation to support other roles by @amarrella in #11942
- Add "Get Code" Feature by @NANDINI-star in #11629
- Bedrock Passthrough cost tracking (`/invoke` + `/converse` routes - streaming + non-streaming) by @krrishdholakia in #12123
- feat: add local LLM translation testing with artifact generation by @colesmcintosh in #12120
v1.73.2-nightly
What's Changed
- VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
- Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
- Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
- #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
- Add Azure o3-pro Pricing by @marty-sullivan in #11990
- [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
- [Fix] Magistral small system prompt diverges too much from the official recommendation by @ishaan-jaff in #12007
- Refactor unpack_defs to use iterative approach instead of recursion by @colesmcintosh in #12017
- [Feat] Add OpenAI Search Vector Store Operation by @ishaan-jaff in #12018
- [Feat] OpenAI/Azure OpenAI - Add support for creating vector stores on LiteLLM by @ishaan-jaff in #12021
- docs(CLAUDE.md): add development guidance and architecture overview for Claude Code by @colesmcintosh in #12011
- Teams - Support default key expiry + UI - support enforcing access for members of specific SSO Group by @krrishdholakia in #12023
- Anthropic `/v1/messages` - Custom LLM Server support + Azure Responses api via chat completion support by @krrishdholakia in #12016
- Update mistral 'supports_response_schema' field + Fix ollama embedding by @krrishdholakia in #12024
- [Fix] Router - cooldown time, allow using dynamic cooldown time for a specific deployment by @ishaan-jaff in #12037
- Usage Page: Aggregate the data across all pages by @NANDINI-star in #12033
- [Feat] Add initial endpoints for using Gemini SDK (gemini-cli) with LiteLLM by @ishaan-jaff in #12040
- Add Elasticsearch Logging Tutorial by @colesmcintosh in #11761
- [Feat] Add Support for calling Gemini/Vertex models in their native format by @ishaan-jaff in #12046
- [Feat] Add gemini-cli support - call VertexAI models through LiteLLM Native gemini routes by @ishaan-jaff in #12053
- Managed Files + Batches - filter deployments to only those where file was written + save all model file id mappings in DB (prev just 1st one) by @krrishdholakia in #12048
- Filter team-only models from routing logic for non-team calls + Support List Batches with target model name specified by @krrishdholakia in #12049
- [Feat] gemini-cli integration - Add Logging + Cost tracking for stream + non-stream Vertex / Google AI Studio routes by @ishaan-jaff in #12058
- Fix Elasticsearch tutorial image rendering by @colesmcintosh in #12050
- [Fix] Allow using HTTP_ Proxy settings with trust_env by @ishaan-jaff in #12066
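Re: the HTTP proxy items above (#11947, #12066) — with these fixes, routing the LiteLLM proxy's outbound provider calls through a corporate proxy comes down to the standard environment variables. A sketch; the proxy host is a placeholder:

```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -e HTTP_PROXY=http://corp-proxy.internal:3128 \
    -e HTTPS_PROXY=http://corp-proxy.internal:3128 \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.2-nightly
```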
Full Changelog: v1.73.1-nightly...v1.73.2-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.2-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 227.43264090448105 | 6.195122505342355 | 0.0 | 1853 | 0 | 184.98149800007013 | 1558.9818880000053 |
Aggregated | Passed ✅ | 210.0 | 227.43264090448105 | 6.195122505342355 | 0.0 | 1853 | 0 | 184.98149800007013 | 1558.9818880000053 |
v1.73.0-stable
Full Changelog: v1.73.0.rc.1...v1.73.0-stable
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.73.0-stable
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 247.8644959408947 | 6.280622823577799 | 0.003344314602544089 | 1878 | 1 | 201.14030100000946 | 1471.8547979999812 |
Aggregated | Passed ✅ | 230.0 | 247.8644959408947 | 6.280622823577799 | 0.003344314602544089 | 1878 | 1 | 201.14030100000946 | 1471.8547979999812 |
v1.73.2.dev1
What's Changed
- VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
- Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
- Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
- #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
- Add Azure o3-pro Pricing by @marty-sullivan in #11990
- [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
Full Changelog: v1.73.1-nightly...v1.73.2.dev1
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.2.dev1
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 267.8382003869747 | 6.2096771619800935 | 0.0 | 1858 | 0 | 214.47131599995828 | 1466.6541370000346 |
Aggregated | Passed ✅ | 250.0 | 267.8382003869747 | 6.2096771619800935 | 0.0 | 1858 | 0 | 214.47131599995828 | 1466.6541370000346 |
v1.73.1-nightly
What's Changed
- Fix SambaNova 'created' field validation error - handle float timestamps by @neubig in #11971
- Docs - Add Recommended Machine Specifications by @ishaan-jaff in #11980
- fix: make response api support Azure Authentication method by @hsuyuming in #11941
- feat: add Last Success column to health check table by @colesmcintosh in #11903
- Add GitHub Actions workflow for LLM translation testing artifacts by @colesmcintosh in #11780
- Fix markdown table not rendering properly by @mukesh-dream11 in #11969
- [Fix] - Check HTTP_PROXY vars in networking requests by @ishaan-jaff in #11947
- Proxy UI MCP Auth passthrough by @wagnerjt in #11968
- fix unrecognised parameter reasoning_effort by @Shankyg in #11838
- Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space by @cbjuan in #11854
- [Bug Fix] Perplexity - LiteLLM doesn't support 'web_search_options' for Perplexity's Sonar Pro model by @ishaan-jaff in #11983
- feat: implement Perplexity citation tokens and search queries cost calculation by @colesmcintosh in #11938
- [Feat] Enterprise - Allow dynamically disabling callbacks in request headers by @ishaan-jaff in #11985
- Add Mistral 3.2 24B to model mapping by @colesmcintosh in #11926
- [Feat] Add List Callbacks API Endpoint by @ishaan-jaff in #11987
- fix: fix test_get_azure_ad_token_with_oidc_token testcase issue by @hsuyuming in #11988
- [Bug Fix] Bedrock Guardrail - Don't raise exception on intervene action by @ishaan-jaff in #11875
New Contributors
- @mukesh-dream11 made their first contribution in #11969
- @cbjuan made their first contribution in #11854
Full Changelog: v1.73.0.rc.1...v1.73.1-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.1-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 269.80153099125215 | 6.123419901585826 | 0.0 | 1829 | 0 | 217.6905329999954 | 1336.1768169999948 |
Aggregated | Passed ✅ | 250.0 | 269.80153099125215 | 6.123419901585826 | 0.0 | 1829 | 0 | 217.6905329999954 | 1336.1768169999948 |
v1.73.0.rc.1
What's Changed
- (Tutorial) Onboard Users for AI Exploration by @krrishdholakia in #11955
- Management Fixes - don't apply default internal user settings to admins + preserve all model access for teams with empty model list, when team model added + /v2/model/info fixes by @krrishdholakia in #11957
Full Changelog: v1.73.0-nightly...v1.73.0.rc.1
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.0.rc.1
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 240.58426726332922 | 6.145106667198675 | 0.0 | 1838 | 0 | 196.0181700000021 | 1838.010895000025 |
Aggregated | Passed ✅ | 220.0 | 240.58426726332922 | 6.145106667198675 | 0.0 | 1838 | 0 | 196.0181700000021 | 1838.010895000025 |
v1.73.0-nightly
What's Changed
- Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens) by @ervwalter in #11937
- [BugFix] Ollama response_format not working by @ThakeeNathees in #11880
- fix aws bedrock claude tool call index by @jnhyperion in #11842
- fix(acompletion): allow dict for tool_choice argument by @Jannchie in #11860
- [Chore] Check team counts on license when creating new team by @ishaan-jaff in #11943
- [Docs] [Pre-Release] v1.73.0-stable by @ishaan-jaff in #11950
- Show user all models they can call (Across teams) on UI by @krrishdholakia in #11948
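Re: the `tool_choice` fix above (#11860) — the dict form that now passes through is OpenAI's forced-function shape. Shown here as a proxy request for illustration; the key and model name are placeholders:

```shell
curl http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer sk-1234" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Weather in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"]
                }
            }
        }],
        "tool_choice": {"type": "function", "function": {"name": "get_weather"}}
    }'
```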
New Contributors
- @ervwalter made their first contribution in #11937
- @ThakeeNathees made their first contribution in #11880
- @jnhyperion made their first contribution in #11842
- @Jannchie made their first contribution in #11860
Full Changelog: v1.72.9-nightly...v1.73.0-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.73.0-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 252.55233764162196 | 6.182759830384375 | 0.0 | 1850 | 0 | 208.6453730000244 | 1743.1928639999796 |
Aggregated | Passed ✅ | 230.0 | 252.55233764162196 | 6.182759830384375 | 0.0 | 1850 | 0 | 208.6453730000244 | 1743.1928639999796 |
v1.72.9-nightly
What's Changed
- [Feat] MCP - Allow connecting to MCP with authentication headers + Allow clients to specify MCP headers (#11890) by @ishaan-jaff in #11891
- [Fix] Networking - allow using CA Bundles by @ishaan-jaff in #11906
- [Feat] Add AWS Bedrock profiles for the APAC region by @lgruen-vcgs in #11883
- bumps the anthropic package by @rinormaloku in #11851
- Add deployment annotations by @InvisibleMan1306 in #11849
- Enhance Mistral API: Add support for parallel tool calls by @njbrake in #11770
- [UI] QA Items for adding pass through endpoints by @ishaan-jaff in #11909
- build(model_prices_and_context_window.json): mark all gemini-2.5 models support pdf input + Set anthropic custom llm provider property by @krrishdholakia in #11907
- fix(proxy_server.py): fix loading ui on custom root path by @krrishdholakia in #11912
- LiteLLM SDK <-> Proxy improvement (don't transform message client-side) + Bedrock - handle `qs:..` in base64 file data + Tag Management - support adding public model names by @krrishdholakia in #11908
- Add success modal for health check responses by @colesmcintosh in #11899
- Volcengine - thinking param support + Azure - handle more gpt custom naming patterns by @krrishdholakia in #11914
- [Feat] Model Cost Map - Add `gemini-2.5-pro` and set `gemini-2.5-pro` supports_reasoning=True by @ishaan-jaff in #11927
- [Feat] UI Allow testing /v1/messages on the Test Key Page by @ishaan-jaff in #11930
- Feat/add delete callback by @jtong99 in #11654
- add ciphers in command and pass to hypercorn for proxy by @frankzye in #11916
- [Bug Fix] Fix model_group tracked for /v1/messages and /moderations by @ishaan-jaff in #11933
- [Bug Fix] Cost tracking and logging via the /v1/messages API are not working when using Claude Code by @ishaan-jaff in #11928
- [Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API by @ishaan-jaff in #11934
- [Feat] UI QA: Pass through endpoints by @ishaan-jaff in #11939
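Several items above touch the `/v1/messages` route (#11930, #11933, #11928). For reference, a minimal Anthropic-format request against the proxy looks roughly like this; the key and model name are placeholders:

```shell
curl http://localhost:4000/v1/messages \
    -H "Authorization: Bearer sk-1234" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}]
    }'
```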
New Contributors
- @lgruen-vcgs made their first contribution in #11883
- @rinormaloku made their first contribution in #11851
- @InvisibleMan1306 made their first contribution in #11849
Full Changelog: v1.72.7-nightly...v1.72.9-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.72.9-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 259.91299670392254 | 6.2187422072270495 | 0.0 | 1861 | 0 | 210.9276310000041 | 1676.9406920000165 |
Aggregated | Passed ✅ | 240.0 | 259.91299670392254 | 6.2187422072270495 | 0.0 | 1861 | 0 | 210.9276310000041 | 1676.9406920000165 |