
Releases: BerriAI/litellm

v1.73.7-nightly

04 Jul 18:58

What's Changed

New Contributors

Full Changelog: v1.73.6.rc.1...v1.73.7-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.7-nightly
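
Once the container is up, you can sanity-check the proxy by calling its OpenAI-compatible /chat/completions route (the same endpoint exercised in the load test results below). A minimal sketch, assuming a model named gpt-4o and a virtual key sk-1234 have already been configured on the proxy (both are placeholders for your own values):

# placeholder key and model name; replace with whatever is configured on your proxy
curl http://localhost:4000/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello from the proxy"}]}'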

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 266.65155370887544 | 6.1803154475053885 | 0.0 | 1848 | 0 | 213.35606200000257 | 1776.5402100000074 |
| Aggregated | Passed ✅ | 240.0 | 266.65155370887544 | 6.1803154475053885 | 0.0 | 1848 | 0 | 213.35606200000257 | 1776.5402100000074 |

v1.73.6.rc.2-nightly

04 Jul 01:24

What's Changed

Full Changelog: v1.73.6.rc-draft...v1.73.6.rc.2-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.6.rc.2-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 253.81756881926228 | 6.177006286627148 | 0.0 | 1848 | 0 | 216.07973900000843 | 1514.1312829999833 |
| Aggregated | Passed ✅ | 240.0 | 253.81756881926228 | 6.177006286627148 | 0.0 | 1848 | 0 | 216.07973900000843 | 1514.1312829999833 |

v1.73.6-stable

04 Jul 01:23

What's Changed

Full Changelog: v1.73.6.rc-draft...v1.73.6-stable

v1.73.7.dev1

02 Jul 01:21

What's Changed

New Contributors

Full Changelog: v1.73.6.rc.1...v1.73.7.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.7.dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 263.64579985844557 | 6.232313439052863 | 0.0 | 1865 | 0 | 210.2366830000051 | 1877.0044299999995 |
| Aggregated | Passed ✅ | 240.0 | 263.64579985844557 | 6.232313439052863 | 0.0 | 1865 | 0 | 210.2366830000051 | 1877.0044299999995 |

v1.73.0.debug_mem

02 Jul 15:51

Full Changelog: v1.73.0-stable...v1.73.0.debug_mem

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.0.debug_mem

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 248.77890883802746 | 6.167721429610798 | 0.0 | 1846 | 0 | 196.06082799998603 | 1224.1804999999886 |
| Aggregated | Passed ✅ | 230.0 | 248.77890883802746 | 6.167721429610798 | 0.0 | 1846 | 0 | 196.06082799998603 | 1224.1804999999886 |

v1.73.6.rc.1

29 Jun 06:19

What's Changed

  • [⚡️ Python SDK Import] - 2 second faster import times by @ishaan-jaff in #12135 (see the timing snippet after this list)
  • 🧹 Refactor __init__.py to use a model registry by @ishaan-jaff in #12138
  • Revert "🧹 Refactor __init__.py to use a model registry" by @ishaan-jaff in #12141
  • [⚡️ Python SDK import] - reduce python sdk import time by 0.3s by @ishaan-jaff in #12140
  • /v1/messages - Remove hardcoded model name on streaming + Tags - enable setting custom header tags by @krrishdholakia in #12131
  • UI QA Fixes - prevent team model reset on model add + return team-only models on /v2/model/info + render team member budget correctly by @krrishdholakia in #12144
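
The Python SDK import-time improvements above (#12135, #12140) can be spot-checked locally. A rough timing sketch, assuming litellm is installed in the current Python environment:

# wall-clock measurement of `import litellm`; run before and after upgrading to compare
python -c "import time; t0 = time.perf_counter(); import litellm; print(f'import litellm: {time.perf_counter() - t0:.2f}s')"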

Full Changelog: v1.73.6-nightly...v1.73.6.rc.1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.6.rc.1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 207.64485716560858 | 6.320692412920188 | 0.0 | 1890 | 0 | 167.53384199989796 | 1605.5699650000292 |
| Aggregated | Passed ✅ | 190.0 | 207.64485716560858 | 6.320692412920188 | 0.0 | 1890 | 0 | 167.53384199989796 | 1605.5699650000292 |

v1.73.6-nightly

28 Jun 20:54

Full Changelog: v1.73.6.rc-draft...v1.73.6-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.6-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 210.86720644249243 | 6.277304583930227 | 0.0 | 1878 | 0 | 166.350910999995 | 6235.332546000023 |
| Aggregated | Passed ✅ | 190.0 | 210.86720644249243 | 6.277304583930227 | 0.0 | 1878 | 0 | 166.350910999995 | 6235.332546000023 |

What's Changed

New Contributors

Full Changelog: v1.73.2-nightly...v1.73.6-nightly

v1.73.6.rc-draft

28 Jun 18:09
Pre-release

What's Changed

  • Fix SambaNova 'created' field validation error - handle float timestamps by @neubig in #11971
  • Docs - Add Recommended Machine Specifications by @ishaan-jaff in #11980
  • fix: make response api support Azure Authentication method by @hsuyuming in #11941
  • feat: add Last Success column to health check table by @colesmcintosh in #11903
  • Add GitHub Actions workflow for LLM translation testing artifacts by @colesmcintosh in #11780
  • Fix markdown table not rendering properly by @mukesh-dream11 in #11969
  • [Fix] - Check HTTP_PROXY vars in networking requests by @ishaan-jaff in #11947
  • Proxy UI MCP Auth passthrough by @wagnerjt in #11968
  • fix unrecognised parameter reasoning_effort by @Shankyg in #11838
  • Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space by @cbjuan in #11854
  • [Bug Fix] Perplexity - LiteLLM doesn't support 'web_search_options' for Perplexity's Sonar Pro model by @ishaan-jaff in #11983
  • feat: implement Perplexity citation tokens and search queries cost calculation by @colesmcintosh in #11938
  • [Feat] Enterprise - Allow dynamically disabling callbacks in request headers by @ishaan-jaff in #11985
  • Add Mistral 3.2 24B to model mapping by @colesmcintosh in #11926
  • [Feat] Add List Callbacks API Endpoint by @ishaan-jaff in #11987
  • fix: fix test_get_azure_ad_token_with_oidc_token testcase issue by @hsuyuming in #11988
  • [Bug Fix] Bedrock Guardrail - Don't raise exception on intervene action by @ishaan-jaff in #11875
  • VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
  • Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
  • Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
  • #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
  • Add Azure o3-pro Pricing by @marty-sullivan in #11990
  • [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
  • [Fix] Magistral small system prompt diverges too much from the official recommendation by @ishaan-jaff in #12007
  • Refactor unpack_defs to use iterative approach instead of recursion by @colesmcintosh in #12017
  • [Feat] Add OpenAI Search Vector Store Operation by @ishaan-jaff in #12018
  • [Feat] OpenAI/Azure OpenAI - Add support for creating vector stores on LiteLLM by @ishaan-jaff in #12021
  • docs(CLAUDE.md): add development guidance and architecture overview for Claude Code by @colesmcintosh in #12011
  • Teams - Support default key expiry + UI - support enforcing access for members of specific SSO Group by @krrishdholakia in #12023
  • Anthropic /v1/messages - Custom LLM Server support + Azure Responses api via chat completion support by @krrishdholakia in #12016
  • Update mistral 'supports_response_schema' field + Fix ollama embedding by @krrishdholakia in #12024
  • [Fix] Router - cooldown time, allow using dynamic cooldown time for a specific deployment by @ishaan-jaff in #12037
  • Usage Page: Aggregate the data across all pages by @NANDINI-star in #12033
  • [Feat] Add initial endpoints for using Gemini SDK (gemini-cli) with LiteLLM by @ishaan-jaff in #12040
  • Add Elasticsearch Logging Tutorial by @colesmcintosh in #11761
  • [Feat] Add Support for calling Gemini/Vertex models in their native format by @ishaan-jaff in #12046
  • [Feat] Add gemini-cli support - call VertexAI models through LiteLLM Native gemini routes by @ishaan-jaff in #12053
  • Managed Files + Batches - filter deployments to only those where file was written + save all model file id mappings in DB (prev just 1st one) by @krrishdholakia in #12048
  • Filter team-only models from routing logic for non-team calls + Support List Batches with target model name specified by @krrishdholakia in #12049
  • [Feat] gemini-cli integration - Add Logging + Cost tracking for stream + non-stream Vertex / Google AI Studio routes by @ishaan-jaff in #12058
  • Fix Elasticsearch tutorial image rendering by @colesmcintosh in #12050
  • [Fix] Allow using HTTP_ Proxy settings with trust_env by @ishaan-jaff in #12066 (a docker-level sketch follows this list)
  • fix(proxy): Fix test_mock_create_audio_file by adding managed_files hook by @colesmcintosh in #12072
  • Enhance CircleCI integration in LLM translation testing workflow by @colesmcintosh in #12041
  • Inkeep searchbar and chat added to the Docs by @NANDINI-star in #12030
  • [Fix] Redis - Add better debugging to see what variables are set by @ishaan-jaff in #12073
  • Fix today selector date mutation bug in dashboard components by @colesmcintosh in #12042
  • Responses API - Add reasoning content support for non-OpenAI providers by @ryan-castner in #12055
  • Litellm dev 06 26 2025 p1 by @krrishdholakia in #12087
  • Refactor: bedrock passthrough fixes - migrate to Passthrough SDK by @krrishdholakia in #12089
  • Fix Azure-OpenAI Vision API Compliance by @davis-featherstone in #12075
  • [Bug Fix] Bedrock Guardrails - Ensure PII Masking is applied on response streaming or non streaming content when using post call by @ishaan-jaff in #12086
  • fix(docs): Remove unused dotenv dependency from docusaurus config by @colesmcintosh in #12102
  • [Fix] MCP - Ensure internal users can access /mcp and /mcp/ routes by @ishaan-jaff in #12106
  • fix: handle provider_config type error in passthrough error handler by @colesmcintosh in #12101
  • Add o3 and o4-mini deep research models by @krrishdholakia in #12109
  • [Bug Fix] Anthropic - Token Usage Null Handling in calculate_usage by @Gum-Joe in #12068
  • fix: change cost calculation logs from INFO to DEBUG level by @colesmcintosh in #12112
  • fix: set logger levels based on LITELLM_LOG environment variable by @colesmcintosh in #12111
  • [Feat] Add Bridge from generateContent <> /chat/completions by @ishaan-jaff in #12081
  • [Docs] - Show how to use fallbacks with audio transcriptions endpoints by @ishaan-jaff in #12115
  • [Bug Fix] Fix handling str, bool types for mock_testing_fallbacks on router using /audio endpoints by @ishaan-jaff in #12117
  • Adding Feature: Palo Alto Networks Prisma AIRS Guardrail by @jroberts2600 in #12116
  • [Bug Fix] Exception mapping for context window exceeded - should catch anthropic exceptions by @ishaan-jaff in #12113
  • docs(GEMINI.md): add development guidelines and architecture overview by @colesmcintosh in #12035
  • [Bug fix] Router - handle cooldown_time = 0 for deployments by @ishaan-jaff in #12108
  • [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM by @ishaan-jaff in #12119
  • Revert "fix: set logger levels based on LITELLM_LOG environment variable" by @ishaan-jaff in #12122
  • Fix Braintrust integration: Adds model to metadata to calculate cost and corrects docs by @ohmeow in #12022
  • [Fix] Change Message init type annotation to support other roles by @amarrella in #11942
  • Add "Get Code" Feature by @NANDINI-star in #11629
  • Bedrock Passthrough cost tracking (/invoke + /converse routes - streaming + non-streaming) by @krrishdholakia in #12123
  • feat: ad...
Read more
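
Two of the fixes in this release touch outbound proxy handling (#11947, #12066). A hedged, docker-level sketch of routing the LiteLLM container's outbound LLM traffic through a corporate proxy, assuming the image honors the standard HTTP_PROXY / HTTPS_PROXY / NO_PROXY variables; the proxy address below is a placeholder:

# assumes the image reads standard proxy env vars; proxy.internal:3128 is a placeholder
docker run \
-e STORE_MODEL_IN_DB=True \
-e HTTP_PROXY=http://proxy.internal:3128 \
-e HTTPS_PROXY=http://proxy.internal:3128 \
-e NO_PROXY=localhost,127.0.0.1 \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.6-nightly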

v1.73.2-nightly

26 Jun 20:06

What's Changed

  • VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
  • Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
  • Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
  • #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
  • Add Azure o3-pro Pricing by @marty-sullivan in #11990
  • [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
  • [Fix] Magistral small system prompt diverges too much from the official recommendation by @ishaan-jaff in #12007
  • Refactor unpack_defs to use iterative approach instead of recursion by @colesmcintosh in #12017
  • [Feat] Add OpenAI Search Vector Store Operation by @ishaan-jaff in #12018
  • [Feat] OpenAI/Azure OpenAI - Add support for creating vector stores on LiteLLM by @ishaan-jaff in #12021
  • docs(CLAUDE.md): add development guidance and architecture overview for Claude Code by @colesmcintosh in #12011
  • Teams - Support default key expiry + UI - support enforcing access for members of specific SSO Group by @krrishdholakia in #12023
  • Anthropic /v1/messages - Custom LLM Server support + Azure Responses api via chat completion support by @krrishdholakia in #12016
  • Update mistral 'supports_response_schema' field + Fix ollama embedding by @krrishdholakia in #12024
  • [Fix] Router - cooldown time, allow using dynamic cooldown time for a specific deployment by @ishaan-jaff in #12037
  • Usage Page: Aggregate the data across all pages by @NANDINI-star in #12033
  • [Feat] Add initial endpoints for using Gemini SDK (gemini-cli) with LiteLLM by @ishaan-jaff in #12040
  • Add Elasticsearch Logging Tutorial by @colesmcintosh in #11761
  • [Feat] Add Support for calling Gemini/Vertex models in their native format by @ishaan-jaff in #12046
  • [Feat] Add gemini-cli support - call VertexAI models through LiteLLM Native gemini routes by @ishaan-jaff in #12053
  • Managed Files + Batches - filter deployments to only those where file was written + save all model file id mappings in DB (prev just 1st one) by @krrishdholakia in #12048
  • Filter team-only models from routing logic for non-team calls + Support List Batches with target model name specified by @krrishdholakia in #12049
  • [Feat] gemini-cli integration - Add Logging + Cost tracking for stream + non-stream Vertex / Google AI Studio routes by @ishaan-jaff in #12058
  • Fix Elasticsearch tutorial image rendering by @colesmcintosh in #12050
  • [Fix] Allow using HTTP_ Proxy settings with trust_env by @ishaan-jaff in #12066

Full Changelog: v1.73.1-nightly...v1.73.2-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.2-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 210.0 | 227.43264090448105 | 6.195122505342355 | 0.0 | 1853 | 0 | 184.98149800007013 | 1558.9818880000053 |
| Aggregated | Passed ✅ | 210.0 | 227.43264090448105 | 6.195122505342355 | 0.0 | 1853 | 0 | 184.98149800007013 | 1558.9818880000053 |

v1.73.0-stable

26 Jun 23:07

Full Changelog: v1.73.0.rc.1...v1.73.0-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.73.0-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 247.8644959408947 | 6.280622823577799 | 0.003344314602544089 | 1878 | 1 | 201.14030100000946 | 1471.8547979999812 |
| Aggregated | Passed ✅ | 230.0 | 247.8644959408947 | 6.280622823577799 | 0.003344314602544089 | 1878 | 1 | 201.14030100000946 | 1471.8547979999812 |