
Add OpenAI o3 & o4-mini #10065


Merged

Conversation

@PeterDaveHello (Contributor) commented Apr 16, 2025

Title

Add OpenAI o3 & 4o-mini

Reference:

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory; adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests (make test-unit) - see https://docs.litellm.ai/docs/extras/contributing_code
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🆕 New Feature

Changes

Added:

  • "o3"
  • "o3-2025-04-16"
  • "o4-mini"
  • "o4-mini-2025-04-16"


@PeterDaveHello (Contributor, Author)

cc @krrishdholakia @ishaan-jaff

"max_tokens": 100000,
"max_input_tokens": 200000,
"max_output_tokens": 100000,
"input_cost_per_token": 0.00001,
@krrishdholakia (Contributor) commented Apr 16, 2025 (review comment on the diff excerpt above)


Thanks @PeterDaveHello - can you please use scientific notation (e.g. 1e-6) and mention the supported_endpoints, modalities, and output_modalities for these models?

@PeterDaveHello (Contributor, Author)

Certainly, I’m happy to help with that. Could you also please tell me when I should use scientific notation here?
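
For illustration, a rough sketch of what one of these entries could look like with scientific notation and the extra fields requested above, written as a Python dict; only the token limits and input cost mirror the diff excerpt, while the output cost, endpoint, and modality values are assumptions rather than the merged JSON:

# Hypothetical o3 entry; values marked "assumed" are illustrative placeholders.
o3_entry = {
    "max_tokens": 100000,
    "max_input_tokens": 200000,
    "max_output_tokens": 100000,
    "input_cost_per_token": 1e-5,   # same value as 0.00001, just in scientific notation
    "output_cost_per_token": 4e-5,  # assumed list price; verify against OpenAI pricing
    "litellm_provider": "openai",
    "mode": "chat",
    "supported_endpoints": ["/v1/chat/completions", "/v1/responses"],  # assumed
    "supported_modalities": ["text", "image"],                         # assumed
    "supported_output_modalities": ["text"],                           # assumed
}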

@elabbarw (Contributor)

Hey guys, can you please include azure/ models as well? Always assume there's an Azure-hosted model available shortly after OpenAI 😄

@krrishdholakia merged commit 5c078af into BerriAI:main on Apr 16, 2025 (5 checks passed)
@PeterDaveHello deleted the OpenAI-o3-and-4o-mini branch on April 16, 2025 at 19:40
@krrishdholakia (Contributor)

Hey @elabbarw good point - do you see the pricing for the azure models though?

@PeterDaveHello (Contributor, Author)

Their list price should be the same.

@elabbarw (Contributor)

Prices have historically always been the same. So far I've got the 4.1 models live in Sweden Central and the US regions. I'm expecting to see the new o-series models soon.
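
Until azure/ entries exist in the cost map, one possible stopgap is to register the alias locally; this is only a sketch, assuming litellm.register_model accepts the same cost-map schema and that the Azure list prices match OpenAI's as discussed above:

import litellm

# Sketch: register an Azure-hosted alias with the same assumed list prices.
litellm.register_model({
    "azure/o3": {
        "max_tokens": 100000,
        "max_input_tokens": 200000,
        "max_output_tokens": 100000,
        "input_cost_per_token": 1e-5,
        "output_cost_per_token": 4e-5,   # assumed, same as the OpenAI entry
        "litellm_provider": "azure",
        "mode": "chat",
    },
})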

@jrkropp commented Apr 16, 2025

Could you also make sure they support 'reasoning_effort'? I added the models manually and get the error below when trying to use reasoning_effort. I'm creating a model called o4-mini-high, which is technically just o4-mini with reasoning_effort set to high.


400: litellm.UnsupportedParamsError: openai does not support parameters: ['reasoning_effort'], for model=o4-mini. To drop these, set litellm.drop_params=True or for proxy:

litellm_settings:
  drop_params: true

If you want to use these params dynamically send allowed_openai_params=['reasoning_effort'] in your request. Received Model Group=o4-mini-high
Available Model Group Fallbacks=None
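
For reference, the two workarounds named in the error message, as a short sketch (the o4-mini-high alias is the custom one described above; you would use one option or the other):

import litellm
from litellm import completion

# Option 1 (global): drop unsupported params, per the error message
# litellm.drop_params = True

# Option 2 (per request): explicitly allow the param on this call
response = completion(
    model="o4-mini",
    messages=[{"role": "user", "content": "Hello"}],
    reasoning_effort="high",
    allowed_openai_params=["reasoning_effort"],
)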

@krrishdholakia (Contributor)

Ack @wagnerjt
Hey @jrkropp, I think we need to update our o-series check (it currently explicitly checks for 'o1' and 'o3' in the model string)
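
A rough illustration of the kind of substring check being described; the helper name and exact logic here are assumptions, not litellm's actual implementation:

def is_o_series_model(model: str) -> bool:
    # Hypothetical sketch: the reported issue is that only "o1" and "o3" were
    # matched, so "o4-mini" fell through; adding "o4" to the prefixes fixes that.
    return any(prefix in model for prefix in ("o1", "o3", "o4"))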

@PeterDaveHello changed the title from "Add OpenAI o3 & 4o-mini" to "Add OpenAI o3 & o4-mini" on Apr 17, 2025
@miguelwon

Hi,
I'm getting the error shown below with "o3", but only when stream=True. My organization is verified, and if I make the request with the openai package directly it works fine. I'm on the latest version, 1.66.2.

from litellm import completion
response = completion(
    model="o3",
    messages=[{"role":"user","content":"Hello"}],
    stream=True
)
for chunk in response:
    print(chunk)
---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:724, in OpenAIChatCompletion.completion(self, model_response, timeout, optional_params, litellm_params, logging_obj, model, messages, print_verbose, api_key, api_base, api_version, dynamic_params, azure_ad_token, acompletion, logger_fn, headers, custom_prompt_dict, client, organization, custom_llm_provider, drop_params)
    723             else:
--> 724                 raise e
    725 except OpenAIError as e:

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:607, in OpenAIChatCompletion.completion(self, model_response, timeout, optional_params, litellm_params, logging_obj, model, messages, print_verbose, api_key, api_base, api_version, dynamic_params, azure_ad_token, acompletion, logger_fn, headers, custom_prompt_dict, client, organization, custom_llm_provider, drop_params)
    606 elif stream is True and fake_stream is False:
--> 607     return self.streaming(
    608         logging_obj=logging_obj,
    609         headers=headers,
    610         data=data,
    611         model=model,
    612         api_base=api_base,
    613         api_key=api_key,
    614         api_version=api_version,
    615         timeout=timeout,
    616         client=client,
    617         max_retries=max_retries,
    618         organization=organization,
    619         stream_options=stream_options,
    620     )
    621 else:

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:885, in OpenAIChatCompletion.streaming(self, logging_obj, timeout, data, model, api_key, api_base, api_version, organization, client, max_retries, headers, stream_options)
    875 logging_obj.pre_call(
    876     input=data["messages"],
    877     api_key=api_key,
   (...)
    883     },
    884 )
--> 885 headers, response = self.make_sync_openai_chat_completion_request(
    886     openai_client=openai_client,
    887     data=data,
    888     timeout=timeout,
    889     logging_obj=logging_obj,
    890 )
    892 logging_obj.model_call_details["response_headers"] = headers

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/litellm_core_utils/logging_utils.py:149, in track_llm_api_timing.<locals>.decorator.<locals>.sync_wrapper(*args, **kwargs)
    148 try:
--> 149     result = func(*args, **kwargs)
    150     return result

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:471, in OpenAIChatCompletion.make_sync_openai_chat_completion_request(self, openai_client, data, timeout, logging_obj)
    470 else:
--> 471     raise e

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:453, in OpenAIChatCompletion.make_sync_openai_chat_completion_request(self, openai_client, data, timeout, logging_obj)
    452 try:
--> 453     raw_response = openai_client.chat.completions.with_raw_response.create(
    454         **data, timeout=timeout
    455     )
    457     if hasattr(raw_response, "headers"):

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/_legacy_response.py:364, in to_raw_response_wrapper.<locals>.wrapped(*args, **kwargs)
    362 kwargs["extra_headers"] = extra_headers
--> 364 return cast(LegacyAPIResponse[R], func(*args, **kwargs))

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/_utils/_utils.py:279, in required_args.<locals>.inner.<locals>.wrapper(*args, **kwargs)
    278     raise TypeError(msg)
--> 279 return func(*args, **kwargs)

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/resources/chat/completions/completions.py:914, in Completions.create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, prediction, presence_penalty, reasoning_effort, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, web_search_options, extra_headers, extra_query, extra_body, timeout)
    913 validate_response_format(response_format)
--> 914 return self._post(
    915     "/chat/completions",
    916     body=maybe_transform(
    917         {
    918             "messages": messages,
    919             "model": model,
    920             "audio": audio,
    921             "frequency_penalty": frequency_penalty,
    922             "function_call": function_call,
    923             "functions": functions,
    924             "logit_bias": logit_bias,
    925             "logprobs": logprobs,
    926             "max_completion_tokens": max_completion_tokens,
    927             "max_tokens": max_tokens,
    928             "metadata": metadata,
    929             "modalities": modalities,
    930             "n": n,
    931             "parallel_tool_calls": parallel_tool_calls,
    932             "prediction": prediction,
    933             "presence_penalty": presence_penalty,
    934             "reasoning_effort": reasoning_effort,
    935             "response_format": response_format,
    936             "seed": seed,
    937             "service_tier": service_tier,
    938             "stop": stop,
    939             "store": store,
    940             "stream": stream,
    941             "stream_options": stream_options,
    942             "temperature": temperature,
    943             "tool_choice": tool_choice,
    944             "tools": tools,
    945             "top_logprobs": top_logprobs,
    946             "top_p": top_p,
    947             "user": user,
    948             "web_search_options": web_search_options,
    949         },
    950         completion_create_params.CompletionCreateParams,
    951     ),
    952     options=make_request_options(
    953         extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
    954     ),
    955     cast_to=ChatCompletion,
    956     stream=stream or False,
    957     stream_cls=Stream[ChatCompletionChunk],
    958 )

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/_base_client.py:1242, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
   1239 opts = FinalRequestOptions.construct(
   1240     method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1241 )
-> 1242 return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/_base_client.py:919, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
    917     retries_taken = 0
--> 919 return self._request(
    920     cast_to=cast_to,
    921     options=options,
    922     stream=stream,
    923     stream_cls=stream_cls,
    924     retries_taken=retries_taken,
    925 )

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/openai/_base_client.py:1023, in SyncAPIClient._request(self, cast_to, options, retries_taken, stream, stream_cls)
   1022     log.debug("Re-raising status error")
-> 1023     raise self._make_status_error_from_response(err.response) from None
   1025 return self._process_response(
   1026     cast_to=cast_to,
   1027     options=options,
   (...)
   1031     retries_taken=retries_taken,
   1032 )

BadRequestError: Error code: 400 - {'error': {'message': 'Your organization must be verified to stream this model. Please go to: https://platform.openai.com/settings/organization/general and click on Verify Organization.', 'type': 'invalid_request_error', 'param': 'stream', 'code': 'unsupported_value'}}

During handling of the above exception, another exception occurred:

OpenAIError                               Traceback (most recent call last)
File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/main.py:1765, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
   1759     logging.post_call(
   1760         input=messages,
   1761         api_key=api_key,
   1762         original_response=str(e),
   1763         additional_args={"headers": headers},
   1764     )
-> 1765     raise e
   1767 if optional_params.get("stream", False):
   1768     ## LOGGING

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/main.py:1738, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
   1737 try:
-> 1738     response = openai_chat_completions.completion(
   1739         model=model,
   1740         messages=messages,
   1741         headers=headers,
   1742         model_response=model_response,
   1743         print_verbose=print_verbose,
   1744         api_key=api_key,
   1745         api_base=api_base,
   1746         acompletion=acompletion,
   1747         logging_obj=logging,
   1748         optional_params=optional_params,
   1749         litellm_params=litellm_params,
   1750         logger_fn=logger_fn,
   1751         timeout=timeout,  # type: ignore
   1752         custom_prompt_dict=custom_prompt_dict,
   1753         client=client,  # pass AsyncOpenAI, OpenAI client
   1754         organization=organization,
   1755         custom_llm_provider=custom_llm_provider,
   1756     )
   1757 except Exception as e:
   1758     ## LOGGING - log the original exception returned

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/llms/openai/openai.py:735, in OpenAIChatCompletion.completion(self, model_response, timeout, optional_params, litellm_params, logging_obj, model, messages, print_verbose, api_key, api_base, api_version, dynamic_params, azure_ad_token, acompletion, logger_fn, headers, custom_prompt_dict, client, organization, custom_llm_provider, drop_params)
    734     error_headers = getattr(error_response, "headers", None)
--> 735 raise OpenAIError(
    736     status_code=status_code,
    737     message=error_text,
    738     headers=error_headers,
    739     body=error_body,
    740 )

OpenAIError: Error code: 400 - {'error': {'message': 'Your organization must be verified to stream this model. Please go to: https://platform.openai.com/settings/organization/general and click on Verify Organization.', 'type': 'invalid_request_error', 'param': 'stream', 'code': 'unsupported_value'}}

During handling of the above exception, another exception occurred:

BadRequestError                           Traceback (most recent call last)
Cell In[31], line 2
      1 from litellm import completion
----> 2 response = completion(
      3     model="o3",
      4     messages=[{"role":"user","content":"Hello"}],
      5     stream=True
      6 )
      7 for chunk in response:
      8     print(chunk)

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/utils.py:1247, in client.<locals>.wrapper(*args, **kwargs)
   1243 if logging_obj:
   1244     logging_obj.failure_handler(
   1245         e, traceback_exception, start_time, end_time
   1246     )  # DO NOT MAKE THREADED - router retry fallback relies on this!
-> 1247 raise e

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/utils.py:1125, in client.<locals>.wrapper(*args, **kwargs)
   1123         print_verbose(f"Error while checking max token limit: {str(e)}")
   1124 # MODEL CALL
-> 1125 result = original_function(*args, **kwargs)
   1126 end_time = datetime.datetime.now()
   1127 if "stream" in kwargs and kwargs["stream"] is True:

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/main.py:3150, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
   3147     return response
   3148 except Exception as e:
   3149     ## Map to OpenAI Exception
-> 3150     raise exception_type(
   3151         model=model,
   3152         custom_llm_provider=custom_llm_provider,
   3153         original_exception=e,
   3154         completion_kwargs=args,
   3155         extra_kwargs=kwargs,
   3156     )

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py:2214, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
   2212 if exception_mapping_worked:
   2213     setattr(e, "litellm_response_headers", litellm_response_headers)
-> 2214     raise e
   2215 else:
   2216     for error_type in litellm.LITELLM_EXCEPTION_TYPES:

File ~/opt/miniconda3/envs/dsta/lib/python3.10/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py:328, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
    323 elif (
    324     "invalid_request_error" in error_str
    325     and "Incorrect API key provided" not in error_str
    326 ):
    327     exception_mapping_worked = True
--> 328     raise BadRequestError(
    329         message=f"{exception_provider} - {message}",
    330         llm_provider=custom_llm_provider,
    331         model=model,
    332         response=getattr(original_exception, "response", None),
    333         litellm_debug_info=extra_information,
    334         body=getattr(original_exception, "body", None),
    335     )
    336 elif (
    337     "Web server is returning an unknown error" in error_str
    338     or "The server had an error processing your request." in error_str
    339 ):
    340     exception_mapping_worked = True

BadRequestError: litellm.BadRequestError: OpenAIException - Your organization must be verified to stream this model. Please go to: https://platform.openai.com/settings/organization/general and click on Verify Organization.

@krrishdholakia (Contributor)

Hey @miguelwon, that seems like an OpenAI error. litellm._turn_on_debug() should show you the raw request being made.
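
For anyone hitting the same thing, a minimal way to enable the debug helper mentioned above, reusing the streaming call from the earlier report:

import litellm
from litellm import completion

litellm._turn_on_debug()  # prints the raw request/response for troubleshooting

response = completion(
    model="o3",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    print(chunk)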

@miguelwon

Thanks @krrishdholakia. It is working now.
Yes, it was very likely on their side. I had just completed the organization verification, so there was probably some delay in the permissions update.

@valda-z commented Apr 17, 2025

Can you please fix this for Azure as well? The same logic exists in o_series_transformation.py for Azure as for OpenAI.

@krrishdholakia (Contributor)

Ack @valda-z Fix pushed on main

@valda-z commented Apr 18, 2025

Thanks! Any plans to publish a new version to pip?

@krrishdholakia (Contributor)

Yes - hopefully today

@valda-z commented Apr 18, 2025

Wow :) !!
