Confirm this is an issue with the Python library and not an underlying OpenAI API
- This is an issue with the Python library
Describe the bug
There might be a memory leak when using the method `.parse()` on `AsyncCompletions` with Pydantic models created with `create_model`. When submitting several calls, memory usage keeps rising. I haven't found any plateau yet, which could mean the parsers built on these models are not being garbage collected.
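For context on why this could accumulate: `create_model` returns a brand-new class object on every call, even when the name and fields are identical, so any internal cache keyed on the model type would grow without bound. A quick standalone illustration (the `make_model` helper here is just for this example):

```python
from pydantic import Field, create_model


def make_model():
    # Same name and fields every time, but a distinct class object each call
    return create_model("MathResponse", final_answer=(str, Field()))


a = make_model()
b = make_model()
print(a is b)                    # False: two distinct classes
print(a.__name__ == b.__name__)  # True: identical names
```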
To Reproduce
- Have a function that creates a Pydantic model with `create_model`
- Make several calls where the `response_format` param always gets a new model from the function above
- Monitor the memory

We do have a workaround, though. The leaking scenario will be called `leaking` and the safe one `non_leaking` in the snippets.
Please let me know if you need more info. Thanks a lot.
Code snippets
```python
import asyncio
import gc
from typing import List

from memory_profiler import profile
from openai import AsyncOpenAI
from openai.lib._parsing import type_to_response_format_param
from pydantic import Field, create_model

StepModel = create_model(
    "Step",
    explanation=(str, Field()),
    output=(str, Field()),
)


def create_new_model():
    """This sounds useless as written. In our business case, I generate a model
    that is slightly different on each call, hence the use of create_model.
    This illustrates how a model that seems to always be the same keeps
    adding up in memory."""
    return create_model(
        "MathResponse",
        steps=(List[StepModel], Field()),
        final_answer=(str, Field()),
    )


@profile()
async def leaking_call(client, new_model):
    await client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=new_model,
    )


async def non_leaking_call(client, new_model):
    await client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=type_to_response_format_param(new_model),
    )


async def main():
    client = AsyncOpenAI()
    for _ in range(200):
        # You can switch to `non_leaking_call` and see that the memory
        # is correctly emptied
        await leaking_call(client, create_new_model())
        # We wanted to thoroughly check the memory usage,
        # hence memory_profiler + gc
        gc.collect()
        print(len(gc.get_objects()))


if __name__ == "__main__":
    asyncio.run(main())
```
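As an aside, the growth can also be observed with the standard-library `tracemalloc` instead of `memory_profiler`. This is only a monitoring sketch, not the full repro: it needs no API key or network, and just builds models in a loop where the real repro would await the API call.

```python
import gc
import tracemalloc
from typing import List

from pydantic import Field, create_model

StepModel = create_model("Step", explanation=(str, Field()), output=(str, Field()))


def create_new_model():
    return create_model(
        "MathResponse",
        steps=(List[StepModel], Field()),
        final_answer=(str, Field()),
    )


tracemalloc.start()
before = tracemalloc.take_snapshot()

# In the real repro, you would `await leaking_call(client, create_new_model())`
# inside this loop instead of just building the model.
for _ in range(100):
    create_new_model()

gc.collect()
after = tracemalloc.take_snapshot()

# Top allocation sites that changed between the two snapshots
stats = after.compare_to(before, "lineno")
for stat in stats[:5]:
    print(stat)
tracemalloc.stop()
```

With the real `leaking_call` in the loop, the per-iteration diff should make the retained allocations visible line by line.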
OS
macOS
Python version
Python 3.11.9
Library version
openai v1.64.0
Activity
RobertCraigie commented on Feb 26, 2025
Thanks for the report, what version of Pydantic are you using?

anteverse commented on Feb 26, 2025
Pydantic 2.10.6. Also tested with 2.9.2 earlier.