Skip to content

Avoid changing the PYMEM_DOMAIN_RAW allocator during initialization and finalization #111924

Closed
@colesbury

Description

@colesbury
Contributor

Bug report

CPython current temporarily changes PYMEM_DOMAIN_RAW to the default allocator during initialization and shutdown. The motivation is to ensure that core runtime structures are allocated and freed using the same allocator. However, modifying the current allocator changes global state and is not thread-safe even with the GIL. Other threads may be allocating or freeing objects use PYMEM_DOMAIN_RAW; they are not required to hold the GIL to call PyMem_RawMalloc/PyMem_RawFree.

We can avoid changing global state while still ensuring that we use a consistent allocator during initialization and shutdown.

PyThread_type_lock

Many of the runtime structures are PyThread_type_lock objects. We can avoid allocation/freeing entirely for these locks by using PyMutex or PyRawMutex instead.

cpython/Python/pystate.c

Lines 396 to 418 in b9f814c

static int
alloc_for_runtime(PyThread_type_lock locks[NUMLOCKS])
{
/* Force default allocator, since _PyRuntimeState_Fini() must
use the same allocator than this function. */
PyMemAllocatorEx old_alloc;
_PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
for (int i = 0; i < NUMLOCKS; i++) {
PyThread_type_lock lock = PyThread_allocate_lock();
if (lock == NULL) {
for (int j = 0; j < i; j++) {
PyThread_free_lock(locks[j]);
locks[j] = NULL;
}
break;
}
locks[i] = lock;
}
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
return 0;
}

Calls to PyMem_RawMalloc, PyMem_RawCalloc, PyMem_RawFree, etc.

For the other calls to PyMem_RawMalloc, etc. where we know we want to use the default allocator, we should directly call a new internal-only function that always uses the default allocator. This will avoid unnecessarily modifying global state.

For example, we can add a new function _PyMem_DefaultRawMalloc that behaves like PyMem_RawMalloc, except that it is not modifiable by _PyMem_SetDefaultAllocator.

For an example implementation in the nogil-3.12 fork, see colesbury/nogil-3.12@d13c63dee9.

Linked PRs

Activity

colesbury

colesbury commented on Nov 10, 2023

@colesbury
ContributorAuthor

cc @vstinner and @ericsnowcurrently in case you are interested in this.

ericsnowcurrently

ericsnowcurrently commented on Nov 10, 2023

@ericsnowcurrently
Member

I'm a little unclear on this. Nearly all of the C-API is unavailable until global runtime initialization finishes and reverts to that state almost as soon as finalization begins. At both of those points only the main thread should be running. 12 Consequently, switching the raw allocator during init/fini shouldn't cause any problems.

Thus I'm not sure where the problem is.

Also, just to be clear, we do the raw allocator dance during init/fini for more than just locks. 3 All of that happens during global init/fini so the same single-threaded situation applies.

Of course, the actual operation of changing the allocator is thread-safe, protected by one of the global locks. Hence, what we do in Python/tracemalloc.c (e.g. _PyTraceMalloc_Start()) is fine.

Footnotes

  1. Daemon threads add a wrinkle to this during finalization, but we handle them in a way that shouldn't be affected by the raw allocator.

  2. Even in embedded scenarios, where there might be other non-Python threads running already, only the "main" thread (where Py_Initialize() was called) should be able to make use of the runtime until initialization finishes or once finalization starts.

  3. I did a quick grep for "_PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW". There's the "inittab" and builtin modules table stuff in Python/import.c. Then there's part of the config data in Python/initconfig.c and the path data in Python/pathconfig.c. We also do the raw allocator dance for PySys_AddWarnOption() (the "preinit_entry" stuff), but that function was removed from the API and only exists for stable ABI compatibility.

ericsnowcurrently

ericsnowcurrently commented on Nov 10, 2023

@ericsnowcurrently
Member

FWIW, I'm on board with switching the various global locks to PyMutex or PyRawMutex, regardless of the raw allocator concerns.

colesbury

colesbury commented on Nov 10, 2023

@colesbury
ContributorAuthor

Nearly all of the C-API is unavailable until global runtime initialization finishes and reverts to that state almost as soon as finalization begins.

"Nearly all of the C-API" here notably does not include PyMem_RawMalloc/Free. They are listed in https://docs.python.org/3/c-api/init.html#before-python-initialization.

At both of those points only the main thread should be running.

This is not the case during finalization. As you mention, daemon threads may still be running. So can any thread created by C (i.e., outside the threading module). We don't even have a hard guarantee that non-daemon threading module threads are finished: if you press ctrl-c during shutdown, we stop waiting on non-daemon threads and continue with finalization.

Any of these threads and (random C) code may call PyMem_RawMalloc and PyMem_RawFree because these APIs do not require holding the GIL.

Also, just to be clear, we do the raw allocator dance during init/fini for more than just locks.

Yeah, those are what I was referring to in the "Calls to PyMem_RawMalloc, PyMem_RawCalloc, PyMem_RawFree, etc." section.

colesbury

colesbury commented on Nov 10, 2023

@colesbury
ContributorAuthor

FWIW, I'm on board with switching the various global locks to PyMutex or PyRawMutex, regardless of the raw allocator concerns.

Great! I'll probably tackle that part first, both because we seem to be on the same page there and it'll keep the PRs smaller.

added a commit that references this issue on Nov 16, 2023

pythongh-111924: Use PyMutex for runtime global locks.

added a commit that references this issue on Dec 7, 2023

gh-111924: Use PyMutex for Runtime-global Locks. (gh-112207)

cf6110b
added a commit that references this issue on Feb 11, 2024

pythongh-111924: Use PyMutex for Runtime-global Locks. (pythongh-112207)

added a commit that references this issue on Sep 2, 2024

pythongh-111924: Use PyMutex for Runtime-global Locks. (pythongh-112207)

7 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @colesbury@ericsnowcurrently

        Issue actions

          Avoid changing the PYMEM_DOMAIN_RAW allocator during initialization and finalization · Issue #111924 · python/cpython