Crash in Py_Initialize in non-main thread in free-threading build #123022

Closed
@colesbury

Description

@colesbury
Contributor

Py_Initialize will crash in the free-threaded build if it's not called from the main thread. Additionally, background threads may crash if the program allocates a lot of memory in total (>=30 TiB, not necessarily at one time).

The problem is related to our use of mimalloc in the free-threaded build. We don't set mimalloc's "default heap" for threads, but a few code paths assume that there is a default heap.

Originally posted by @ngoldbaum in #122918 (comment)

I'm seeing similar crashes in my PyO3 dev environment on macOS. Interestingly, it doesn't happen if I use a debug Python 3.13 free-threaded build.

May be completely unrelated to the Windows-specific issues described above.

Here's the traceback from the crash:

running 681 tests
test buffer::tests::test_compatible_size ... ok
test buffer::tests::test_element_type_from_format ... ok
test conversions::std::array::tests::array_try_from_fn ... ok
Process 18512 stopped
* thread #3, name = 'buffer::tests::test_bytes_buffer', stop reason = EXC_BAD_ACCESS (code=2, address=0x1015edb08)
    frame #0: 0x000000010132d048 libpython3.13t.dylib`chacha_block(ctx=0x00000001015edac8) at random.c:67:20 [opt]
   64
   65  	  // add scrambled data to the initial state
   66  	  for (size_t i = 0; i < 16; i++) {
-> 67  	    ctx->output[i] = x[i] + ctx->input[i];
   68  	  }
   69  	  ctx->output_available = 16;
   70
Target 0: (pyo3-71e3a8c0dab6095a) stopped.
warning: libpython3.13t.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
(lldb) bt
* thread #3, name = 'buffer::tests::test_bytes_buffer', stop reason = EXC_BAD_ACCESS (code=2, address=0x1015edb08)
  * frame #0: 0x000000010132d048 libpython3.13t.dylib`chacha_block(ctx=0x00000001015edac8) at random.c:67:20 [opt]
    frame #1: 0x00000001013218d0 libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] chacha_next32(ctx=0x00000001015edac8) at random.c:83:5 [opt]
    frame #2: 0x00000001013218bc libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] _mi_random_next(ctx=0x00000001015edac8) at random.c:149:25 [opt]
    frame #3: 0x00000001013218bc libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] _mi_heap_random_next(heap=0x00000001015ecf80) at heap.c:258:10 [opt]
    frame #4: 0x00000001013218b8 libpython3.13t.dylib`_mi_os_get_aligned_hint(try_alignment=33554432, size=1073741824) at os.c:118:19 [opt]
    frame #5: 0x000000010132ee50 libpython3.13t.dylib`unix_mmap_prim(addr=0x0000000000000000, size=1073741824, try_alignment=33554432, protect_flags=3, flags=4162, fd=1677721600) at prim.c:190:18 [opt]
    frame #6: 0x0000000101327f68 libpython3.13t.dylib`_mi_prim_alloc [inlined] unix_mmap(addr=0x0000000000000000, size=<unavailable>, try_alignment=<unavailable>, protect_flags=3, large_only=false, allow_large=<unavailable>, is_large=<unavailable>) at prim.c:297:9 [opt]
    frame #7: 0x0000000101327ed4 libpython3.13t.dylib`_mi_prim_alloc(size=1073741824, try_alignment=33554432, commit=<unavailable>, allow_large=<unavailable>, is_large=0x0000000170211bdf, is_zero=<unavailable>, addr=0x0000000170211b78) at prim.c:334:11 [opt]
    frame #8: 0x0000000101321c18 libpython3.13t.dylib`mi_os_prim_alloc(size=1073741824, try_alignment=33554432, commit=true, allow_large=<unavailable>, is_large=<unavailable>, is_zero=<unavailable>, stats=<unavailable>) at os.c:201:13 [opt]
    frame #9: 0x000000010131a648 libpython3.13t.dylib`_mi_os_alloc_aligned [inlined] mi_os_prim_alloc_aligned(size=1073741824, alignment=33554432, commit=true, allow_large=<unavailable>, is_large=0x0000000170211bdf, is_zero=0x0000000170211bde, base=<unavailable>, stats=<unavailable>) at os.c:234:13 [opt]
    frame #10: 0x000000010131a60c libpython3.13t.dylib`_mi_os_alloc_aligned(size=<unavailable>, alignment=33554432, commit=true, allow_large=<unavailable>, memid=0x0000000170211c68, tld_stats=<unavailable>) at os.c:320:13 [opt]
    frame #11: 0x000000010131bb84 libpython3.13t.dylib`mi_reserve_os_memory_ex(size=1073741824, commit=true, allow_large=<unavailable>, exclusive=false, arena_id=0x0000000170211ccc) at arena.c:813:17 [opt]
    frame #12: 0x0000000101319fd0 libpython3.13t.dylib`_mi_arena_alloc_aligned [inlined] mi_arena_reserve(req_size=33554432, allow_large=true, req_arena_id=0, arena_id=0x0000000170211ccc) at arena.c:362:11 [opt]
    frame #13: 0x0000000101319f6c libpython3.13t.dylib`_mi_arena_alloc_aligned(size=33554432, alignment=33554432, align_offset=0, commit=true, allow_large=true, req_arena_id=0, memid=0x0000000170211da8, tld=0x00000001016cc828) at arena.c:383:11 [opt]
    frame #14: 0x000000010132dde8 libpython3.13t.dylib`mi_segment_alloc [inlined] mi_segment_os_alloc(required=0, page_alignment=<unavailable>, eager_delayed=<unavailable>, req_arena_id=0, psegment_slices=<unavailable>, ppre_size=<unavailable>, pinfo_slices=<unavailable>, commit=<unavailable>, tld=0x00000001016cc490, os_tld=<unavailable>) at segment.c:823:42 [opt]
    frame #15: 0x000000010132ddd0 libpython3.13t.dylib`mi_segment_alloc(required=0, page_alignment=<unavailable>, req_arena_id=0, tld=<unavailable>, os_tld=<unavailable>, huge_page=0x0000000000000000) at segment.c:882:27 [opt]
    frame #16: 0x0000000101326a38 libpython3.13t.dylib`mi_segments_page_alloc [inlined] mi_segment_reclaim_or_alloc(heap=0x00000001016cac80, needed_slices=1, block_size=64, tld=0x00000001016cc490, os_tld=0x00000001016cc828) at segment.c:1489:10 [opt]
    frame #17: 0x0000000101326788 libpython3.13t.dylib`mi_segments_page_alloc(heap=0x00000001016cac80, page_kind=<unavailable>, required=64, block_size=64, tld=0x00000001016cc490, os_tld=0x00000001016cc828) at segment.c:1508:9 [opt]
    frame #18: 0x000000010132cc7c libpython3.13t.dylib`mi_page_fresh_alloc(heap=0x00000001016cac80, pq=0x00000001016cb150, block_size=64, page_alignment=<unavailable>) at page.c:284:21 [opt]
    frame #19: 0x0000000101324044 libpython3.13t.dylib`mi_find_page [inlined] mi_page_fresh(heap=0x00000001016cac80, pq=0x00000001016cb150) at page.c:305:21 [opt]
    frame #20: 0x0000000101324030 libpython3.13t.dylib`mi_find_page [inlined] mi_page_queue_find_free_ex(heap=0x00000001016cac80, pq=0x00000001016cb150, first_try=true) at page.c:782:12 [opt]
    frame #21: 0x0000000101323ff0 libpython3.13t.dylib`mi_find_page [inlined] mi_find_free_page(heap=0x00000001016cac80, size=<unavailable>) at page.c:821:10 [opt]
    frame #22: 0x0000000101323e6c libpython3.13t.dylib`mi_find_page(heap=0x00000001016cac80, size=64, huge_alignment=0) at page.c:920:12 [opt]
    frame #23: 0x0000000101316158 libpython3.13t.dylib`_mi_malloc_generic(heap=0x00000001016cac80, size=<unavailable>, zero=<unavailable>, huge_alignment=0) at page.c:946:21 [opt]
    frame #24: 0x0000000101422014 libpython3.13t.dylib`gc_alloc [inlined] _PyObject_MallocWithType(tp=<unavailable>, size=<unavailable>) at pycore_object_alloc.h:46:17 [opt]
    frame #25: 0x0000000101421ff4 libpython3.13t.dylib`gc_alloc(tp=<unavailable>, basicsize=<unavailable>, presize=0) at gc_free_threading.c:1695:17 [opt]
    frame #26: 0x0000000101421ea4 libpython3.13t.dylib`_PyObject_GC_New(tp=0x0000000101657268) at gc_free_threading.c:1716:20 [opt]
    frame #27: 0x00000001012f266c libpython3.13t.dylib`PyDict_New [inlined] new_dict(interp=0x0000000101698e80, keys=<unavailable>, values=0x0000000000000000, used=0, free_values_on_failure=0) at dictobject.c:929:14 [opt]
    frame #28: 0x00000001012f263c libpython3.13t.dylib`PyDict_New at dictobject.c:1026:12 [opt]
    frame #29: 0x0000000101386050 libpython3.13t.dylib`_PyUnicode_InitGlobalObjects [inlined] init_interned_dict(interp=0x0000000101698e80) at unicodeobject.c:284:37 [opt]
    frame #30: 0x000000010138604c libpython3.13t.dylib`_PyUnicode_InitGlobalObjects(interp=0x0000000101698e80) at unicodeobject.c:15016:9 [opt]
    frame #31: 0x00000001014521f0 libpython3.13t.dylib`pycore_interp_init [inlined] pycore_init_global_objects(interp=0x0000000101698e80) at pylifecycle.c:702:14 [opt]
    frame #32: 0x00000001014521dc libpython3.13t.dylib`pycore_interp_init(tstate=0x00000001016c9330) at pylifecycle.c:852:14 [opt]
    frame #33: 0x000000010144f89c libpython3.13t.dylib`Py_InitializeFromConfig [inlined] pyinit_config(runtime=<unavailable>, tstate_p=<unavailable>, config=0x0000000170212130) at pylifecycle.c:933:14 [opt]
    frame #34: 0x000000010144f784 libpython3.13t.dylib`Py_InitializeFromConfig [inlined] pyinit_core(runtime=<unavailable>, src_config=<unavailable>, tstate_p=<unavailable>) at pylifecycle.c:1096:18 [opt]
    frame #35: 0x000000010144f784 libpython3.13t.dylib`Py_InitializeFromConfig(config=0x00000001702123a0) at pylifecycle.c:1398:14 [opt]
    frame #36: 0x000000010144f9b0 libpython3.13t.dylib`Py_InitializeEx(install_sigs=<unavailable>) at pylifecycle.c:1436:14 [opt]
    frame #37: 0x0000000100105b2c pyo3-71e3a8c0dab6095a`pyo3::gil::prepare_freethreaded_python::_$u7b$$u7b$closure$u7d$$u7d$::h2cd40f096e08bff0((null)={closure_env#0} @ 0x00000001702125c7, (null)=0x0000000170212640) at gil.rs:69:13
    frame #38: 0x00000001000654fc pyo3-71e3a8c0dab6095a`std::sync::once::Once::call_once_force::_$u7b$$u7b$closure$u7d$$u7d$::hdab74cb30759684d(p=0x0000000170212640) at once.rs:208:40
    frame #39: 0x0000000100314024 pyo3-71e3a8c0dab6095a`std::sys::sync::once::queue::Once::call::h9dcea807fd617ccf at queue.rs:183:21 [opt]
    frame #40: 0x00000001000652f0 pyo3-71e3a8c0dab6095a`std::sync::once::Once::call_once_force::h6081e1899cf15adf(self=0x00000001004d8ea0, f={closure_env#0} @ 0x000000017021271f) at once.rs:208:9
    frame #41: 0x0000000100000b0c pyo3-71e3a8c0dab6095a`pyo3::gil::prepare_freethreaded_python::hb6ac25f766ef5eb8 at gil.rs:66:5
    frame #42: 0x0000000100000b7c pyo3-71e3a8c0dab6095a`pyo3::gil::GILGuard::acquire::hcdf69afc9ea1299e at gil.rs:174:21
    frame #43: 0x00000001001a1708 pyo3-71e3a8c0dab6095a`pyo3::marker::Python::with_gil::h089ec9aca00e522e(f={closure_env#0} @ 0x000000017021279f) at marker.rs:403:21
    frame #44: 0x0000000100216c2c pyo3-71e3a8c0dab6095a`pyo3::buffer::tests::test_bytes_buffer::h458708c8fd91dd1e at buffer.rs:849:9
    frame #45: 0x0000000100035e40 pyo3-71e3a8c0dab6095a`pyo3::buffer::tests::test_bytes_buffer::_$u7b$$u7b$closure$u7d$$u7d$::h8009c230847e71d5((null)=0x00000001702127fe) at buffer.rs:848:27

Both the debug Python build and the optimized, crashing Python build are built from source using pyenv.

Activity

colesbury (Contributor, Author) commented on Aug 14, 2024

The part about this not happening in a debug build makes sense. The crashing code path is disabled in debug builds (where MI_DEBUG is nonzero):

#if (MI_SECURE>0 || MI_DEBUG==0) // security: randomize start of aligned allocations unless in debug mode
uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap());
init = init + ((MI_SEGMENT_SIZE * ((r>>17) & 0xFFFFF)) % MI_HINT_AREA); // (randomly 20 bits)*4MiB == 0 to 4TiB
#endif
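To make the arithmetic in that snippet concrete, here is a small standalone sketch of the offset computation. The helper name `random_hint_offset` and the `HINT_AREA` constant are illustrative assumptions, not mimalloc's API. The source comment speaks of 4 MiB segments, while `try_alignment=33554432` in the traceback suggests 32 MiB segments in this build, so the segment size is a parameter here.

```c
#include <assert.h>
#include <stdint.h>

#define MIB ((uint64_t)1 << 20)
#define TIB ((uint64_t)1 << 40)

/* Assumed value, taken from the "0 to 4TiB" comment in the snippet above. */
#define HINT_AREA (4 * TIB)

/* Hypothetical helper mirroring the expression
 *   (MI_SEGMENT_SIZE * ((r>>17) & 0xFFFFF)) % MI_HINT_AREA
 * It picks 20 bits out of the random value r, scales them by the segment
 * size, and folds the result into the range [0, HINT_AREA). */
static uint64_t random_hint_offset(uint64_t r, uint64_t segment_size)
{
    /* The model assumes the segment size divides the hint area evenly. */
    assert(segment_size != 0 && HINT_AREA % segment_size == 0);

    uint64_t bits = (r >> 17) & 0xFFFFF;      /* 20 random bits */
    return (segment_size * bits) % HINT_AREA; /* segment-aligned, below 4 TiB */
}
```

With 4 MiB segments the 20 bits cover 0 to just under 4 TiB directly. With 32 MiB segments the product can exceed 4 TiB and the modulo folds it back into the same range. Either way, computing the offset needs a random value `r`, which is why this path reads from the default heap.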

I think this might have to do with re-initializing Python (i.e., Py_Initialize()) multiple times in the same process.

ngoldbaum (Contributor) commented on Aug 14, 2024

This is the relevant code in PyO3 that calls the C API:

https://github.com/PyO3/pyo3/blob/ffeb901db402189f16a8145c94cf024bed773cf6/src/gil.rs#L62-L75

When I run the test binary in a debugger, there's only one call to Py_Initialize before it crashes.

ZeroIntensity (Member) commented on Aug 15, 2024

Does this affect all calls to Py_Initialize, or just those from PyO3? If it's the former, I would consider marking this as a release blocker; it would be good to get this fixed by the next 3.13 RC. Otherwise, we're keeping embedding projects from early testing.

ngoldbaum commented on Aug 15, 2024

lysnikolaou (Member) commented on Aug 15, 2024

But that shouldn't be possible if this is the GIL-disabled build, right, since it's outside of the Py_GIL_DISABLED block?

Line 46 above is not inside an #ifdef Py_GIL_DISABLED. There's an #endif right above, no #else.

colesbury (Contributor, Author) commented on Aug 15, 2024

The crash occurs when Python is initialized for the first time outside of the main thread. Here's a reproducer:

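#include <Python.h>
#include <pthread.h>

void *thread_main(void *unused)
{
    (void)unused;
    Py_InitializeEx(0); // crashes!
    return NULL;        // not reached when the crash occurs
}

int main(void)
{
    pthread_t thread;
    // The first initialization of Python happens on a non-main thread.
    pthread_create(&thread, NULL, thread_main, NULL);
    pthread_join(thread, NULL);
    return 0;
}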

The problem is that mimalloc is calling _mi_heap_random_next(mi_prim_get_default_heap()), but the default heap isn't initialized to a "real" heap (it's still mimalloc's "empty" heap).

This doesn't happen in the main thread because the main thread's default heap is initialized when the process starts.

I think the crash can also occur if the program allocates a lot of memory in total, because the condition is hint == 0 (uninitialized) or hint > MI_HINT_MAX (>30TiB).

uintptr_t hint = mi_atomic_add_acq_rel(&aligned_base, size);
if (hint == 0 || hint > MI_HINT_MAX) { // wrap or initialize
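A simplified model may help show why the total amount reserved matters rather than the amount held at any one time. The sketch below is not mimalloc code: `hint_model`, `hint_model_reserve`, `calls_until_wrap`, and the 2 TiB starting address are assumptions for illustration, and the random offset is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TIB ((uint64_t)1 << 40)
#define HINT_BASE (2 * TIB)   /* assumed starting address for hints */
#define HINT_MAX  (30 * TIB)  /* the ">30TiB" bound mentioned above */

/* Hypothetical model of the shared, ever-growing address hint. */
typedef struct {
    _Atomic uint64_t aligned_base;  /* 0 means "not initialized yet" */
} hint_model;

/* Reserve `size` bytes of hint space. Returns true when the call takes the
 * "wrap or initialize" branch, i.e. the branch that needs a random number
 * from the default heap. */
static bool hint_model_reserve(hint_model *m, uint64_t size)
{
    assert(size > 0);
    uint64_t hint = atomic_fetch_add(&m->aligned_base, size);
    if (hint == 0 || hint > HINT_MAX) {  /* wrap or initialize */
        atomic_store(&m->aligned_base, HINT_BASE + size);
        return true;
    }
    return false;
}

/* Number of reservations of `size` bytes, counted from the very first one,
 * up to and including the one that takes the branch a second time. */
static uint64_t calls_until_wrap(uint64_t size)
{
    hint_model m = { 0 };
    uint64_t calls = 1;
    (void)hint_model_reserve(&m, size);  /* call 1: initialization */
    for (;;) {
        calls++;
        if (hint_model_reserve(&m, size)) {
            return calls;
        }
    }
}
```

The hint only ever grows, so freeing memory does not move it back. In this model, with the 1 GiB reservations seen in the traceback (`size=1073741824`), `calls_until_wrap((uint64_t)1 << 30)` returns 28674: the branch is taken again once the running total passes 30 TiB, on whichever thread happens to make that reservation.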

We can fix this by calling mi_thread_init() or ensuring that the default heap is set.
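As a rough illustration of the mechanism and of the "ensure the default heap is set" idea, here is a standalone model that does not use mimalloc at all. Every name in it (`model_heap`, `EMPTY_HEAP`, `model_thread_init`, `model_random_next`, `draw_on_new_thread`) is hypothetical. Each thread's default-heap pointer starts at a shared const placeholder, and the model returns `false` at the point where the real code crashes. This assumes the placeholder is not writable, which is consistent with the EXC_BAD_ACCESS in the traceback but is not confirmed here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an allocator heap; not mimalloc's mi_heap_t. */
typedef struct model_heap {
    uint32_t random_state;  /* mutated on every draw */
} model_heap;

/* Shared placeholder that every new thread starts with. It is const, so
 * writing through it would be undefined behaviour. */
static const model_heap EMPTY_HEAP = { 0 };

/* Per-thread "default heap" pointer, initially the placeholder. */
static _Thread_local const model_heap *default_heap = &EMPTY_HEAP;
static _Thread_local model_heap real_heap;

/* Analogue of the proposed fix: give the calling thread a real heap. */
static void model_thread_init(void)
{
    if (default_heap == &EMPTY_HEAP) {
        real_heap.random_state = 0x9E3779B9u;  /* arbitrary seed */
        default_heap = &real_heap;
    }
}

/* Draw a number from the calling thread's default heap. Returns false
 * instead of writing to the read-only placeholder. */
static bool model_random_next(uint32_t *out)
{
    assert(out != NULL);
    if (default_heap == &EMPTY_HEAP) {
        return false;  /* the real code path has no such guard */
    }
    /* A non-placeholder default heap is always this thread's real_heap. */
    real_heap.random_state = real_heap.random_state * 1664525u + 1013904223u;
    *out = real_heap.random_state;
    return true;
}

static void *worker(void *arg)
{
    bool call_init_first = *(bool *)arg;
    uint32_t r = 0;
    if (call_init_first) {
        model_thread_init();
    }
    return model_random_next(&r) ? arg : NULL;
}

/* Run one draw on a fresh thread and report whether it succeeded. */
static bool draw_on_new_thread(bool call_init_first)
{
    pthread_t thread;
    void *result = NULL;
    if (pthread_create(&thread, NULL, worker, &call_init_first) != 0) {
        return false;
    }
    pthread_join(thread, &result);
    return result != NULL;
}
```

In this model `draw_on_new_thread(false)` fails and `draw_on_new_thread(true)` succeeds, which mirrors the difference between a background thread that still has the placeholder heap and one that has been initialized first.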

changed the title from "Crash in `Py_Initialize` on macOS with PyO3 in free-threading build" to "Crash in `Py_Initialize` in non-main thread in free-threading build" on Aug 15, 2024
added a commit that references this issue on Aug 15, 2024

pythongh-123022: Fix crash with `Py_Initialize` in background thread

added a commit that references this issue on Aug 17, 2024

gh-123022: Fix crash with `Py_Initialize` in background thread (#123052)

d061ffe
added a commit that references this issue on Aug 17, 2024

pythongh-123022: Fix crash with `Py_Initialize` in background thread (p…
