Poor performance for large allocations in native-lib mode #4357

Description

nia-e (Contributor)

As of #4343, the benchmark results for huge_allocs are significantly worse (~20x) when the isolated allocator is enabled, i.e. in native-lib mode. This could be alleviated by using mmap internally instead, which did not show this slowdown, but that may require overallocating when the requested alignment is greater than the system page size.
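To make the overallocation concern concrete, here is a minimal sketch of the arithmetic involved, assuming the mapping returned by mmap is only guaranteed to be page-aligned. The function names (`round_up`, `map_len_for`, `trim_for`) are hypothetical and not taken from Miri; the sketch only computes sizes and offsets and does not call mmap itself.

```rust
/// Round `n` up to the next multiple of `to`, which must be a power of two.
fn round_up(n: usize, to: usize) -> usize {
    debug_assert!(to.is_power_of_two());
    (n + to - 1) & !(to - 1)
}

/// Number of bytes to map so that an `align`-aligned block of `size` bytes
/// is guaranteed to fit inside a mapping that is only page-aligned.
/// (Hypothetical helper, for illustration only.)
fn map_len_for(size: usize, align: usize, page_size: usize) -> usize {
    let size = round_up(size, page_size);
    if align <= page_size {
        // A page-aligned mapping already satisfies the alignment.
        size
    } else {
        // Worst case: the mapping starts one page past an `align` boundary,
        // so up to `align - page_size` leading bytes have to be skipped.
        size + (align - page_size)
    }
}

/// For a page-aligned mapping starting at `base`, return how many bytes
/// could be unmapped at the head and at the tail of the mapping.
fn trim_for(base: usize, size: usize, align: usize, page_size: usize) -> (usize, usize) {
    let map_len = map_len_for(size, align, page_size);
    let size = round_up(size, page_size);
    let aligned = round_up(base, align.max(page_size));
    let head = aligned - base;
    let tail = map_len - head - size;
    (head, tail)
}
```

With a 4096-byte page, a request aligned to 16 bytes needs no extra space, while a request aligned to 16384 bytes needs up to three extra pages, which can be unmapped again once the aligned start address is known.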

Todos / open questions:

  • Why is calling alloc::alloc() so much slower when asking for page-aligned memory?
      ◦ If it's fixable, bug the relevant people or open a PR to fix it.
      ◦ If it's not fixable, consider switching the isolated allocator over to mmapping its memory instead.

Activity

RalfJung (Member) commented on May 29, 2025

@bjorn3 @lqd @nnethercote do you have any idea why page-aligned multiple-of-page-size allocations are slowing down jemalloc so much?

Labels added on May 29, 2025: C-bug (Category: This is a bug.), I-slow (Impact: Makes Miri even slower than it already is), A-native (Area: calling native functions via FFI)

nnethercote (Contributor) commented on May 30, 2025

> @bjorn3 @lqd @nnethercote do you have any idea why page-aligned multiple-of-page-size allocations are slowing down jemalloc so much?

Nope, but I will try summoning @glandium, who has forgotten more about jemalloc than I will ever know, in case he feels like answering a random question... (Hi Mike!)

glandium commented on May 30, 2025

I'm not sure what might be going on here, especially given the scale of the mentioned difference. I'm also not that familiar with very recent versions of jemalloc. I would advise looking at profiles (and maybe also at the difference across platforms).

If I was to venture a guess, it could be the kernel zeroing fresh pages in the process of those allocations.

(Hey Nick!)

nia-e (Contributor, Author) commented on May 31, 2025

I doubt it's the kernel. mmapping fresh pages (which are definitely zeroed) was almost exactly tied with jemallocing 16-byte-aligned memory when both were requested as zeroed. The perf hit only appeared when the size was left unchanged but the alignment passed to jemalloc was raised to the system page size; even hardcoding 4096 in the align field caused the same perf hit. It also seemed to vary a lot between runs: I saw a ~8.5x slowdown on some and almost 30x on others, whereas both mmap and low-alignment jemalloc had very consistent times (+/- 5% or so on any given run).
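For anyone wanting to reproduce the comparison outside Miri, a rough sketch of the measurement could look like the following. It is not the benchmark used above: it goes through whatever global allocator the program is built with (the system allocator by default, not jemalloc unless one is configured), the helper names `time_allocs` and `compare` are made up for illustration, and the page size is passed in by the caller (4096 being the hardcoded value mentioned above).

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::time::{Duration, Instant};

/// Allocate and free `iters` zeroed blocks of `size` bytes with alignment
/// `align`, returning the total elapsed time. (Hypothetical helper.)
fn time_allocs(size: usize, align: usize, iters: usize) -> Duration {
    assert!(size > 0, "zero-sized layouts must not be passed to alloc_zeroed");
    let layout = Layout::from_size_align(size, align).expect("invalid layout");
    let start = Instant::now();
    for _ in 0..iters {
        // SAFETY: `layout` has a non-zero size, and every block is freed
        // with the same layout it was allocated with.
        unsafe {
            let ptr = alloc_zeroed(layout);
            assert!(!ptr.is_null(), "allocation failed");
            assert_eq!(ptr as usize % align, 0);
            // Write one byte so the allocation is observably used.
            std::ptr::write_volatile(ptr, 1);
            dealloc(ptr, layout);
        }
    }
    start.elapsed()
}

/// Time the same allocation size at 16-byte and at page alignment.
/// Returns (low-alignment time, page-alignment time).
fn compare(size: usize, page_size: usize, iters: usize) -> (Duration, Duration) {
    let low = time_allocs(size, 16, iters);
    let page = time_allocs(size, page_size, iters);
    (low, page)
}
```

Calling something like `compare(64 << 20, 4096, 100)` several times and comparing the spread of the two timings would show whether the run-to-run variance described above also appears with a given allocator; the numbers will of course depend on the allocator and platform.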

          Poor performance for large allocations in native-lib mode · Issue #4357 · rust-lang/miri