
What about: Virtual memory effects #28

Open
@RalfJung

Description

@RalfJung
Member

During the pointer docs update, a discussion spawned off about effects of virtual memory: What do we still guarantee when different virtual addresses map to the same physical address? Some of libstd, e.g. copy_nonoverlapping, will start misbehaving in that situation. FWIW, C seems to just not care: memmove has the same problem.
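To make the failure mode concrete, here is a minimal sketch of an address-based disjointness check, which is the kind of reasoning a "these ranges do not overlap" precondition rests on. The helper name `ranges_overlap` and the example addresses are made up for illustration; this is not how libstd implements copy_nonoverlapping.

```rust
/// Hypothetical helper (not part of libstd): reports whether the byte ranges
/// `[a, a + len)` and `[b, b + len)` overlap, judged purely by address.
fn ranges_overlap(a: usize, b: usize, len: usize) -> bool {
    // Order the two start addresses so the subtraction cannot underflow.
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    // Two equal-length ranges overlap exactly when their start addresses
    // are closer together than `len`.
    hi - lo < len
}
```

For two page-sized ranges starting at the assumed virtual addresses 0x7000_0000 and 0x7000_1000, this check reports no overlap. If both virtual pages are backed by the same physical page, a copy between them still reads and writes the same bytes, which is exactly the situation the question is about.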

Activity

gnzlbg

gnzlbg commented on Sep 18, 2018

@gnzlbg
Contributor

Is it undefined behavior to have two pointers with different values referring to the same object?

TL;DR:

> Every byte has a unique address. [...] Undefined behavior does not even enter the equation as you simply cannot have two pointers to the same object that have different values.

and

> The C++ standard only concerns about one way to view the memory. If the system uses virtual memory, then the standard is only concerned about virtual memory. You can never have two pointers with different value referring to the same object.

The recommendation is, in essence, that any code doing something like this should use volatile loads and stores.

RalfJung

RalfJung commented on Sep 18, 2018

@RalfJung
Member, Author

> Undefined behavior does not even enter the equation as you simply cannot have two pointers to the same object that have different values.

Oh but you can (but for unrelated reasons.^^)

RalfJung

RalfJung commented on Sep 18, 2018

@RalfJung
Member, Author

More on topic, AFAIK LLVM does not use aliasing information to inform ptr equality tests, and vice versa. So with a refined view of aliasing, I see no reason why LLVM's memory model would not work with overlapping virtual memory.

gnzlbg

gnzlbg commented on Sep 18, 2018

@gnzlbg
Contributor

> Oh but you can (but for unrelated reasons.^^)

Could you point me to the part of the blog post that allows you to construct two pointers with different addresses that refer to the same object in C or C++?

The only relevant example I find there is the one with out-of-bounds pointers, but that's the opposite case: two pointers that point to the same address but refer to two different objects. Technically, one refers to an object, while the other does not and is therefore illegal to dereference.

RalfJung

RalfJung commented on Sep 18, 2018

@RalfJung
Member, Author

You said "different values". The value of a pointer includes its provenance information. But I am just being picky here, that's off-topic -- sorry.

alercah

alercah commented on Sep 18, 2018

@alercah

This should definitely be a topic of discussion, especially because of the possibility that a filesystem process ends up mmapping its own data. Thar be dragons here, but hopefully we can come up with sensible guarantees at least.

gnzlbg

gnzlbg commented on Sep 18, 2018

@gnzlbg
Contributor

> You said "different values". The value of a pointer includes its provenance information. But I am just being picky here, that's off-topic -- sorry.

@RalfJung No offense taken, that's an important difference! With mmap one can create two pointers with different values (different bit pattern and different provenance) that on purpose refer to the same memory. In the out-of-bounds example, obtaining two pointers with the same bit pattern and different provenance (e.g. obtained from two different malloc calls) feels incidental to me.

EDIT: What I mean here by "on purpose" and "incidental" is that in the mmap case, it is the user's intent to be able to manipulate the same object/memory/... via two pointers with different values (different bit pattern and different provenance). In the out-of-bounds case, that's less clear to me. We don't necessarily have to treat both cases equally.
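The "same bit pattern, different provenance" situation can be sketched in safe Rust without relying on where two separate allocations happen to land. This is only an analogy to the two-malloc example: it uses two halves of one buffer, the function name `boundary_addresses` is made up, and the claim about provenance assumes the aliasing models currently being explored for Rust.

```rust
/// Hypothetical helper: returns the address one past the end of the left
/// half of `buf` and the address of the start of the right half.
fn boundary_addresses(buf: &[u8; 8]) -> (usize, usize) {
    let (left, right) = buf.split_at(4);
    // One-past-the-end pointer, derived from `left`.
    let end_of_left = left.as_ptr_range().end;
    // Pointer to the first byte, derived from `right`.
    let start_of_right = right.as_ptr();
    // Casting to usize keeps only the address and drops the provenance.
    (end_of_left as usize, start_of_right as usize)
}
```

The two addresses compare equal, yet only the pointer derived from `right` may be used to read the byte at that address; under the assumed models, the pointer derived from `left` is out of bounds for its own provenance.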

Labels added on Aug 14, 2019: C-open-question (Category: An open question that we should revisit), A-memory (Topic: Related to memory accesses)

JakobDegen

JakobDegen commented on May 29, 2023

@JakobDegen
Contributor

It seems unlikely to me that we'll want to adjust the Abstract Machine (AM) memory model specifically to support this, which leaves us very little wiggle room here. Option 1 for users is to use virtual memory shenanigans in a way that makes sense to the AM. This probably means not double-mapping pages. Option 2 is to use volatile operations, which are the standard tool for "I have some concept of memory that the AM doesn't understand."

Is there anything I'm missing here?
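As a rough illustration of Option 2, a byte-wise copy built only from volatile accesses might look like the sketch below. The name `volatile_copy` is hypothetical, and the sketch makes no claim about whether this is sufficient for any particular double-mapping setup.

```rust
use std::ptr;

/// Hypothetical helper: copies `len` bytes from `src` to `dst` one byte at
/// a time, using only volatile reads and writes.
///
/// # Safety
/// `src` must be valid for reads of `len` bytes and `dst` must be valid for
/// writes of `len` bytes.
unsafe fn volatile_copy(src: *const u8, dst: *mut u8, len: usize) {
    for i in 0..len {
        unsafe {
            // Volatile accesses are not elided by the compiler and are not
            // reordered relative to other volatile accesses.
            let byte = ptr::read_volatile(src.add(i));
            ptr::write_volatile(dst.add(i), byte);
        }
    }
}
```

Because every byte is read before the corresponding byte is written, the function makes no non-overlap assumption of its own; the price is that the compiler cannot turn the loop into a bulk copy.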

Label added on May 29, 2023: S-pending-design (Status: Resolving this issue requires addressing some open design questions); label removed: C-open-question (Category: An open question that we should revisit)

comex

comex commented on May 30, 2023

@comex

A one-mapping-per-process limit would be bad for composability. It would mean that libraries essentially could not make non-volatile accesses to writable mappings at all – even if they avoided data races using some external synchronization mechanism or using atomics – since they wouldn't know if some other library in the same process was using the same file.

(That said, for the case of atomics, aren't you technically supposed to use volatile atomics for cross-process synchronization anyway? But the standard library has no support for those.)

Is there any actual problem here other than copy_nonoverlapping? The Python example doesn't count; it's not specific to having multiple mappings. (If the Python code fails to preserve the expected behavior of memory reads and writes, then the memory mapping becomes something like MMIO, 'special' memory where reads and writes have arbitrary effects. That case clearly requires volatile accesses. If the Python code does preserve the expected behavior of reads and writes, then it doesn't matter.)
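For the atomics case mentioned above, a library that synchronizes through a word inside a writable mapping would typically use ordinary (non-volatile) atomics, since the standard library offers nothing else. A minimal sketch, with the hypothetical helper names `publish` and `observe`, and with a plain pointer standing in for an address inside the mapping:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: stores `value` into the word at `word` atomically.
///
/// # Safety
/// `word` must be valid for reads and writes, aligned to 4 bytes, and all
/// concurrent accesses to it must be atomic.
unsafe fn publish(word: *mut u32, value: u32) {
    // AtomicU32 has the same size and bit validity as u32.
    let atomic = unsafe { &*(word as *const AtomicU32) };
    atomic.store(value, Ordering::Release);
}

/// Hypothetical helper: atomically loads the word at `word`.
///
/// # Safety
/// Same requirements as `publish`.
unsafe fn observe(word: *mut u32) -> u32 {
    let atomic = unsafe { &*(word as *const AtomicU32) };
    atomic.load(Ordering::Acquire)
}
```

Whether such accesses remain well-defined when another library maps the same file elsewhere in the process is precisely the composability question raised in this comment; the sketch does not settle it.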


Participants: @comex, @RalfJung, @gnzlbg, @glaebhoerl, @alercah


          What about: Virtual memory effects · Issue #28 · rust-lang/unsafe-code-guidelines