Skip to content

Rust binding to LFS64 ABI-compat symbols rather than public names on musl libc #2934

@richfelker

Description

@richfelker

It's come to our attention as part of removing the legacy LFS64 glibc-ABI-compat symbols that Rust's libc crate is apparently binding to the LFS64 ("large file support") symbol names like fopen64, etc. rather than the standard symbol names. In the musl API/ABI (defined by our public headers) the LFS64 interfaces exist only as macros remapping to the standard function names, and the nonstandard LFS64 names are purely a glibc-ABI-compat mechanism that will not be linkable in the future (i.e. only available to already-dynamic-linked binaries referencing these symbols). This is part of an effort to get rid of the macros (which cause a lot of havok on various C++ projects), which is blocked on having the symbols present, because the prevalence of autoconf link-only tests makes it unsafe to have the symbols linkable without declarations in the headers.

What needs to happen here in order for Rust to reference only the standard libc interface names on musl targets?

Activity

bossmc

bossmc commented on Sep 30, 2022

@bossmc
Contributor

Is the position that the xxx64 calls are simply aliases to xxx guaranteed to be true going forwards (i.e. are the xxx functions already effectively the xxx64 versions)? The libc crate can perform the same aliasing when targeting -musl fairly easily. An alternative would be to remove these symbols from the libc crate on the musl targets, but that feels like a bigger breaking change in the rust-world and would be a problem if the ABI ever changed such that xxx64 actually did something other than be aliased to xxx.

Similar question around off64_t and friends, musl currently defines these to their non-64 counterparts, is that also a position we can rely on going forwards?

I'm guessing the answers to the above are all "yes" since I'm assuming that musl's offset/size types have always been 64-bit and the methods have always used the 64-bit underlying syscalls. Hence there being no need to have separate names for compatibility reasons.

Lastly is there a list of affected symbols or is it simply "all the ones that end with 64"?

richfelker

richfelker commented on Sep 30, 2022

@richfelker
Author

Is the position that the xxx64 calls are simply aliases to xxx guaranteed to be true going forwards

The intent going forward is that the xxx64 ones don't exist. These symbols never existed as part of a programming interface (if you used the xxx64 identifiers in the programming interface, they were macros that evaluated to the non-64 ones to prevent actually emitting unwanted references to those symbols) and were only there for ABI-compat purposes. I wasn't aware until somebody tested removal that the rust libc crate was emitting references to them. Presumably this was not intentional but just fallout of doing what it was doing on glibc.

(i.e. are the xxx functions already effectively the xxx64 versions)?

Yes. off_t is, always has been, and always will be 64-bit in musl. There are no 32-bit-off_t interfaces.

Lastly is there a list of affected symbols or is it simply "all the ones that end with 64"?

That's basically it (modulo readdir64_r where the 64 isn't at the end). But you can find detailed lists below, as:

bossmc

bossmc commented on Sep 30, 2022

@bossmc
Contributor

Excellent, I've pushed a proposed change in #2935 that leaves the xxx64 symbols in the API of the libc crate (so this isn't a platform-specific breaking change) but aliases them to the xxx symbol in the libc library.

There's code in the stdlib (and likely elsewhere in the wider world) that explicitly calls the xxx64 functions hence my preference not to break the crate API. Does mean that rust will pretend forever than musl has LFS64 symbols, but it won't actually use them.

richfelker

richfelker commented on Sep 30, 2022

@richfelker
Author

Great, that seems like the correct fix. I don't think there's any reason to object to having the aliases in a FFI context like this.

One (I think the main) reason we didn't do it in a pure C context was that the types were wrong: without #define stat64 stat, the LFS64 stat functions that, on the interface level, are supposed to take struct stat64, would have been aliased to functions expecting struct stat, and the stores through the argument pointers would have been violations of type-based aliasing rules.

bossmc

bossmc commented on Sep 30, 2022

@bossmc
Contributor

In that (those?) cases is there an actual (ABI) difference between struct stat64 and struct stat (beyond the name)? If there is then the aliasing at FFI level is not going to fly, if they're structurally identical then I think we're good (I'm not going to open the can of worms that is "is unsafe rust allowed to break C's type-based aliasing rules?").

richfelker

richfelker commented on Sep 30, 2022

@richfelker
Author

No, there's no ABI difference. At an ABI/FFI level everything is fine. It's only not-okay if you consider the C sources/IR cross-translation-unit as having to follow the C TBA rules and you defined struct stat and struct stat64 as two different sructure types with identical members (which we specifically didn't do) rather than making stat64 a macro expanding to stat. Anyway this all ended up being a mess because C++ programs sometimes (rightly) don't like hitting macros like that where they didn't expect them, and that's why we're trying to bury all this and just keep the ABI-compat part. But yeah, that's all C stuff that doesn't affect the rust libc crate or rust API/ABI in any way, IMO. It's just context for why we're trying to do this change on the musl side.

workingjubilee

workingjubilee commented on Feb 22, 2023

@workingjubilee
Member

In that (those?) cases is there an actual (ABI) difference between struct stat64 and struct stat (beyond the name)? If there is then the aliasing at FFI level is not going to fly, if they're structurally identical then I think we're good (I'm not going to open the can of worms that is "is unsafe rust allowed to break C's type-based aliasing rules?").

The answer is "yes, actually", because lifetime analysis replaces it functionally. As a result, Unsafe Rust has historically been allowed to rely on pointer-casting in a way that would make a seasoned C developer blanch, so it would be very hard to change, even amongst things people say are "very hard to change".

added a commit that references this issue on May 28, 2023

6 remaining items

added a commit that references this issue on Jun 29, 2023
added a commit that references this issue on Aug 7, 2023
added this to the 1.0 milestone on Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @Amanieu@bossmc@richfelker@tgross35@workingjubilee

        Issue actions

          Rust binding to LFS64 ABI-compat symbols rather than public names on musl libc · Issue #2934 · rust-lang/libc