non well-formed symbol mangling and bad symbol tracking issue #46552

Closed as not planned
@m4b

So, here's my ultimate desire for this eventually so that stuff like this doesn't happen again:

  1. There is a single, canonical name mangler in rust. Anything that ever emits a binary symbol uses this mangler, without exception, and can't modify the bytes; it and it alone is responsible for the ABI-facing names.
  2. Even better, this is enforced in the types, and only at the last minute are the raw str pointer or bytes passed to the backend to emit ABI names.
  3. The rust testing suite verifies that every symbol in a generated binary is validly mangled.
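To make points 1 and 2 concrete, here is a minimal sketch of what "enforced in the types" could look like. The `MangledName` type and its methods are hypothetical names invented for illustration, not anything that exists in rustc, and the mangling shown is only the `_ZN<len><ident>...E` skeleton without the trailing hash component.

```rust
/// Hypothetical newtype: the inner string is private, so the only way to
/// obtain a `MangledName` is through `MangledName::mangle`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MangledName(String);

impl MangledName {
    /// Builds a legacy-style `_ZN...E` name from path components.
    /// Illustrative only: the real mangler also appends a hash component.
    pub fn mangle(path: &[&str]) -> MangledName {
        let mut out = String::from("_ZN");
        for component in path {
            // Each component is emitted as <decimal length><identifier>.
            out.push_str(&component.len().to_string());
            out.push_str(component);
        }
        out.push('E');
        MangledName(out)
    }

    /// Read-only view of the bytes, meant to be handed to the backend
    /// at the last minute. There is deliberately no mutable accessor.
    pub fn as_bytes(&self) -> &[u8] {
        self.0.as_bytes()
    }
}
```

With this shape, code that emits symbols would take a `&MangledName` rather than a `&str`, so nothing downstream could prepend a section name or append a suffix without going back through the mangler.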

You might be surprised to learn that 1 is not true in practice. So I've opened this issue for tracking various cases I've found, and what follows are the specimens:

Examples

As of rustc 1.22, on the following program compiled via rustc hello.rs:

fn main() {
    println!("hello");
}

I am seeing the following issues:

Spurious thread locals

symbol: .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E

40    LOCAL      TLS         .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E  0x0       .tdata(20)              0x0

Example source location: https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libstd/sys_common/thread_info.rs#L21

Why incorrect

  1. does not start with _ZN.

  2. This looks like it's meant to be the section name. The exact same TLS variable does occur later on, with the same address:

40    LOCAL      TLS         std::sys_common::thread_info::THREAD_INFO::__getit::__KEY::hc69464ef038d7e85                            0x30      .tdata(20)              0x0

but correctly mangled.

This could be an LLVM bug. As far as I can tell, it only happens for variables generated via the thread_local! macro, and the spurious symbol is already present inside the crate's .rlib static archive. You can extract the object file with something like:

ar p libstd-fe0b1b991511fcaa.rlib std-fe0b1b991511fcaa.std0.rust-cgu.o > libstd.o

(Your libstd-<hash> will vary, of course.)

E.g. here is every symbol with either .tdata in its name or referencing that section in rust 1.22 libstd object file (inside .rlib):

sections:
  3473  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE  SHT_PROGBITS  WRITE ALLOC TLS  0x42030  0x0  0x20  0x0  0x8
  3543  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE  SHT_PROGBITS  WRITE ALLOC TLS  0x42440  0x0  0x28  0x0  0x20
  3627  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E  SHT_PROGBITS  WRITE ALLOC TLS  0x42840  0x0  0x30  0x0  0x20
  3647  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE  SHT_PROGBITS  WRITE ALLOC TLS  0x42980  0x0  0x28  0x0  0x20
  3659  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE  SHT_PROGBITS  WRITE ALLOC TLS  0x42a18  0x0  0x18  0x0  0x8
  3665  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E  SHT_PROGBITS  WRITE ALLOC TLS  0x42a80  0x0  0x10  0x0  0x8
symbols:
  0  LOCAL  TLS  .tdata._ZN3std10sys_common11thre…  0x0  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E(3627)  0x0
  0  LOCAL  TLS  .tdata._ZN3std11collections4hash…  0x0  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE(3473)  0x0
  0  LOCAL  TLS  .tdata._ZN3std2io5stdio12LOCAL_S…  0x0  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE(3543)  0x0
  0  LOCAL  TLS  .tdata._ZN3std4rand10thread_rng1…  0x0  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E(3665)  0x0
  0  LOCAL  TLS  .tdata._ZN3std9panicking12LOCAL_…  0x0  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE(3647)  0x0
  0  LOCAL  TLS  .tdata._ZN3std9panicking18update…  0x0  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE(3659)  0x0
  0  LOCAL  TLS  _ZN3std10sys_common11thread_info…  0x30  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E(3627)  0x0
  0  LOCAL  TLS  _ZN3std11collections4hash3map11R…  0x20  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE(3473)  0x0
  0  LOCAL  TLS  _ZN3std2io5stdio12LOCAL_STDOUT7_…  0x28  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE(3543)  0x0
  0  LOCAL  TLS  _ZN3std4rand10thread_rng14THREAD…  0x10  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E(3665)  0x0
  0  LOCAL  TLS  _ZN3std9panicking12LOCAL_STDERR7…  0x28  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE(3647)  0x0
  0  LOCAL  TLS  _ZN3std9panicking18update_panic_…  0x18  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE(3659)

The spurious symbols will never be gc'd by the linker, and they will never get referenced; so they're just taking up space (albeit not much).

backtrace.rs nested static

symbol: _ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE.0.0

Example source location: https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libstd/sys_common/backtrace.rs#L148

Why incorrect

No mangled symbol is allowed to have characters after the final E, but this one ends in .0.0:

2622e8    LOCAL      OBJECT      _ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE.0.0 0x8       .bss(27)                0x0 

I have definitely seen other examples of this, and with different numbers at the end; I think it has to do with nested statics somehow.
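Both failure modes so far (a name that does not start with _ZN, and a name with trailing characters after the final E) can be caught mechanically. Below is a rough sketch of such a check. The function name `is_well_formed_legacy` is made up for this example, and it only validates the `_ZN<len><ident>...E` shape seen in the symbols above; it is an assumption that this simplified shape is all a test would need, and it is not a full Itanium-ABI parser.

```rust
/// Returns true if `sym` has the shape `_ZN<len><ident>...E` with at least
/// one component and nothing after the final `E`.
/// Simplified sketch: not a complete mangling grammar.
pub fn is_well_formed_legacy(sym: &str) -> bool {
    let mut rest = match sym.strip_prefix("_ZN") {
        Some(r) => r,
        None => return false, // e.g. names starting with ".tdata."
    };
    let mut components = 0;
    loop {
        if rest == "E" {
            return components > 0;
        }
        // Each component starts with its decimal length.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            return false; // covers "E.0.0", "E.llvm.XXXX", and empty input
        }
        let len: usize = match rest[..digits].parse() {
            Ok(n) => n,
            Err(_) => return false,
        };
        rest = &rest[digits..];
        if len == 0 || len > rest.len() || !rest.is_char_boundary(len) {
            return false;
        }
        rest = &rest[len..];
        components += 1;
    }
}
```

Run over the symbols quoted in this issue, it accepts `_ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE` and rejects the same name with `.0.0` appended, as well as the `.tdata.`-prefixed names.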

Nightly

On nightly it looks like there has been a pretty substantial regression w.r.t. valid symbol names being output:

bingrep -D  -t 65 hello-nightly  | grep -e "E...[[:digit:]] "
26ce98    GLOBAL     OBJECT      ref.7.llvm.D64EB761                                                  0x18      .data.rel.ro(23)        0x2
26e160    GLOBAL     OBJECT      _ZN3std3sys4unix2os8ENV_LOCK17hbf5ac5d1fa9db31cE.llvm.D64EB761       0x28      .data(26)               0x2    
 5bf60    GLOBAL     OBJECT      str.4.llvm.6C0E7CF1                                                  0x1a      .rodata(16)             0x2    
 5bf7a    GLOBAL     OBJECT      str.5.llvm.6C0E7CF1                                                  0x0       .rodata(16)             0x2    
 548f0    GLOBAL     FUNC        _ZN4core3ptr13drop_in_place17hd0b6a86080ab42c4E.llvm.F74E5798        0x6       .text(14)               0x2    
 5bf80    GLOBAL     OBJECT      ref.7.llvm.6C0E7CF1                                                  0x40      .rodata(16)             0x2    
 5d1d0    GLOBAL     OBJECT      str.a.llvm.F74E5798                                                  0x1f      .rodata(16)             0x2    
26d920    GLOBAL     OBJECT      panic_bounds_check_loc.e.llvm.F74E5798                               0x18      .data.rel.ro(23)        0x2  

This runs the whole gamut: functions, global data, and read-only strings all (sometimes) have extra characters appended.
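Every affected name in the listing above ends in `.llvm.` followed by hex digits. As a small sketch, a tool that wants to report these could separate the suffix from the base name like this. `split_llvm_suffix` is a hypothetical helper named for this example, and the assumption that the suffix is always `.llvm.<hex>` comes only from the eight lines shown here.

```rust
/// If `sym` ends in `.llvm.<hex digits>`, returns `(base, suffix)`,
/// where `suffix` includes the leading `.llvm.`. Otherwise returns `None`.
pub fn split_llvm_suffix(sym: &str) -> Option<(&str, &str)> {
    const MARKER: &str = ".llvm.";
    // Use the last occurrence so that the hex part is the tail of the name.
    let idx = sym.rfind(MARKER)?;
    let (base, suffix) = sym.split_at(idx);
    let hex = &suffix[MARKER.len()..];
    if !hex.is_empty() && hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        Some((base, suffix))
    } else {
        None
    }
}
```

For `str.4.llvm.6C0E7CF1` this yields the base `str.4` and the suffix `.llvm.6C0E7CF1`; a name without the marker yields `None`.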

Special Mentions

{{closure}} in symbols is useless, and very hard to print in debuggers.

E.g.:

47d80    LOCAL      FUNC        core::fmt::Formatter::pad_integral::{{closure}}::h6acabc645f5ef2ad 0x10f     .text(14)               0x0

which is from the use of this closure:

https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libcore/fmt/mod.rs#L1108

The compiler knows the line number (it will even emit this sometimes, like @[closure; mod.rs:1108] or whatever); why not just output that instead of {{closure}}?

Labels: A-linkage (Area: linking into static, shared libraries and binaries), C-enhancement (Category: An issue proposing an enhancement or a PR with one.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)