Closed
Description
For example given this code:
pub mod a {
static mut A: i32 = 0;
pub fn foo() -> i32 {
unsafe { A }
}
pub fn set_foo(a: i32) {
unsafe { A = a; }
}
}
pub mod b {
pub fn bar() {}
}
Let's compile without ThinLTO and take a look at the symbols:
$ rustc +nightly -C codegen-units=8 foo.rs --crate-type lib
$ readelf -s libfoo.rlib
File: libfoo.rlib(foo.foo0.rcgu.o)
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo0-8787f43e282added3762
2: 0000000000000000 4 OBJECT LOCAL DEFAULT 7 _ZN3foo1a1A17h6d4d0e10f3b
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 5
5: 0000000000000000 0 SECTION LOCAL DEFAULT 7
6: 0000000000000000 7 FUNC GLOBAL DEFAULT 3 _ZN3foo1a3foo17hb1e13e720
7: 0000000000000000 7 FUNC GLOBAL DEFAULT 5 _ZN3foo1a7set_foo17h5e0b4
File: libfoo.rlib(foo.foo1.rcgu.o)
Symbol table '.symtab' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo1-8787f43e282added3762
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3
3: 0000000000000000 1 FUNC GLOBAL DEFAULT 3 _ZN3foo1b3bar17h8ae358bf9
File: libfoo.rlib(rust.metadata.bin)
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
File: libfoo.rlib(foo.foo0.rcgu.bytecode.encoded)
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
File: libfoo.rlib(foo.foo1.rcgu.bytecode.encoded)
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
Here we're particularly interested in ZN3foo1a1A17h6d4d0e10f3b
, the a::A
constant from above. As expected this is LOCAL
and DEFAULT
, notably not exported from the object file as it's still internalized.
Let's instead compile with ThinLTO:
$ rustc +nightly -C codegen-units=8 foo.rs --crate-type lib -Z thinlto
$ readelf -s libfoo.rlib
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
File: libfoo.rlib(foo.foo0-8787f43e282added376259c1adb08b80.rs.rcgu.o)
Symbol table '.symtab' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo0-8787f43e282added3762
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3
3: 0000000000000000 0 SECTION LOCAL DEFAULT 5
4: 0000000000000000 4 OBJECT GLOBAL HIDDEN 7 _ZN3foo1a1A17h6d4d0e10f3b
5: 0000000000000000 7 FUNC GLOBAL DEFAULT 3 _ZN3foo1a3foo17hb1e13e720
6: 0000000000000000 7 FUNC GLOBAL DEFAULT 5 _ZN3foo1a7set_foo17h5e0b4
File: libfoo.rlib(foo.foo1-8787f43e282added376259c1adb08b80.rs.rcgu.o)
Symbol table '.symtab' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo1-8787f43e282added3762
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3
3: 0000000000000000 1 FUNC GLOBAL DEFAULT 3 _ZN3foo1b3bar17h8ae358bf9
File: libfoo.rlib(rust.metadata.bin)
File: libfoo.rlib(foo.foo0-8787f43e282added376259c1adb08b80.rs.rcgu.bytecode.encoded)
File: libfoo.rlib(foo.foo1-8787f43e282added376259c1adb08b80.rs.rcgu.bytecode.encoded)
Oh dear! Our _ZN3foo1a1A17h6d4d0e10f3b
symbol has been promoted to GLOBAL
and HIDDEN
. That's not good! This symbol, a::A
, should stay internalized even when compiling with ThinLTO
Activity
alexcrichton commentedon Nov 29, 2017
I've been trying to figure out what's going on here but I'm sort of quickly coming to the end of my rope. A few points of note though...
Right now ThinLTO works by needing complete and total knowledge (AFAIK) of exported and internal symbols. Thankfully we as the compiler actually have this information so we're just handing it down to ThinLTO. Notably, we're creating our ThinLTO data with a list of symbols to preserve.
Currently, however, we're recursively walking all symbols and creating a very large
GUIDPreservedSymbols
set. This set is the transitive addition of each symbol followed by each symbol it then in turn references. This is not done in LLVM's source but it something I originally implemented to get the link errors to go away.The
GUIDPreservedSymbols
variable is used for two purposes. First it's used to create aDeadSymbols
set but it turns out that LLVM's implementation ofcomputeDeadSymbols
already has this recursive nature. Notably theDeadSymbols
set is the same regardless of whetherGUIDPreservedSymbols
has only the symbols we passed in or all the transitive symbols they need as well. So things are looking good! This custom logic for "spider all the transitive deps" is not necessary for the first use case ofGUIDPreservedSymbols
.The next use of
GUIDPreservedSymbols
comes later when we call thethinLTOInternalizeAndPromoteInIndex
function. This function will decide, for the entire set of ThinLTO units, what functions are being exported linkage-wise and which are not. Namely theisExported
function decides whether a symbol is exported or not (clever name eh?)So it turns out that one clause of this
isExported
closure is to just test for membership inGUIDPreservedSymbols
. Because our behavior today is to put everything into this set then this is almost always returningtrue
! As a result we get the behavior in the OP of the issue, where internal statics are being promoted to being exported unnecessarily.So alright this
GUIDPreservedSymbols
is tricky business. I've discovered that for the usage incomputeDeadSymbols
it's not necessary to include all transitive dependencies. ForisExported
, however, if we don't include all transitive dependencies then we get tons of link errors unfortunately. I have yet to determine if these link errors are a bug in LLVM or if they're a bug in how we're using ThinLTO. Unfortunately there is not an easy fix for this issue right now as far as I can tell.I'm going to try to work with @dcci to see if he can help me drill into this.