Description
Bugzilla Link | 52146 |
Version | trunk |
OS | Linux |
Attachments | bitcode from rust#89609 |
CC | @alex |
Extended Description
xref: rust-lang/rust#89609
After enabling the new pass manager in rustc, one particular crate showed a large increase in compile time, but only when targeting s390x. I have attached the bitcode from rustc, and the problem reproduces with opt -O1 on main.
$ llvm-dis <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc | wc -lL
157863 2306
$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S | wc -lL
794414 4515800
real 2m4.580s
user 2m4.675s
sys 0m0.221s
Two lines in the output reach that maximum length of about 4.5M characters, and both are huge phi nodes just before invoking a function ending in 17hde8a472161ebd31bE (so you can search for that). The predecessors look like a mass of loopexit.split-lp expansions gone rogue.
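To find which lines of the optimized IR are the oversized ones, something like the following may help. It is only a sketch: `longest_lines` is a hypothetical helper name, and `out.ll` in the usage comment stands for wherever the `opt -S` output was saved. It counts characters with awk, so the numbers can differ slightly from `wc -L`, which reports display width; it also assumes an awk that copes with multi-megabyte lines (gawk does).

```shell
# Print "<line number> <length>" for every line that ties for the
# greatest length in the file given as $1.
longest_lines() {
  awk '
    BEGIN { max = -1 }
    {
      n = length($0)
      if (n > max)       { max = n; out = NR " " n }
      else if (n == max) { out = out "\n" NR " " n }
    }
    END { if (NR > 0) print out }
  ' "$1"
}

# Possible usage (out.ll is an assumed file name):
#   longest_lines out.ll
#   sed -n '<line number>p' out.ll | cut -c1-200   # peek at the start of that line
```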
The same input behaves reasonably with the old PM: it optimizes much faster and produces smaller output.
$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S --enable-new-pm=0 | wc -lL
148220 2306
real 0m2.266s
user 0m2.322s
sys 0m0.024s
If you edit out the "target-cpu"="z10" attributes in the IR, this reproducer can also be forced to -mtriple x86_64, and that runs fine with the new PM.
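One way to do that edit without opening the file by hand is a small sed filter over the disassembled IR. This is a sketch under assumptions: `strip_target_cpu` is a hypothetical helper name, and it removes every "target-cpu" string attribute regardless of value, which may be broader than needed if the module mixes CPUs.

```shell
# Read textual LLVM IR on stdin and drop every "target-cpu"="..." string
# attribute, together with the spaces in front of it.
strip_target_cpu() {
  sed 's/ *"target-cpu"="[^"]*"//g'
}

# Possible usage, reusing the file name from the commands above:
#   llvm-dis <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc \
#     | strip_target_cpu \
#     | opt -mtriple x86_64 -O1 -S | wc -lL
```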