Closed
Description
Out of curiosity, I tried a floating-point arithmetic test and found quite a big difference between the code generated for C++ and for Rust.
The code used in Rust:
pub struct Stats
{
    x: f32,
    y: f32,
    z: f32
}
pub fn sum(a: &Stats, b: &Stats) -> Stats
{
    Stats {
        x: a.x + b.x,
        y: a.y + b.y,
        z: a.z + b.z
    }
}
The code used in C++:
struct Stats
{
    float x;
    float y;
    float z;
};
Stats sum(const Stats &a, const Stats &b)
{
    return Stats {
        a.x + b.x,
        a.y + b.y,
        a.z + b.z
    };
}
Here is a godbolt link for a side-by-side comparison of the assembly output: https://godbolt.org/z/dqc4b74rv
Rust seems to insist on moving the floats back into e* registers instead of keeping them in xmm registers, whereas C++ leaves them in the xmm registers. In some cases it might be more advantageous to leave the floats in xmm registers for future operations on them rather than passing them back into the e* registers.
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Area: Code generation
Area: Floating point numbers and arithmetic
Category: This is a bug.
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.
Activity
Urgau commented on Dec 2, 2021
After a quick look on godbolt, I find that clang and rustc use the same representation (%Stats = type { float, float, float }), except for rustc <= 1.47 (%Stats = type { [0 x i32], float, [0 x i32], float, [0 x i32], float, [0 x i32] }). I'm not sure whether that changes anything, because those extra arrays all have length 0.
Another difference between the two rustc versions is the return type: void for rustc <= 1.47 and i96 for rustc > 1.47, which causes some bitcast and load instructions at the end and might explain the difference in codegen. Also, why is this i96 and not float like the rest?
The last noticeable difference is that clang processes the data as <2 x float>, float instead of float, float, float, which allows more optimizations to be done.
My hypothesis is that this codegen issue is probably caused by the i96 return type, which is itself probably caused by a codegen regression in rustc > 1.47. I'm not an expert in rustc_codegen_llvm or LLVM itself, though, so this needs to be confirmed by the rustc LLVM working group.
clang (trunk with -O0)
Assembly with -O3
rustc (1.56 with -C opt-level=0 --emit=llvm-ir)
Assembly with -O
rustc (1.47 with -C opt-level=0 --emit=llvm-ir)
Assembly with -O
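As a small aside on the i96 width mentioned in this comment: three f32 fields occupy 12 bytes, which is exactly 96 bits. The sketch below only checks that size arithmetic on a typical target. It does not explain why rustc picks an integer type for the return value. The helper name stats_bits is an assumption made up for illustration and does not appear in the issue.

```rust
use std::mem::{align_of, size_of};

// Same fields as the struct from the issue description.
#[allow(dead_code)]
pub struct Stats {
    x: f32,
    y: f32,
    z: f32,
}

/// Width of `Stats` in bits.
/// Hypothetical helper for illustration only; not part of the issue.
pub fn stats_bits() -> usize {
    size_of::<Stats>() * 8
}

fn main() {
    // Three 4-byte floats with 4-byte alignment leave no room for padding,
    // so the struct is as wide as a 96-bit integer.
    println!(
        "size = {} bytes, align = {} bytes, width = {} bits",
        size_of::<Stats>(),
        align_of::<Stats>(),
        stats_bits()
    );
}
```

The matching width is consistent with the whole struct being returned as one integer-typed value, but the reason for that choice is still the open question in this thread.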
Urgau commented on Dec 2, 2021
I can confirm that the problem here is the i96 return type in the LLVM IR: if I use an out parameter instead, the assembly is what I would expect (without the vector optimization done by clang) (godbolt).
LLVM IR at -C opt-level=0
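For readers who want to reproduce the comparison, here is a minimal sketch of what an out-parameter variant might look like. The exact code behind the godbolt link is not shown in this thread, so the function name sum_out, the derives, and the public fields are assumptions made for illustration.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Stats {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

// By-value return, as in the issue description.
pub fn sum(a: &Stats, b: &Stats) -> Stats {
    Stats {
        x: a.x + b.x,
        y: a.y + b.y,
        z: a.z + b.z,
    }
}

// Out-parameter variant: the result is written through `out`
// instead of being returned by value.
// `sum_out` is a hypothetical name, not taken from the issue.
pub fn sum_out(a: &Stats, b: &Stats, out: &mut Stats) {
    out.x = a.x + b.x;
    out.y = a.y + b.y;
    out.z = a.z + b.z;
}

fn main() {
    let a = Stats { x: 1.0, y: 2.0, z: 3.0 };
    let b = Stats { x: 0.5, y: 0.25, z: 0.125 };

    let mut out = Stats::default();
    sum_out(&a, &b, &mut out);

    // Both variants compute the same values; only the calling
    // convention seen by the code generator differs.
    assert_eq!(out, sum(&a, &b));
    println!("{:?}", out);
}
```

Compiling both functions with -C opt-level=3 --emit=asm (or pasting them into godbolt) should make it possible to compare the two code paths side by side. The output will depend on the rustc version used.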
Yuri6037 commented on Dec 2, 2021
The question then becomes: why is rustc forcing i96 as the return type?
MSxDOS commented on Dec 3, 2021
This seems to be the same regression as #85265