Closed
Description
With this code
pub fn foo(t: &mut Vec<usize>) {
    let mut taken = std::mem::take(t);
    taken.pop();
    *t = taken;
}
Stable produces
playground::foo:
    sub rsp, 24
    movups xmm0, xmmword ptr [rdi]
    movaps xmmword ptr [rsp], xmm0
    mov rax, qword ptr [rdi + 16]
    xor ecx, ecx
    sub rax, 1
    cmovae rcx, rax
    mov qword ptr [rdi + 16], rcx
    add rsp, 24
    ret
Whereas beta/nightly produces
playground::foo:
    push r15
    push r14
    push rbx
    mov rbx, rdi
    mov r14, qword ptr [rdi + 8]
    mov r15, qword ptr [rdi + 16]
    xorps xmm0, xmm0
    movups xmmword ptr [rdi + 8], xmm0
    mov rsi, qword ptr [rdi + 8]
    test rsi, rsi
    je .LBB0_2
    shl rsi, 3
    mov edi, 8
    mov edx, 8
    call qword ptr [rip + __rust_dealloc@GOTPCREL]
.LBB0_2:
    xor eax, eax
    sub r15, 1
    cmovae rax, r15
    mov qword ptr [rbx + 8], r14
    mov qword ptr [rbx + 16], rax
    pop rbx
    pop r14
    pop r15
    ret
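For context on why the `__rust_dealloc` call above is dead code: the value that `std::mem::take` leaves behind is an empty `Vec`, which owns no allocation, so dropping it when `*t = taken` runs never needs to free anything. The following is a minimal sketch of that reasoning, not part of the original report; `placeholder_capacity` is a hypothetical helper introduced only for illustration.

```rust
/// The function from the report, unchanged.
pub fn foo(t: &mut Vec<usize>) {
    let mut taken = std::mem::take(t);
    taken.pop();
    *t = taken;
}

/// Hypothetical helper: reports the capacity of the placeholder that
/// `std::mem::take` leaves in `*t`, then restores the original vector.
pub fn placeholder_capacity(t: &mut Vec<usize>) -> usize {
    let taken = std::mem::take(t);
    // `*t` is now `Vec::default()`, which has not allocated.
    let cap = t.capacity();
    // Dropping the placeholder here is a no-op with respect to the allocator.
    *t = taken;
    cap
}
```

Because the placeholder's capacity is always zero, the `test rsi, rsi` / `je` guard in the beta/nightly output is always taken at run time; the regression is that the optimizer no longer proves this.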
searched nightlies: from nightly-2022-07-02 to nightly-2022-07-03
regressed nightly: nightly-2022-07-03
searched commit range: 46b8c23...f2d9393
regressed commit: 0075bb4
bisected with cargo-bisect-rustc v0.6.4
Host triple: x86_64-unknown-linux-gnu
Metadata
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Area: MIR inlining
Issue: Problems and improvements with respect to performance of generated code.
Medium priority
Relevant to the compiler team, which will review and decide on the PR/issue.
Performance or correctness regression from stable to nightly.
Activity
nikic commented on Nov 1, 2022
Godbolt: https://rust.godbolt.org/z/4GTrh1EGx
Result IR can be further optimized by GVN, so this might be addressable on the LLVM side.
nikic commented on Nov 2, 2022
Looks like this got a bit worse on LLVM main because an additional assume is being preserved: https://llvm.godbolt.org/z/95eMe6j7q
Anyway, there is a phase ordering problem here. MemCpyOpt runs after GVN, and only at that point do we convert the memcpy into a memset, which makes the following load from it easy to fold.
An easy fix would probably be to support memset in InstCombine load store forwarding. But this is no longer going to fix this issue due to the aforementioned assume issue. Ugh.
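The forwarding discussed here can be pictured at the source level. This is a rough sketch and not the IR from the Godbolt links: `zero_then_read` is a hypothetical function, and the assumption that `ptr::write_bytes` reaches LLVM as a memset is mine. The point is only that a load which follows a zeroing write to the same memory can be folded to the constant `0`.

```rust
/// Hypothetical illustration: zero a two-word buffer, then read one word back.
/// An optimizer that forwards the zeroing write to the load can return the
/// constant 0 without touching memory again.
pub fn zero_then_read(buf: &mut [usize; 2]) -> usize {
    // SAFETY: `buf` is a valid, aligned, exclusively borrowed array of two
    // `usize` values, and all-zero bytes are a valid `usize`.
    unsafe { std::ptr::write_bytes(buf.as_mut_ptr(), 0, 2) };
    // This load reads memory that was just zeroed.
    buf[1]
}
```

In the regressed output the analogous load is the `mov rsi, qword ptr [rdi + 8]` that immediately follows the zeroing `movups`, which is not folded.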
nikic commented on Nov 3, 2022
Upstream patch for InstCombine: https://reviews.llvm.org/D137323
An alternative solution would be to move MemCpyOpt prior to GVN, but I'm not sure whether that would cause other issues.
apiraino commented on Nov 3, 2022
WG-prioritization assigning priority (Zulip discussion).
@rustbot label -I-prioritize +P-medium
nikic commented on Nov 3, 2022
Upstream patch for SimplifyCFG: https://reviews.llvm.org/D137339
Together these produce the following final IR:
Ignoring the opportunity to form a usub.sat, this is optimal.
clubby789 commented on Dec 29, 2022
Nightly now compiles to
nikic commented on Dec 29, 2022
Needs codegen test.
clubby789 commented on Dec 29, 2022
Would just
// CHECK-NOT: __rust_dealloc
work?
nikic commented on Dec 29, 2022
Sounds reasonable.
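A test along the lines proposed above might look like the sketch below. It assumes the FileCheck-style `// CHECK-NOT:` directive quoted in the thread; any harness header lines (compile flags, crate type) are omitted because the thread does not show them, and this is not necessarily the file that was merged.

```rust
// Sketch of a codegen test body. The directive asserts that the optimized
// output never references the deallocation shim.

pub fn foo(t: &mut Vec<usize>) {
    // CHECK-NOT: __rust_dealloc
    let mut taken = std::mem::take(t);
    taken.pop();
    *t = taken;
}
```

A bare `CHECK-NOT` only detects the one symptom reported here; it does not pin down the rest of the generated code, which keeps the test small but also makes it sensitive to unrelated changes in how the function is lowered.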
Auto merge of rust-lang#106272 - clubby789:codegen-test-103840, r=nikic
the8472 commented on Feb 16, 2023
Reopening because the fact that this works on nightly is not really reliable behavior. #106790 and #108106 both change the Vec field order, and in each case the change breaks the test.
the8472 commented on Apr 29, 2023
I'm no longer having issues with the codegen test; the LLVM 16 upgrade seems to have made it more reliable.