Description
When creating references to #[repr(align)] types wrapped in enums, LLVM generates suboptimal assembly with redundant memory operations, despite the reference being unused. This occurs even at opt-level=3.
I tried this code (opt-level=3):
https://godbolt.org/z/P8E4hsdbn
#![crate_type = "lib"]

#[repr(align(64))]
pub struct Align64(i32);

pub enum Enum64 {
    A(Align64),
    B(i32),
}

/// Processes data and returns an Enum64 variant.
/// Logs intermediate state for debugging purposes.
#[no_mangle]
pub fn process_data(a: Align64) -> Enum64 {
    let result = Enum64::A(a);
    // Common debugging pattern: logging intermediate values
    log_intermediate(&result);
    result
}

#[inline(never)]
fn log_intermediate(_e: &Enum64) {
    // Even an empty function forces the reference to be materialized
}
I expected to see this happen:
process_data:
mov rax, rdi
movaps xmm0, xmmword ptr [rsi]
movaps xmm1, xmmword ptr [rsi + 16]
movaps xmm2, xmmword ptr [rsi + 32]
movaps xmm3, xmmword ptr [rsi + 48]
movaps xmmword ptr [rdi + 112], xmm3
movaps xmmword ptr [rdi + 96], xmm2
movaps xmmword ptr [rdi + 80], xmm1
movaps xmmword ptr [rdi + 64], xmm0
mov dword ptr [rdi], 0
ret
Instead, this happened:
process_data:
mov rax, rdi
movups xmm0, xmmword ptr [rsi]
movups xmm1, xmmword ptr [rsi + 16]
movups xmm2, xmmword ptr [rsi + 32]
movups xmm3, xmmword ptr [rsi + 48]
movups xmmword ptr [rsp - 16], xmm3
movups xmmword ptr [rsp - 32], xmm2
movups xmmword ptr [rsp - 48], xmm1
movups xmmword ptr [rsp - 64], xmm0
mov dword ptr [rdi], 0
movups xmm0, xmmword ptr [rsp - 124]
movups xmm1, xmmword ptr [rsp - 108]
movups xmm2, xmmword ptr [rsp - 92]
movups xmm3, xmmword ptr [rsp - 76]
movups xmmword ptr [rdi + 4], xmm0
movups xmmword ptr [rdi + 20], xmm1
movups xmmword ptr [rdi + 36], xmm2
movups xmmword ptr [rdi + 52], xmm3
movups xmm0, xmmword ptr [rsp - 60]
movups xmmword ptr [rdi + 68], xmm0
movups xmm0, xmmword ptr [rsp - 44]
movups xmmword ptr [rdi + 84], xmm0
movups xmm0, xmmword ptr [rsp - 28]
movups xmmword ptr [rdi + 100], xmm0
movups xmm0, xmmword ptr [rsp - 16]
movups xmmword ptr [rdi + 112], xmm0
ret
Performance Impact
1. Instruction count: 27 vs 11 instructions in the listings above (roughly 2.5x increase)
2. Memory operations:
   - 2x bandwidth usage (128B vs 64B transferred): the payload is copied to the stack and back
   - Unnecessary stack spills
3. Instruction selection:
   - Uses movups (unaligned) instead of movaps (aligned)
   - Missed opportunity for aligned vector ops
Real-World Relevance
This pattern occurs in:
1. Debug logging (even when logs are disabled)
2. Generic code passing references
3. Derive macros (e.g., #[derive(Debug)])
4. Error handling paths
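As a concrete illustration of items 2 and 3, the same reference-taking shape appears when a generic helper or a derived Debug impl borrows the value. The names below (trace, the gating comment) are hypothetical; this is a sketch of the pattern, not a second minimized reproduction:

```rust
use std::fmt::Debug;

#[repr(align(64))]
#[derive(Debug)]
pub struct Align64(i32);

#[derive(Debug)]
pub enum Enum64 {
    A(Align64),
    B(i32),
}

// Generic helper taking &T: the same shape as log_intermediate above.
#[inline(never)]
fn trace<T: Debug>(_value: &T) {
    // In real code this would be gated logging; even when the log level
    // is disabled, the reference is still created at the call site.
}

pub fn process_data(a: Align64) -> Enum64 {
    let result = Enum64::A(a);
    trace(&result);
    result
}

fn main() {
    let out = process_data(Align64(7));
    // The value round-trips unchanged; only the codegen quality differs.
    assert!(matches!(out, Enum64::A(Align64(7))));
    println!("{:?}", out);
}
```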
Could you please review the situation? Thank you!
Meta
rustc 1.85.0-nightly (d117b7f21 2024-12-31)
binary: rustc
commit-hash: d117b7f211835282b3b177dc64245fff0327c04c
commit-date: 2024-12-31
host: x86_64-unknown-linux-gnu
release: 1.85.0-nightly
LLVM version: 19.1.6