
Generator size: unwinding and drops force extra generator state allocation #59123

Closed

Description

@Nemo157 (Member)

#![feature(generators, generator_trait)]

use std::ops::Generator;

struct Foo([u8; 1024]);

impl Drop for Foo {
    fn drop(&mut self) {}
}

fn simple() -> impl Generator<Yield = (), Return = ()> {
    static || {
        let first = Foo([0; 1024]);
        let _second = first;
        yield;
    }
}

fn complex() -> impl Generator<Yield = (), Return = ()> {
    static || {
        let first = Foo([0; 1024]);
        { foo(); fn foo() {} }
        let _second = first;
        yield;
    }
}

fn main() {
    dbg!(std::mem::size_of_val(&simple()));
    dbg!(std::mem::size_of_val(&complex()));
}

The two generators returned by simple and complex should be equivalent, but complex takes twice as much space:

[foo.rs:29] std::mem::size_of_val(&simple()) = 1028
[foo.rs:30] std::mem::size_of_val(&complex()) = 2056

Dumping the MIR (with rustc 1.34.0-nightly (f66e4697a 2019-02-20)) shows an issue with how unwinding from foo interacts with the two stack slots for first and _second. Because first uses a dynamic drop flag, it is considered "live" on the path that goes through the yield, even though the drop flag is guaranteed to be false there. (The graph below shows the basic blocks, with the pseudo-code run in them and which variables are alive when exiting each block):

MIR graph
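
The drop-flag behaviour described above can be modelled in ordinary stable Rust. This is only a hand-written sketch, not what rustc emits: the `Option` discriminant stands in for the compiler's dynamic drop flag, and `run_model` and the `drops` counter are hypothetical names introduced for illustration.

```rust
use std::cell::Cell;

// A large value with a destructor that counts how often it runs.
struct Foo<'a> {
    _payload: [u8; 1024],
    drops: &'a Cell<usize>,
}

impl<'a> Drop for Foo<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn foo() {}

// Models the body of `complex`. Returns the value of the "drop flag"
// for `first` at the point where the generator would yield.
fn run_model(drops: &Cell<usize>) -> bool {
    // `Some` plays the role of "drop flag set": if `foo()` unwound,
    // cleanup would have to drop `first`.
    let mut first = Some(Foo { _payload: [0; 1024], drops });
    foo();
    // `let _second = first;` moves the value and clears the flag.
    let _second = first.take();
    // At the yield point the flag is always false, yet the slot for
    // `first` still exists because cleanup code refers to it.
    let flag_at_yield = first.is_some();
    flag_at_yield
}

fn main() {
    let drops = Cell::new(0);
    let flag = run_model(&drops);
    println!("flag at yield: {}, destructor runs: {}", flag, drops.get());
}
```

In this model the flag is always false at the yield and the destructor runs exactly once, which is why keeping a second 1024-byte slot alive across the yield is wasted space.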

Activity

Nemo157 (Member, Author) commented on Mar 12, 2019

@rustbot modify labels: A-generators and T-compiler.

Label added on Mar 12, 2019: T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)

matprec (Contributor) commented on Mar 23, 2019

Is this related to #52924?

Nemo157 (Member, Author) commented on Mar 23, 2019

@MSleepyPanda Inasmuch as it's about generators being too big. The specific optimisation proposed there won't help here, as first and _second are both live over the same yield point. What should really happen is that first is not kept live across the yield at all, and is allocated on the resume function's stack instead of in the generator state. (Some sort of copy-elision optimisation might then eliminate that allocation and use the allocation in the generator state directly, but IMO that's less important (and probably more difficult) than ensuring the memory usage is reduced.)
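
To make the size difference concrete, here is a hand-written model of the saved state in stable Rust. It is an assumption-laden sketch: the enum names and variants are hypothetical and do not reflect rustc's actual generator layout. `BloatedState` keeps both locals across the yield, while `LeanState` keeps only `_second`, as it would if `first` lived on the resume function's stack.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo([u8; 1024]);

// State saved across the yield when `first` is (needlessly) kept live.
#[allow(dead_code)]
enum BloatedState {
    Unresumed,
    Suspended { first: Foo, second: Foo },
    Returned,
}

// State saved across the yield when only `_second` needs storage.
#[allow(dead_code)]
enum LeanState {
    Unresumed,
    Suspended { second: Foo },
    Returned,
}

// Returns (bloated size, lean size) in bytes.
fn state_sizes() -> (usize, usize) {
    (size_of::<BloatedState>(), size_of::<LeanState>())
}

fn main() {
    let (bloated, lean) = state_sizes();
    println!("bloated: {} bytes, lean: {} bytes", bloated, lean);
}
```

The exact numbers depend on how the compiler lays out the enums, but the bloated state has to hold two 1024-byte payloads and the lean one only a single payload.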

cramertj (Member) commented on Apr 2, 2019

andreytkachenko commented on May 29, 2019

Another example, without drop:

#![feature(generators, generator_trait)]

use std::ops::Generator;

struct Foo([u8; 1024]);

fn simple() -> impl Generator<Yield = (), Return = ()> {
    static || {
        let first = Foo([0; 1024]);
        let _second = first;
        yield;
    }
}

fn complex() -> impl Generator<Yield = (), Return = ()> {
    static || {
        fn foo(_: &mut Foo) {}
        
        let mut first = Foo([0; 1024]);
        foo(&mut first);
        let mut second = first;
        foo(&mut second);
        let mut third = second;
        foo(&mut third);
        let mut _fourth = third;
        
        yield;
    }
}

fn main() {
    dbg!(std::mem::size_of_val(&simple()));
    dbg!(std::mem::size_of_val(&complex()));
}

outputs

[src/main.rs:32] std::mem::size_of_val(&simple()) = 4
[src/main.rs:33] std::mem::size_of_val(&complex()) = 3076

tmandry (Member) commented on Jun 12, 2019

I was thinking, we can solve this by adding the following rules to our MaybeStorageLive dataflow analysis (possibly being renamed to RequiresStorage):

  1. (Existing) StorageLive(x) => mark x live
  2. (Existing) StorageDead(x) => mark x dead
  3. If a local is moved from, and has never been mutably borrowed, mark it dead
  4. If (any part of) a local is initialized, mark it live
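
The four rules can be sketched as a toy transfer function over a straight-line list of statements. This is a simplified illustration, not rustc's dataflow implementation: the `Stmt` enum and `requires_storage_at_yields` are hypothetical, locals are plain indices, and there is no control flow, so no fixpoint iteration is needed.

```rust
use std::collections::BTreeSet;

// A tiny stand-in for MIR statements; locals are plain indices.
#[derive(Clone, Copy, Debug)]
enum Stmt {
    StorageLive(usize),
    StorageDead(usize),
    Init(usize),
    MutBorrow(usize),
    Move(usize),
    Yield,
}

// For each `Yield`, returns the set of locals that still require storage.
fn requires_storage_at_yields(body: &[Stmt]) -> Vec<BTreeSet<usize>> {
    let mut live = BTreeSet::new();
    let mut ever_mut_borrowed = BTreeSet::new();
    let mut at_yields = Vec::new();
    for stmt in body {
        match *stmt {
            // Rule 1: StorageLive(x) marks x live.
            Stmt::StorageLive(x) => { live.insert(x); }
            // Rule 2: StorageDead(x) marks x dead.
            Stmt::StorageDead(x) => { live.remove(&x); }
            // Rule 4: initializing (any part of) a local marks it live.
            Stmt::Init(x) => { live.insert(x); }
            // Remember mutable borrows so rule 3 can be conservative.
            Stmt::MutBorrow(x) => { ever_mut_borrowed.insert(x); }
            // Rule 3: a move kills the local only if it was never
            // mutably borrowed.
            Stmt::Move(x) => {
                if !ever_mut_borrowed.contains(&x) {
                    live.remove(&x);
                }
            }
            Stmt::Yield => at_yields.push(live.clone()),
        }
    }
    at_yields
}

fn main() {
    // Local 0 is `first`, local 1 is `_second`, as in `simple` above.
    let body = [
        Stmt::StorageLive(0),
        Stmt::Init(0),
        Stmt::StorageLive(1),
        Stmt::Move(0),
        Stmt::Init(1),
        Stmt::Yield,
    ];
    println!("{:?}", requires_storage_at_yields(&body));
}
```

With these rules only local 1 requires storage at the yield. Inserting a `MutBorrow(0)` before the move keeps local 0 alive as well, which is the conservative case discussed in the thread.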

We must not optimize away storage of locals that are mutably borrowed, because as @matthewjasper notes in #61430, it isn't decided that the following is UB:

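let mut x = String::new();
let p = &mut x as *mut String;   // raw pointer taken before the move
let y = x;                       // x is moved from here
unsafe { p.write(String::new()) } // write through the old pointer after the move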

It's an open question whether we can say "the local hasn't been mutably borrowed up to here" when evaluating rule 3. I'd prefer to make the optimization as smart as we can, but MIR probably allows borrowing a moved-from value and mutating it.

Is this sound?

cc @cramertj @eddyb @matthewjasper @RalfJung @Zoxc

eddyb (Member) commented on Jun 12, 2019

The "mutably" of "mutably borrowed" is a red herring IMO, unless you want to check for Freeze, which will conservatively default to "may contain interior mutability" once generic parameters are thrown into the mix.

RalfJung (Member) commented on Jun 12, 2019

@tmandry Interesting. How bad would it be to relax this to "if a local is moved from and never has had its address taken"? Then we can be sure without any assumptions about Stacked Borrows that direct accesses to the local variable are the only way to observe it, and those will be prevented after a move. This would also alleviate @eddyb's concern I think.

Also, what is the granularity here? Without Stacked Borrows assumptions we can only do this on a per-local level, not on a per-field-of-a-struct level. Taking the address of one field leaks the address of other fields (if layout assumptions are made).
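
The point about granularity can be illustrated with a small sketch, assuming a `#[repr(C)]` struct so that the layout is fixed. `Pair` and `predicted_and_actual_b_addr` are hypothetical names. The code only compares addresses and never dereferences a computed pointer.

```rust
use std::mem::size_of;

// With a fixed layout, knowing the address of `a` determines the address of `b`.
#[repr(C)]
struct Pair {
    a: u32,
    b: u32,
}

// Returns (address of `b` predicted from `a`, actual address of `b`).
fn predicted_and_actual_b_addr(pair: &Pair) -> (usize, usize) {
    let addr_a = &pair.a as *const u32 as usize;
    let addr_b = &pair.b as *const u32 as usize;
    (addr_a + size_of::<u32>(), addr_b)
}

fn main() {
    let pair = Pair { a: 1, b: 2 };
    let (predicted, actual) = predicted_and_actual_b_addr(&pair);
    println!("predicted == actual: {}", predicted == actual);
}
```

Because the two addresses coincide, code that has seen a pointer to one field can compute a pointer to the other, so an "address never taken" check is only meaningful for the local as a whole.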

29 remaining items

cramertj (Member) commented on Jun 14, 2019

Moving to "deferred" by the same logic as #52924 (comment). @rust-lang/lang feel free to put the labels back if you disagree.

Labels changed on Jun 14, 2019: added AsyncAwait-Triaged (Async-await issues that have been triaged during a working group meeting.) and removed AsyncAwait-Polish (Async-await issues that are part of the "polish" area)

tmandry (Member) commented on Jun 14, 2019

I'd personally like to see this fixed before stabilization. It can cause the same exponential size effects as we were seeing before #60187, albeit in different contexts.

I'm hoping to have a fix up for review soon.

eddyb (Member) commented on Jun 18, 2019

@RalfJung But wasn't the point of "Operand::Move doesn't invalidate borrows" that source-level moves don't invalidate borrows?
We could certainly emit pairs of llvm.lifetime.{end,start} calls after an Operand::Move, without adding pairs of Storage{Dead,Live} statements into the MIR, if the goal is to make it UB to reuse an old pointer. But I thought you wanted to allow reinitialization after a move, with an old pointer?

RalfJung (Member) commented on Jun 18, 2019

The only thing I want is a precise definition of the semantics in a way that can be dynamically checked (e.g. in Miri), and ideally I also want the semantics to not be full of weird special cases. ;)

> We could certainly emit pairs of llvm.lifetime.{end,start} calls after an Operand::Move, without adding pairs of Storage{Dead,Live} statements into the MIR, if the goal is to make it UB to reuse an old pointer.

How would that work? Wouldn't that mean that legal MIR code (that uses some kind of trick to reinitialize after a move) becomes UB in LLVM?

eddyb (Member) commented on Jun 18, 2019

> How would that work? Wouldn't that mean that legal MIR code (that uses some kind of trick to reinitialize after a move) becomes UB in LLVM?

No, I was referring to the case where we want the semantics of Operand::Move to be that they invalidate any outstanding borrows, similar to Storage{Dead,Live} but without bloating the MIR/impacting analyses which rely on some sort of dominance relationship.

I don't understand your position now. Are you saying you don't mind if source-level moves invalidate borrows, you just don't want it to be encoded into Operand in MIR, but rather something more like StorageDead? That would make sense, I just kept thinking you were worried about source-level moves.

RalfJung (Member) commented on Jun 18, 2019

Let's continue at #61849 (comment).

added a commit that references this issue on Jul 2, 2019

Rollup merge of rust-lang#61922 - tmandry:moar-generator-optimization…

2a2d71f
added a commit that references this issue on Jul 2, 2019

Auto merge of #61922 - tmandry:moar-generator-optimization, r=matthew…

848e0a2
