
Inplace returns self? #1893

Closed
@jseabold

Description

@jseabold
Contributor

Is there any reason that whenever I'm doing inplace=True I always get self returned? Obviously not a huge deal, but this is kind of wart-y IMO. I wouldn't expect inplace to return anything.

Activity

changhiskhan

changhiskhan commented on Sep 11, 2012

@changhiskhan
Contributor

Good idea actually. Useful to make the user more explicit about invoking side-effects.

wesm

wesm commented on Sep 13, 2012

@wesm
Member

I go back and forth on it. At some point I'd decided it was better to have a consistent API w.r.t. return values (vs. None in the inplace=True case), e.g. whether or not inplace=True, you can always count on getting a reference back to the modified object.
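The two return conventions being weighed here can be sketched with a toy class. This is only an illustration: `Box` and both method names are hypothetical and are not pandas APIs.

```python
class Box:
    """Hypothetical stand-in for a DataFrame-like object (not pandas)."""

    def __init__(self, values):
        self.values = list(values)

    def drop_none_returning_self(self, inplace=False):
        # Convention described above: always return the modified object,
        # whether or not inplace=True.
        target = self if inplace else Box(self.values)
        target.values = [v for v in target.values if v is not None]
        return target

    def drop_none_returning_none(self, inplace=False):
        # Convention proposed in this issue: inplace=True mutates self
        # and returns None.
        if inplace:
            self.values = [v for v in self.values if v is not None]
            return None
        return Box([v for v in self.values if v is not None])


a = Box([1, None, 2])
same = a.drop_none_returning_self(inplace=True)    # same is a

b = Box([1, None, 2])
nothing = b.drop_none_returning_none(inplace=True)  # nothing is None
```

Under the first convention the caller always gets a reference back; under the second, the return value signals that the call was made purely for its side effect.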

jseabold

jseabold commented on Sep 16, 2012

@jseabold
Contributor, Author

Sure, but this is different from the in-place operations in Python and NumPy. When I was first using pandas, I was really thrown by the default of making copies everywhere and returning an object, especially when Python and NumPy led me to expect in-place operation (e.g., sort, assignment to a view). I was often bitten by not catching this (kind of like how mpl returns things that you don't really need). Then I discovered the inplace keyword. Great, because I almost always want to do things in-place and avoid all the extra typing of assignment, though I now have to use the keyword everywhere. It just seems unnecessary to return the object, since I explicitly asked for inplace and I already have a reference to it. It's just noise at the interpreter when working interactively.
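For reference, the Python behaviour this comment compares against can be shown with built-in lists alone: `list.sort()` mutates the list and returns None, while `sorted()` leaves its input untouched and returns a new list. The variable names are just for illustration.

```python
values = [3, 1, 2]

# In-place: the list itself is reordered and the call returns None,
# so nothing is echoed at the interactive prompt.
result = values.sort()

# Copying: the input is left alone and a new sorted list is returned.
original = [3, 1, 2]
new_values = sorted(original)
```

Here `result` is None and `values` has been reordered, whereas `original` keeps its order and `new_values` holds the sorted copy.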

jseabold

jseabold commented on Oct 18, 2012

@jseabold
Contributor, Author

Just as a follow-up, when writing notebooks I have to put semi-colons after every line where I do in-place operations so it doesn't barf the returned self.

michaelaye

michaelaye commented on Dec 3, 2012

@michaelaye
Contributor

Wes, your argument confuses me. Do you consider the inplace option a special case or not? Also, I just learned that ipython keeps unassigned memory objects alive for the history (the _ thingie). Is this true for notebooks as well and could this be an argument for really not returning an object when doing things inplace?

wesm

wesm commented on Dec 3, 2012

@wesm
Member

I think the proposal on the table is to always return None when using inplace=True. Moving this to 0.10

ghost

ghost commented on Dec 3, 2012

@ghost

Note that not returning self breaks the fluent interface: a.opA().opB().opC()
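A minimal sketch of the concern, assuming the proposed "return None when inplace=True" convention. `Pipeline`, `op_a`, and `op_b` are hypothetical names used only for this illustration.

```python
class Pipeline:
    """Hypothetical object whose in-place method returns None."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def op_a(self):
        # Non-mutating: returns a new object, so it can be chained.
        return Pipeline(self.steps + ["a"])

    def op_b(self, inplace=False):
        if inplace:
            self.steps.append("b")
            return None  # nothing to chain on
        return Pipeline(self.steps + ["b"])


p = Pipeline()
try:
    p.op_b(inplace=True).op_a()  # None has no attribute op_a
    chained = True
except AttributeError:
    chained = False

# Without inplace, the fluent style still works.
fluent = Pipeline().op_a().op_b()
```

The in-place call still takes effect on `p`; only the attempt to keep chaining from its return value fails.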

jseabold

jseabold commented on Dec 3, 2012

@jseabold
Contributor, Author

Well, inplace is optional, so the easy solution is don't use inplace if you want to in turn do an operation on the returned object.

wesm

wesm commented on Dec 3, 2012

@wesm
Member

Another place where Python's eager evaluation can be a weakness.

ghost

ghost commented on Dec 3, 2012

@ghost

Those are two orthogonal considerations. One (arguably) should not force the other.

jseabold

jseabold commented on Dec 3, 2012

@jseabold
Contributor, Author

Fair enough, but pandas is still an outlier in this respect compared to many of the methods in numpy and python itself. Admittedly, my argument for a (default) inplace not returning self is because I want to save myself typing and improve readability of output at the interpreter for teaching, presenting, or demonstrating. I don't often have serious concerns about memory use and readability of scripts.

What's the alternative? Having options where inplace can be 'true', 'false', or 'return'?

wesm

wesm commented on Dec 3, 2012

@wesm
Member

Or you could provide some kind of chainable, inplace interface. I'm thinking a lot lately about building a DSL layer around pandas so you could do things like:

frame do {
    .dropna axis=1
    ab_diff = a - b       
} group by key1 key2  {
    max(ab_diff)
    std(a)
}

and have that be as fast and memory-efficient as possible. And then you could easily chain "in place operations" and get what you expect

michaelaye

michaelaye commented on Dec 3, 2012

@michaelaye
Contributor

I fail to see the 'orthogonality' (maybe because 'orthogonal' is linguistically overrated, IMHO). Your claim that these design questions are independent (i.e. 'orthogonal') supports the use of something like obj.opA().opB(inplace=True).opC(). I don't even want to start thinking about what I just did there and which of all the objects flying around has what content now. The cleanest interface for me is: when I do inplace, it affects my original object; if not, it is safe.
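To make the question of "which object has what content" concrete, here is a toy model of the chained call quoted above under the "inplace returns self" convention. `Obj` and its methods are hypothetical, and each operation simply records its name in a log.

```python
class Obj:
    """Hypothetical object that logs which operations touched it."""

    def __init__(self, log=()):
        self.log = list(log)

    def op_a(self):
        # Non-mutating: returns a copy with "A" applied.
        return Obj(self.log + ["A"])

    def op_b(self, inplace=False):
        # "Returns self" convention: mutate and return the same object
        # when inplace=True, otherwise work on a copy.
        target = self if inplace else Obj(self.log)
        target.log.append("B")
        return target

    def op_c(self):
        return Obj(self.log + ["C"])


obj = Obj()
result = obj.op_a().op_b(inplace=True).op_c()
```

In this sketch the in-place step mutates the temporary returned by `op_a()`, not `obj`, so `obj` is left untouched while `result` carries all three operations.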

ghost

ghost commented on Dec 3, 2012

@ghost
  • Actually, the result returned from the code you quoted would contain the expected result of those operations.
  • Since you didn't use inplace in the first operation, all the interim results should be GC'd anyway.
  • Actually, this could be a useful idiom:

        obj.expensive_op(inplace=True).ViewOp1()
        obj.ViewOp2()

    That is not that bad, IMHO.
  • IPython's history behaviour is a tooling issue, and can probably be disabled when needed.
  • It seems to me that the reason you don't want to think about what you did there is mostly because it's clear to you that you might be misusing the API.

52 remaining items


        Participants

        @michaelaye @jseabold @wesm @njsmith @changhiskhan

          Inplace returns self? · Issue #1893 · pandas-dev/pandas