Closed
Description
A common issue [citation needed] comes up in groupby.apply when a user passes a function that modifies its argument in-place. This is because we call the function on the first group first to see if we can use a fastpath, and then again when we apply it to all groups.
We could add a kwarg to specify if a function may mutate its data, e.g. gb.apply(myfunc, may_mutate=True)
in which case we could avoid re-calling the function (probably means no fast-path). With may_mutate=False
we could use a context in which we set the data's flags to be immutable before calling anything on it.
Metadata
Metadata
Assignees
Type
Projects
Milestone
Relationships
Development
No branches or pull requests
Activity
jreback commentedon Feb 6, 2020
dask has the keyword pure for this purpose
we could adopt it
cc @TomAugspurger
cc @mrocklin
mrocklin commentedon Feb 6, 2020
In hindsight the term
pure
was a bad choice for us. It means lots of things, not all of which you may intend to communicate here.mroeschke commentedon Jun 28, 2020
Looks like this is a duplicate of #12653