
Functions applied on .expanding() receive ndarrays rather than pandas objects #12950

Closed
Description

@max-sixty (Contributor)

Functions passed to .apply(...) on window-based groupbys receive an ndarray rather than a pandas object.
Is this intentional? Is it done for performance?

It means that functions written for a Series can't necessarily be reused on groupbys, which limits abstraction.

In [7]: pd.Series(range(10)).expanding().apply(lambda x: x.pow(2).sum())
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-970afaf538ff> in <module>()
----> 1 pd.Series(range(10)).expanding().apply(lambda x: x.pow(2).sum())

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/window.py in apply(self, func, args, kwargs)
    929     @Appender(_shared_docs['apply'])
    930     def apply(self, func, args=(), kwargs={}):
--> 931         return super(Expanding, self).apply(func, args=args, kwargs=kwargs)
    932 
    933     @Substitution(name='expanding')

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/window.py in apply(self, func, args, kwargs)
    547                                       kwargs)
    548 
--> 549         return self._apply(f, center=False)
    550 
    551     def sum(self, **kwargs):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/window.py in _apply(self, func, window, center, check_minp, how, **kwargs)
    487                 result = np.apply_along_axis(calc, self.axis, values)
    488             else:
--> 489                 result = calc(values)
    490 
    491             if center:

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/window.py in calc(x)
    482 
    483                 def calc(x):
--> 484                     return func(x, window, min_periods=self.min_periods)
    485 
    486             if values.ndim > 1:

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/window.py in f(arg, window, min_periods)
    545             minp = _use_window(min_periods, window)
    546             return algos.roll_generic(arg, window, minp, offset, func, args,
--> 547                                       kwargs)
    548 
    549         return self._apply(f, center=False)

pandas/algos.pyx in pandas.algos.roll_generic (pandas/algos.c:40613)()

<ipython-input-7-970afaf538ff> in <lambda>(x)
----> 1 pd.Series(range(10)).expanding().apply(lambda x: x.pow(2).sum())

AttributeError: 'numpy.ndarray' object has no attribute 'pow'

Expected output (ideally without needing the pd.Series coercion):
In [9]: pd.Series(range(10)).expanding().apply(lambda x: pd.Series(x).pow(2).sum())
Out[9]: 
0      0.0
1      1.0
2      5.0
3     14.0
4     30.0
5     55.0
6     91.0
7    140.0
8    204.0
9    285.0
dtype: float64

Pandas 0.18
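
A possible stopgap, sketched here under assumptions rather than taken from pandas itself: wrap the Series-based function so each window is coerced before the call. The names `as_series` and `sum_of_squares` are hypothetical and exist only for this illustration.

```python
import pandas as pd


def as_series(func):
    """Hypothetical helper: coerce each window to a Series before calling func."""
    def wrapper(window, *args, **kwargs):
        # pd.Series(...) accepts an ndarray and an existing Series alike,
        # so the wrapper works whichever type the window arrives as.
        return func(pd.Series(window), *args, **kwargs)
    return wrapper


def sum_of_squares(s):
    # Written against the Series API, as in the report above.
    return s.pow(2).sum()


result = pd.Series(range(10)).expanding().apply(as_series(sum_of_squares))
print(result)
```

This keeps the Series-based function reusable, at the cost of building a Series for every window, which may be slow on long inputs.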

Activity

jreback (Contributor) commented on Apr 21, 2016

dupe of #5071 (and noted in the master issue). This is not hard to fix now that we are in the new structure.
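
For readers on later releases: window `.apply` gained a `raw` keyword (in pandas 0.23, as far as I know), which selects what the function receives. The sketch below assumes such a version is installed; it does not work on the 0.18 release this report was filed against.

```python
import pandas as pd

s = pd.Series(range(10))

# raw=False: each window is passed as a Series, so Series methods work.
via_series = s.expanding().apply(lambda x: x.pow(2).sum(), raw=False)

# raw=True: each window is passed as an ndarray, as described in this issue.
via_ndarray = s.expanding().apply(lambda x: (x ** 2).sum(), raw=True)

print(via_series.equals(via_ndarray))
```

Passing `raw=True` is usually the faster choice when the function only needs NumPy operations.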

Label added on Apr 21, 2016: Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
max-sixty (Contributor, Author) commented on Apr 21, 2016

Ah, mea culpa. I searched for "numpy" rather than "ndarray".

Labels: Duplicate Report (duplicate issue or pull request); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
