Closed
Labels: Needs Tests (unit test(s) needed to prevent regressions), Sparse (Sparse Data Type), good first issue
Description
Tried to simplify Block.quantile so that it only has to handle the 2D case, by having Series.quantile dispatch to the DataFrame implementation. This caused failures in test_quantile_sparse in pandas/tests/series/test_quantile.py:
>>> ser = pd.Series([0., None, 1., 2.], dtype='Sparse[float]')
>>> df = pd.DataFrame(ser)
>>> ser.quantile(0.5)
1.0
>>> ser.quantile([0.5])
0.5    1.0
dtype: float64
>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1530, in quantile
    result = _nanpercentile(values, qs * 100, axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1511, in quantile
    axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
datetime64[ns, tz] breaks in a slightly different way (presumably all ExtensionBlocks will fail):
>>> dti = pd.date_range('2016-01-01', periods=3, tz='US/Pacific')
>>> ser = pd.Series(dti)
>>> df = pd.DataFrame(ser)
>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
xref #24583
jbrockmendel commented on Jan 4, 2019
IntNA is also a catastrophe for quantile
TomAugspurger commented on Jan 4, 2019
Do you think this will need to be pushed down to the array for ExtensionArrays?
jbrockmendel commented on Jan 4, 2019
quantile itself? Probably not. For SparseArray a patch is now in place that avoids the immediate problem. For IntNA it looks like the problem is in _try_coerce_result not handling things correctly. For DatetimeTZBlock the problem is in _concat._concat_compat. It's eclectic.
I think we'll want to define _try_coerce_result (and possibly _try_coerce_args, not sure) in terms of _holder._from_sequence (and possibly _holder._unbox_scalar or something resembling _scalar_from_string).
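For readers unfamiliar with the idea, here is a rough sketch of what coercing a result back through _from_sequence could look like. It is not the pandas implementation: coerce_result_like is a hypothetical helper name, and falling back to a plain NumPy array on failure is an assumption made only for illustration. The one pandas piece relied on is the ExtensionArray constructor classmethod _from_sequence.

```python
import numpy as np
import pandas as pd


def coerce_result_like(result, original):
    """Hypothetical helper: rebuild `result` with the extension type of `original`.

    Falls back to a plain NumPy array when the values cannot be represented
    in the original dtype (an assumption of this sketch, not pandas behavior).
    """
    try:
        # _from_sequence is the ExtensionArray classmethod for building an
        # array of a given dtype from a sequence of scalars.
        return type(original)._from_sequence(result, dtype=original.dtype)
    except (TypeError, ValueError):
        return np.asarray(result)


# Nullable integer array ("IntNA" in the thread above).
original = pd.array([1, 2, None], dtype="Int64")
restored = coerce_result_like([1, 3], original)
```

Here restored should come back as a nullable integer array rather than a plain object or float array, which is the round trip the comment above is asking for.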
pglopezamaya commented on Apr 23, 2019
Any news on the "'SparseArray' object has no attribute 'reshape'" patch?
mroeschke commented on Apr 5, 2020
These cases look to work on master. Could use a test.
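A regression test along these lines might look like the sketch below. It only checks that the one-column DataFrame path agrees with Series.quantile for the two inputs from the description. The helper name frame_matches_series is made up for illustration, and passing numeric_only=False explicitly is an assumption intended to keep the datetime column in the computation across pandas versions.

```python
import pandas as pd


def frame_matches_series(ser, q=0.5):
    """Return True when DataFrame.quantile agrees with Series.quantile."""
    df = pd.DataFrame(ser)
    # numeric_only=False so that datetime columns are not dropped
    # before the quantile is computed.
    frame_result = df.quantile(q, numeric_only=False)
    return bool(frame_result.iloc[0] == ser.quantile(q))


# The two inputs from the issue description.
sparse = pd.Series([0.0, None, 1.0, 2.0], dtype="Sparse[float]")
tz_aware = pd.Series(pd.date_range("2016-01-01", periods=3, tz="US/Pacific"))

sparse_ok = frame_matches_series(sparse)
tz_aware_ok = frame_matches_series(tz_aware)
```

On a pandas version where the original failures still occur, the calls raise instead of returning, so the same helper doubles as a reproduction.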