
API/REGR: construction of Series with scalar-like / len-1 lists #20391

@jorisvandenbossche (Member)

In geopandas, some tests started failing with pandas master:

In [8]: from geopandas import GeoSeries
   ...: from shapely.geometry import Point

In [9]: p = Point(1, 2)

In [10]: GeoSeries(p, index=['a', 'b', 'c', 'd'])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-9a0cacdb2179> in <module>()
----> 1 GeoSeries(p, index=['a', 'b', 'c', 'd'])

/home/joris/scipy/geopandas/geopandas/geoseries.py in __new__(cls, data, index, crs, **kwargs)
     96                 name = kwargs.get('name', None)
     97             else:
---> 98                 s = pd.Series(data, index=index, **kwargs)
     99                 # prevent trying to convert non-geometry objects
    100                 if s.dtype != object and not s.empty:

/home/joris/scipy/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    253                             'Length of passed values is {val}, '
    254                             'index implies {ind}'
--> 255                             .format(val=len(data), ind=len(index)))
    256                 except TypeError:
    257                     pass

ValueError: Length of passed values is 1, index implies 4

Previously, this replicated the single point multiple times, just as pd.Series(1, index=['a', 'b', 'c', 'd']) gives a Series with four 1's.

This is related to #19714, which removed the broadcasting of length-1 lists in the Series constructor (pd.Series([1], index=['a', 'b', 'c', 'd'])).

geopandas wrapped the geometry in a single-element list because geometries are convertible to arrays (and some are also iterable), and hence are not seen as a 'scalar' by pandas (added 4 years ago: geopandas/geopandas#70).
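As a small illustration of the difference described above, the following sketch contrasts a true scalar with a length-1 list. It assumes a pandas version that includes the change from #19714 (0.23.0 or later); the variable names are only for illustration, and the exact wording of the error message differs between versions.

```python
import pandas as pd

index = ["a", "b", "c", "d"]

# A true scalar is broadcast to every label in the index.
broadcast = pd.Series(1, index=index)
print(broadcast.tolist())

# A length-1 list is not treated as a scalar: its length has to match
# the index, so pandas raises instead of guessing (assumes pandas >= 0.23).
try:
    pd.Series([1], index=index)
except ValueError as exc:
    list_error = exc
    print("ValueError:", exc)
else:
    list_error = None
```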

It still works when you do not pass an index:

In [36]: GeoSeries(p)
Out[36]:
0    POINT (1 2)
dtype: object

Note there is also some inconsistency within pandas itself:

In [39]: pd.Series(p)
Out[39]: 
0    POINT (1 2)
dtype: object

In [40]: pd.Series(p, index=['a', 'b', 'c', 'd'])
...
ValueError: Wrong number of items passed 2, placement implies 4

(In the first case, when no index is specified, p is converted to [p] before it is passed to _sanitize_array, so it works. In the second case, _sanitize_array converts the point p to np.array([1, 2]), the array of its coordinates.)
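The array conversion can be reproduced without shapely. The sketch below uses a hypothetical stand-in class, FakePoint, that exposes its coordinates through __array__; it is not the shapely implementation, only an assumption-laden illustration of why such an object is not seen as a scalar, and of one way to keep it boxed as a single element.

```python
import numpy as np


class FakePoint:
    """Hypothetical stand-in for a geometry that is convertible to an array."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __array__(self, dtype=None, copy=None):
        # Expose the coordinates, the way an array-convertible geometry might.
        return np.array([self.x, self.y], dtype=dtype)


p = FakePoint(1, 2)

# Converting the object directly yields its coordinates, not a single element.
as_coords = np.asarray(p)
print(as_coords.shape)

# Assigning into a pre-allocated object array keeps the object itself.
boxed = np.empty(1, dtype=object)
boxed[0] = p
print(boxed.shape, boxed[0] is p)
```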

Activity

added this to the 0.23.0 milestone on Mar 17, 2018

jorisvandenbossche (Member, Author) commented on Mar 17, 2018

So the question is: do we want to keep the special case of length-1 lists being broadcast (that is, add that behaviour back)?
But I agree this is somewhat strange behaviour.

And if not, do we have other ways to ensure pandas regards something as a scalar?

jreback (Contributor) commented on Mar 17, 2018

No, this was disallowed because it's wrong: you can broadcast a scalar but not a list (when you specify an index). You have to raise here, as you don't know what's correct.

jorisvandenbossche (Member, Author) commented on Mar 17, 2018

Hmm, yeah, in geopandas I can of course easily create a list with the correct length if the index is specified, instead of always a list of 1 element.
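A minimal sketch of that workaround, assuming a hypothetical helper named wrap_geometry (this is not the actual geopandas code) and a plain string standing in for a geometry object:

```python
def wrap_geometry(geom, index=None):
    """Hypothetical helper: repeat one geometry once per index label.

    Without an index this returns a single-element list, matching the
    previous behaviour described in the thread.
    """
    n = 1 if index is None else len(index)
    # Every entry refers to the same object, which is fine as long as
    # the geometry is treated as immutable.
    return [geom] * n


p = "POINT (1 2)"  # placeholder for a real geometry object

print(wrap_geometry(p))
print(wrap_geometry(p, index=["a", "b", "c", "d"]))
```

The resulting list already has the length the index implies, so the Series constructor no longer needs to broadcast anything.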

jorisvandenbossche (Member, Author) commented on Mar 17, 2018

Given that we apparently relied on this behaviour on purpose in geopandas, I added a notice of this change to the API changes: #20392

jreback (Contributor) commented on Mar 17, 2018

jorisvandenbossche (Member, Author) commented on Mar 17, 2018

That's another issue (specifically about Categorical, and in the bug fixes section). At least, that was the actual bug being solved; what I raise here was a side effect of the fix.
