Constructing a df with readonly array of Periods fails #25403

Closed
@max-sixty

Description

@max-sixty (Contributor)

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: import numpy as np

In [6]: pa = pd.PeriodIndex([pd.Period('2019-01-01')]).to_numpy()


In [8]: pa.setflags(write=False)

In [9]: pd.DataFrame(dict(date=pa, x=[1]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-7a4f1ee8be64> in <module>()
----> 1 pd.DataFrame(dict(date=pa, x=[1]))

/usr/local/lib/python2.7/site-packages/pandas/core/frame.pyc in __init__(self, data, index, columns, dtype, copy)
    390                                  dtype=dtype, copy=copy)
    391         elif isinstance(data, dict):
--> 392             mgr = init_dict(data, index, columns, dtype=dtype)
    393         elif isinstance(data, ma.MaskedArray):
    394             import numpy.ma.mrecords as mrecords

/usr/local/lib/python2.7/site-packages/pandas/core/internals/construction.pyc in init_dict(data, index, columns, dtype)
    210         arrays = [data[k] for k in keys]
    211
--> 212     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    213
    214

/usr/local/lib/python2.7/site-packages/pandas/core/internals/construction.pyc in arrays_to_mgr(arrays, arr_names, index, columns, dtype)
     54
     55     # don't force copy because getting jammed in an ndarray anyway
---> 56     arrays = _homogenize(arrays, index, dtype)
     57
     58     # from BlockManager perspective

/usr/local/lib/python2.7/site-packages/pandas/core/internals/construction.pyc in _homogenize(data, index, dtype)
    275                 val = lib.fast_multiget(val, oindex.values, default=np.nan)
    276             val = sanitize_array(val, index, dtype=dtype, copy=False,
--> 277                                  raise_cast_failure=False)
    278
    279         homogenized.append(val)

/usr/local/lib/python2.7/site-packages/pandas/core/internals/construction.pyc in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    675         if inferred == 'period':
    676             try:
--> 677                 subarr = period_array(subarr)
    678             except IncompatibleFrequency:
    679                 pass

/usr/local/lib/python2.7/site-packages/pandas/core/arrays/period.pyc in period_array(data, freq, copy)
    786     data = ensure_object(data)
    787
--> 788     return PeriodArray._from_sequence(data, dtype=dtype)
    789
    790

/usr/local/lib/python2.7/site-packages/pandas/core/arrays/period.pyc in _from_sequence(cls, scalars, dtype, copy)
    197             periods = periods.copy()
    198
--> 199         freq = freq or libperiod.extract_freq(periods)
    200         ordinals = libperiod.extract_ordinals(periods, freq)
    201         return cls(ordinals, freq=freq)

pandas/_libs/tslibs/period.pyx in pandas._libs.tslibs.period.extract_freq()

/usr/local/lib/python2.7/site-packages/pandas/_libs/tslibs/period.so in View.MemoryView.memoryview_cwrapper()

/usr/local/lib/python2.7/site-packages/pandas/_libs/tslibs/period.so in View.MemoryView.memoryview.__cinit__()

ValueError: buffer source array is read-only

Problem description

I'm currently seeing this issue because xarray is constructing a read-only array of Periods in a .to_dataframe() call.

If pandas isn't expected to support these arrays, we could attempt to prevent this in xarray.
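A possible caller-side workaround, sketched here under assumptions: copy the array when it is not writeable before handing it to pandas. The helper name `ensure_writeable` is hypothetical and is not part of xarray or pandas.

```python
import pandas as pd


def ensure_writeable(arr):
    """Return arr unchanged if it is writeable, otherwise a writeable copy.

    Hypothetical helper for illustration only; not an xarray or pandas API.
    """
    return arr if arr.flags.writeable else arr.copy()


# Reproduce the read-only object array of Periods from the report.
periods = pd.PeriodIndex([pd.Period("2019-01-01")]).to_numpy()
periods.setflags(write=False)

# Copying yields a fresh array that owns its data and is writeable.
safe = ensure_writeable(periods)
df = pd.DataFrame(dict(date=safe, x=[1]))
```

The copy costs memory proportional to the array, so this is a stopgap rather than a substitute for pandas accepting read-only input.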

Expected Output

Output of pd.show_versions()

In [10]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.15.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.24.1
pytest: 4.2.0
pip: 19.0.3
setuptools: 40.6.3
Cython: 0.29.5
numpy: 1.16.1
scipy: 1.2.1
pyarrow: 0.12.0
xarray: 0.11.3
IPython: 5.8.0
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.9
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: None
lxml.etree: 4.3.1
bs4: 4.7.1
html5lib: None
sqlalchemy: 1.2.17
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: 0.9.0
pandas_datareader: 0.7.0
gcsfs: None

Activity

jreback (Contributor) commented on Feb 21, 2019

Sounds like the same read-only Cython issue that we had with take.
cc @jbrockmendel

max-sixty (Contributor, Author) commented on Feb 21, 2019

For reference, we're creating the readonly array with np.broadcast_to here: https://github.com/pydata/xarray/blob/master/xarray/core/variable.py#L1199
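To illustrate why `np.broadcast_to` leads to this situation, here is a small sketch with made-up sample values: the function returns a read-only view, and an explicit copy is writeable again.

```python
import numpy as np

base = np.array([1, 2, 3])

# broadcast_to returns a read-only view onto the original data.
view = np.broadcast_to(base, (2, 3))
print(view.flags.writeable)  # False

# Writing through the view is rejected by NumPy.
try:
    view[0, 0] = 99
    write_failed = False
except ValueError:
    write_failed = True

# An explicit copy owns its data and is writeable.
writable = view.copy()
writable[0, 0] = 99
```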

jbrockmendel (Member) commented on Feb 22, 2019

Easy fix: tslibs.period.extract_freq needs to be changed from def extract_freq(object[:] values): to def extract_freq(ndarray[object] values):. If cython ever supports const object[:] that will be the long-term solution.
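A regression check for a fix along these lines could look roughly like the sketch below. The function name `check_readonly_period_frame` is made up for illustration and is not taken from the pandas test suite. On a build without the fix (such as the 0.24.1 install in the report), the `pd.DataFrame` call is expected to raise the `ValueError` shown in the traceback.

```python
import pandas as pd


def check_readonly_period_frame():
    """Build a DataFrame from a read-only object array of Periods.

    Hypothetical check for illustration; mirrors the example in the report.
    """
    periods = pd.PeriodIndex([pd.Period("2019-01-01")]).to_numpy()
    periods.setflags(write=False)

    # With the fix in place this should succeed instead of raising.
    df = pd.DataFrame(dict(date=periods, x=[1]))

    assert df.shape == (1, 2)
    assert df["date"].iloc[0] == pd.Period("2019-01-01")
    return df


result = check_readonly_period_frame()
```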

added the Index label (Related to the Index class or subclasses) on Feb 23, 2019
added this to the 0.24.2 milestone on Mar 4, 2019

jorisvandenbossche (Member) commented on Mar 4, 2019

If this is an easy fix, let's do it for 0.24.2, as this is a regression.

added a commit that references this issue on Mar 5, 2019
0dae2b3
added a commit that references this issue on Mar 7, 2019
b72e7ed
added a commit that references this issue on Mar 9, 2019
dcf7137
Labels: Bug; Index (Related to the Index class or subclasses); Regression (Functionality that used to work in a prior pandas version)

Participants: @jreback, @jorisvandenbossche, @max-sixty, @jbrockmendel, @gfyoung
