Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(3, 4), index=['A', 'B', 'C'])
df.to_pickle('out.zip')
#pd.read_pickle('out.zip')
Problem description
The exception below occurs. I do have write permissions in the working directory. The code was working with pandas 0.19.0.
No problems observed for bz2 and gzip compression (xz I haven't tested).
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/generic.py", line 1378, in to_pickle
df.to_pickle('tmp.zip')
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/pickle.py", line 27, in to_pickle
is_text=False)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/common.py", line 352, in _get_handle
zip_file = zipfile.ZipFile(path_or_buf)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 756, in __init__
self.fp = open(file, modeDict[mode])
IOError: [Errno 2] No such file or directory: 'out.zip'
Expected Output
A zip file that one can re-read with pandas.read_pickle().
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.2.7
Cython: 0.26
numpy: 1.14.0.dev0+029863e
scipy: 0.18.1
xarray: None
IPython: 5.4.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
Activity
Title changed from "DataFrame.to_pickle() fails for .zip format on MacOS and pandas 20.3" to "DataFrame.to_pickle() fails for .zip format on MacOS and pandas 0.20.3"
normanius commented on Oct 4, 2017
The problem is located in _get_handle() of module pandas.io.common, at the zipfile.ZipFile(path_or_buf) call shown in the traceback. With this code, the zip file is opened only for reading, and not for writing. The mode argument certainly should be used somewhere.
chris-b1 commented on Oct 4, 2017
Yep, problem does seem to be not passing the correct mode, PR to fix welcome!
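To illustrate the diagnosis above, here is a minimal standard-library sketch (no pandas involved). It assumes only that zipfile.ZipFile defaults to mode 'r' when no mode is given; the temporary directory and variable names are made up for the example.

```python
import os
import tempfile
import zipfile

# Work in a throwaway directory so no existing file is touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "out.zip")

# With no mode argument, ZipFile defaults to mode 'r' and tries to open an
# existing archive, which fails for a path that does not exist yet.
try:
    zipfile.ZipFile(path)
    read_error = None
except (IOError, OSError) as exc:
    read_error = exc

# Passing 'w' explicitly creates a new (empty) archive at that path.
with zipfile.ZipFile(path, "w") as archive:
    names_after_create = archive.namelist()

print(type(read_error).__name__, os.path.exists(path), names_after_create)
```

This only reproduces the open-for-reading failure in isolation; it does not show how pandas should thread its mode argument through _get_handle().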
masongallo commented on Oct 4, 2017
It looks like the code for zip was written only for reading? Why not use gzip to write a single zip file?
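As a rough sketch of the gzip idea, using only the standard library: gzip.open returns a real file-like object, so a pickler can write to it directly. Note that the result is a gzip stream rather than a .zip archive, so it would not satisfy a request for the zip format. The file name and payload are made up for the example.

```python
import gzip
import os
import pickle
import tempfile
import zipfile

path = os.path.join(tempfile.mkdtemp(), "out.pkl.gz")
payload = {"A": [1, 2, 3]}

# gzip.open returns a file-like object, so a pickler can write to it directly.
with gzip.open(path, "wb") as handle:
    pickle.dump(payload, handle)

# Reading works the same way, through a file-like handle.
with gzip.open(path, "rb") as handle:
    restored = pickle.load(handle)

# The file on disk is a gzip stream, not a zip archive.
is_zip = zipfile.is_zipfile(path)
```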
Update common.py
s4chin commented on Oct 12, 2017
Can I try this? I'm looking for a first issue as an entry point.
chris-b1 commented on Oct 12, 2017
Yes, go ahead!
s4chin commented on Oct 13, 2017
mode is 'wb' when writing to the zipfile. zipfile.ZipFile only accepts 'a', 'r', 'w' as modes, hence 'wb' needs to be converted to 'w'. After doing this, it gives me [output missing]
So I just took out the if ... elif ... else part and did f = zipfile.ZipFile(path_or_buf, 'w'), which results in [output missing]
Any pointers on how to move ahead? As @masongallo said, the code looks like it was meant only for reading.
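The mode conversion described in this comment could look roughly like the sketch below. The helper name to_zip_mode is hypothetical and not part of pandas or zipfile; the sketch only shows that ZipFile rejects 'wb' and accepts the stripped 'w'. (Newer Python 3 versions also accept 'x', which this helper deliberately ignores.)

```python
import os
import tempfile
import zipfile


def to_zip_mode(mode):
    """Hypothetical helper: map a file-style mode such as 'wb' to a ZipFile mode."""
    zip_mode = mode.replace("b", "").replace("t", "")
    if zip_mode not in ("r", "w", "a"):
        raise ValueError("unsupported mode for a zip archive: %r" % mode)
    return zip_mode


path = os.path.join(tempfile.mkdtemp(), "out.zip")

# ZipFile rejects the binary-flavoured mode outright.
try:
    zipfile.ZipFile(path, "wb")
    rejected = False
except (ValueError, RuntimeError):  # RuntimeError on Python 2.7
    rejected = True

# After stripping the 'b', the archive can be created and written to.
with zipfile.ZipFile(path, to_zip_mode("wb")) as archive:
    archive.writestr("placeholder.txt", b"hello")
```

Converting the mode only gets the archive open; the resulting ZipFile object still has no plain write() method of the kind an ordinary file handle offers, which is the follow-up problem raised in this thread.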
normanius commented on Oct 13, 2017
When I looked at it, I didn't find a straightforward way of doing it. The problem is that io.common._get_handle() needs to create an object with a file-like interface (read, write, open) to which you can later write strings/bytes. zipfile.ZipFile represents more a container for files than a container for strings, so not sure if it can be used like a normal file-handle.
Maybe one can construct something around ZipFile.writestr() that takes bytes instead of files to write into the zip file. This won't give you a file-handle or anything, but maybe you can tinker one using some functools or StringIO. But for this one needs to understand where the file-handle is used etc.
Alternatively follow up on @masongallo comment regarding gzip?
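One possible shape for the writestr() idea, as a sketch only: an in-memory buffer that behaves like a writable file and stores its contents as a single archive member when closed. The class name ZipWriteBuffer and the member name "out.pkl" are hypothetical, and this is not presented as what pandas does; it uses the standard library only, with a plain dict standing in for a DataFrame.

```python
import io
import os
import pickle
import tempfile
import zipfile


class ZipWriteBuffer(io.BytesIO):
    """Hypothetical wrapper: buffer bytes in memory, store them as one member on close()."""

    def __init__(self, path, member_name):
        super(ZipWriteBuffer, self).__init__()
        self._path = path
        self._member_name = member_name

    def close(self):
        if not self.closed:
            # writestr() takes the whole payload at once, so flush it here.
            with zipfile.ZipFile(self._path, "w", zipfile.ZIP_DEFLATED) as archive:
                archive.writestr(self._member_name, self.getvalue())
        super(ZipWriteBuffer, self).close()


path = os.path.join(tempfile.mkdtemp(), "out.zip")
payload = {"A": [1, 2, 3], "B": [4, 5, 6]}

# Any code that expects a writable binary file object can use the buffer.
handle = ZipWriteBuffer(path, "out.pkl")
pickle.dump(payload, handle)
handle.close()

# Read the single member back to confirm the round trip.
with zipfile.ZipFile(path) as archive:
    restored = pickle.loads(archive.read(archive.namelist()[0]))
```

The trade-off is that the whole payload is held in memory until close(), which may matter for large frames.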
Merge branch 'master' into pandas-dev#17778
minggli commented on Mar 17, 2018
Hi @jreback,
Will try to fix this issue if it hasn't been fixed since last conversation. Reverting.
Thanks,
Ming
to_pickle, to_json, to_csv #20394