Commit 030f613

committed Jul 2, 2013
Merge pull request #4092 from jtratner/refactor_string_special_methods
CLN: Refactor string special methods
2 parents a16f243 + a558314 commit 030f613

24 files changed: +201 -310 lines changed


‎doc/source/release.rst

Lines changed: 11 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -175,8 +175,18 @@ pandas 0.12
175175
``bs4`` + ``html5lib`` when lxml fails to parse. a list of parsers to try
176176
until success is also valid
177177
- more consistency in the to_datetime return types (give string/array of string inputs) (:issue:`3888`)
178+
- The internal ``pandas`` class hierarchy has changed (slightly). The
179+
previous ``PandasObject`` now is called ``PandasContainer`` and a new
180+
``PandasObject`` has become the baseclass for ``PandasContainer`` as well
181+
as ``Index``, ``Categorical``, ``GroupBy``, ``SparseList``, and
182+
``SparseArray`` (+ their base classes). Currently, ``PandasObject``
183+
provides string methods (from ``StringMixin``). (:issue:`4090`, :issue:`4092`)
184+
- New ``StringMixin`` that, given a ``__unicode__`` method, gets python 2 and
185+
python 3 compatible string methods (``__str__``, ``__bytes__``, and
186+
``__repr__``). Plus string safety throughout. Now employed in many places
187+
throughout the pandas library. (:issue:`4090`, :issue:`4092`)
178188

179-
**Experimental Feautres**
189+
**Experimental Features**
180190

181191
- Added experimental ``CustomBusinessDay`` class to support ``DateOffsets``
182192
with custom holiday calendars and custom weekmasks. (:issue:`2301`)

‎doc/source/v0.12.0.txt

Lines changed: 33 additions & 21 deletions
@@ -8,13 +8,13 @@ enhancements along with a large number of bug fixes.
88

99
Highlites include a consistent I/O API naming scheme, routines to read html,
1010
write multi-indexes to csv files, read & write STATA data files, read & write JSON format
11-
files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a
11+
files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a
1212
revamped ``replace`` routine that accepts regular expressions.
1313

1414
API changes
1515
~~~~~~~~~~~
1616

17-
- The I/O API is now much more consistent with a set of top level ``reader`` functions
17+
- The I/O API is now much more consistent with a set of top level ``reader`` functions
1818
accessed like ``pd.read_csv()`` that generally return a ``pandas`` object.
1919

2020
* ``read_csv``
@@ -38,7 +38,7 @@ API changes
3838
* ``to_clipboard``
3939

4040

41-
- Fix modulo and integer division on Series,DataFrames to act similary to ``float`` dtypes to return
41+
- Fix modulo and integer division on Series,DataFrames to act similary to ``float`` dtypes to return
4242
``np.nan`` or ``np.inf`` as appropriate (:issue:`3590`). This correct a numpy bug that treats ``integer``
4343
and ``float`` dtypes differently.
4444

@@ -50,15 +50,15 @@ API changes
5050
p / p
5151
p / 0
5252

53-
- Add ``squeeze`` keyword to ``groupby`` to allow reduction from
53+
- Add ``squeeze`` keyword to ``groupby`` to allow reduction from
5454
DataFrame -> Series if groups are unique. This is a Regression from 0.10.1.
55-
We are reverting back to the prior behavior. This means groupby will return the
56-
same shaped objects whether the groups are unique or not. Revert this issue (:issue:`2893`)
55+
We are reverting back to the prior behavior. This means groupby will return the
56+
same shaped objects whether the groups are unique or not. Revert this issue (:issue:`2893`)
5757
with (:issue:`3596`).
5858

5959
.. ipython:: python
6060

61-
df2 = DataFrame([{"val1": 1, "val2" : 20}, {"val1":1, "val2": 19},
61+
df2 = DataFrame([{"val1": 1, "val2" : 20}, {"val1":1, "val2": 19},
6262
{"val1":1, "val2": 27}, {"val1":1, "val2": 12}])
6363
def func(dataf):
6464
return dataf["val2"] - dataf["val2"].mean()
@@ -96,9 +96,9 @@ API changes
9696
and thus you should cast to an appropriate numeric dtype if you need to
9797
plot something.
9898

99-
- Add ``colormap`` keyword to DataFrame plotting methods. Accepts either a
100-
matplotlib colormap object (ie, matplotlib.cm.jet) or a string name of such
101-
an object (ie, 'jet'). The colormap is sampled to select the color for each
99+
- Add ``colormap`` keyword to DataFrame plotting methods. Accepts either a
100+
matplotlib colormap object (ie, matplotlib.cm.jet) or a string name of such
101+
an object (ie, 'jet'). The colormap is sampled to select the color for each
102102
column. Please see :ref:`visualization.colormaps` for more information.
103103
(:issue:`3860`)
104104

@@ -159,6 +159,18 @@ API changes
159159
``bs4`` + ``html5lib`` when lxml fails to parse. a list of parsers to try
160160
until success is also valid
161161

162+
- The internal ``pandas`` class hierarchy has changed (slightly). The
163+
previous ``PandasObject`` now is called ``PandasContainer`` and a new
164+
``PandasObject`` has become the baseclass for ``PandasContainer`` as well
165+
as ``Index``, ``Categorical``, ``GroupBy``, ``SparseList``, and
166+
``SparseArray`` (+ their base classes). Currently, ``PandasObject``
167+
provides string methods (from ``StringMixin``). (:issue:`4090`, :issue:`4092`)
168+
169+
- New ``StringMixin`` that, given a ``__unicode__`` method, gets python 2 and
170+
python 3 compatible string methods (``__str__``, ``__bytes__``, and
171+
``__repr__``). Plus string safety throughout. Now employed in many places
172+
throughout the pandas library. (:issue:`4090`, :issue:`4092`)
173+
162174
I/O Enhancements
163175
~~~~~~~~~~~~~~~~
164176

@@ -184,7 +196,7 @@ I/O Enhancements
184196

185197
.. warning::
186198

187-
You may have to install an older version of BeautifulSoup4,
199+
You may have to install an older version of BeautifulSoup4,
188200
:ref:`See the installation docs<install.optional_dependencies>`
189201

190202
- Added module for reading and writing Stata files: ``pandas.io.stata`` (:issue:`1512`)
@@ -203,15 +215,15 @@ I/O Enhancements
203215
- The option, ``tupleize_cols`` can now be specified in both ``to_csv`` and
204216
``read_csv``, to provide compatiblity for the pre 0.12 behavior of
205217
writing and reading multi-index columns via a list of tuples. The default in
206-
0.12 is to write lists of tuples and *not* interpret list of tuples as a
207-
multi-index column.
218+
0.12 is to write lists of tuples and *not* interpret list of tuples as a
219+
multi-index column.
208220

209221
Note: The default behavior in 0.12 remains unchanged, but starting with 0.13,
210-
the default *to* write and read multi-index columns will be in the new
222+
the default *to* write and read multi-index columns will be in the new
211223
format. (:issue:`3571`, :issue:`1651`, :issue:`3141`)
212224

213225
- If an ``index_col`` is not specified (e.g. you don't have an index, or wrote it
214-
with ``df.to_csv(..., index=False``), then any ``names`` on the columns index will
226+
with ``df.to_csv(..., index=False``), then any ``names`` on the columns index will
215227
be *lost*.
216228

217229
.. ipython:: python
@@ -296,8 +308,8 @@ Other Enhancements
296308
pd.get_option('a.b')
297309
pd.get_option('b.c')
298310

299-
- The ``filter`` method for group objects returns a subset of the original
300-
object. Suppose we want to take only elements that belong to groups with a
311+
- The ``filter`` method for group objects returns a subset of the original
312+
object. Suppose we want to take only elements that belong to groups with a
301313
group sum greater than 2.
302314

303315
.. ipython:: python
@@ -317,7 +329,7 @@ Other Enhancements
317329
dff.groupby('B').filter(lambda x: len(x) > 2)
318330

319331
Alternatively, instead of dropping the offending groups, we can return a
320-
like-indexed objects where the groups that do not pass the filter are
332+
like-indexed objects where the groups that do not pass the filter are
321333
filled with NaNs.
322334

323335
.. ipython:: python
@@ -333,9 +345,9 @@ Experimental Features
333345

334346
- Added experimental ``CustomBusinessDay`` class to support ``DateOffsets``
335347
with custom holiday calendars and custom weekmasks. (:issue:`2301`)
336-
348+
337349
.. note::
338-
350+
339351
This uses the ``numpy.busdaycalendar`` API introduced in Numpy 1.7 and
340352
therefore requires Numpy 1.7.0 or newer.
341353

@@ -416,7 +428,7 @@ Bug Fixes
416428
- Extend ``reindex`` to correctly deal with non-unique indices (:issue:`3679`)
417429
- ``DataFrame.itertuples()`` now works with frames with duplicate column
418430
names (:issue:`3873`)
419-
- Bug in non-unique indexing via ``iloc`` (:issue:`4017`); added ``takeable`` argument to
431+
- Bug in non-unique indexing via ``iloc`` (:issue:`4017`); added ``takeable`` argument to
420432
``reindex`` for location-based taking
421433

422434
- ``DataFrame.from_records`` did not accept empty recarrays (:issue:`3682`)

‎pandas/core/base.py

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
1+
"""
2+
Base class(es) for all pandas objects.
3+
"""
4+
from pandas.util import py3compat
5+
6+
class StringMixin(object):
7+
"""implements string methods so long as object defines a `__unicode__` method.
8+
Handles Python2/3 compatibility transparently."""
9+
# side note - this could be made into a metaclass if more than one object nees
10+
def __str__(self):
11+
"""
12+
Return a string representation for a particular object.
13+
14+
Invoked by str(obj) in both py2/py3.
15+
Yields Bytestring in Py2, Unicode String in py3.
16+
"""
17+
18+
if py3compat.PY3:
19+
return self.__unicode__()
20+
return self.__bytes__()
21+
22+
def __bytes__(self):
23+
"""
24+
Return a string representation for a particular object.
25+
26+
Invoked by bytes(obj) in py3 only.
27+
Yields a bytestring in both py2/py3.
28+
"""
29+
from pandas.core.config import get_option
30+
31+
encoding = get_option("display.encoding")
32+
return self.__unicode__().encode(encoding, 'replace')
33+
34+
def __repr__(self):
35+
"""
36+
Return a string representation for a particular object.
37+
38+
Yields Bytestring in Py2, Unicode String in py3.
39+
"""
40+
return str(self)
41+
42+
class PandasObject(StringMixin):
43+
"""baseclass for various pandas objects"""
44+
45+
@property
46+
def _constructor(self):
47+
"""class constructor (for this class it's just `__class__`"""
48+
return self.__class__
49+
50+
def __unicode__(self):
51+
"""
52+
Return a string representation for a particular object.
53+
54+
Invoked by unicode(obj) in py2 only. Yields a Unicode String in both
55+
py2/py3.
56+
"""
57+
# Should be overwritten by base classes
58+
return object.__repr__(self)
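
Note on the pattern: the mixin above only requires a subclass to supply `__unicode__`. The following is a minimal standalone sketch of that idea, not the pandas implementation. `PY3` and `ENCODING` are stand-ins assumed here for `py3compat.PY3` and `get_option("display.encoding")`, and `Point` is a hypothetical example class.

```python
import sys

PY3 = sys.version_info[0] >= 3   # assumed stand-in for pandas.util.py3compat.PY3
ENCODING = "utf-8"               # assumed stand-in for get_option("display.encoding")


class StringMixin(object):
    """Derive __str__, __bytes__ and __repr__ from a single __unicode__."""

    def __str__(self):
        # Text on Python 3, encoded bytes on Python 2.
        if PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        # Characters that cannot be encoded are replaced instead of raising.
        return self.__unicode__().encode(ENCODING, "replace")

    def __repr__(self):
        return str(self)


class Point(StringMixin):  # hypothetical example class
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __unicode__(self):
        return u"Point(%s, %s)" % (self.x, self.y)


p = Point(1, 2)
print(repr(p))
print(bytes(p))
```

Under these assumptions, a class needs one method to get consistent `str()`, `repr()` and `bytes()` behaviour, which appears to be what lets the commit delete the duplicated `__str__`/`__bytes__`/`__repr__` triples from the other modules.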

‎pandas/core/categorical.py

Lines changed: 6 additions & 6 deletions
@@ -3,6 +3,7 @@
33
import numpy as np
44

55
from pandas.core.algorithms import factorize
6+
from pandas.core.base import PandasObject
67
from pandas.core.index import Index
78
import pandas.core.common as com
89
from pandas.core.frame import DataFrame
@@ -25,8 +26,7 @@ def f(self, other):
2526

2627
return f
2728

28-
29-
class Categorical(object):
29+
class Categorical(PandasObject):
3030
"""
3131
Represents a categorical variable in classic R / S-plus fashion
3232
@@ -134,9 +134,9 @@ def __array__(self, dtype=None):
134134
def __len__(self):
135135
return len(self.labels)
136136

137-
def __repr__(self):
137+
def __unicode__(self):
138138
temp = 'Categorical: %s\n%s\n%s'
139-
values = np.asarray(self)
139+
values = com.pprint_thing(np.asarray(self))
140140
levheader = 'Levels (%d): ' % len(self.levels)
141141
levstring = np.array_repr(self.levels,
142142
max_line_width=60)
@@ -145,9 +145,9 @@ def __repr__(self):
145145
lines = levstring.split('\n')
146146
levstring = '\n'.join([lines[0]] +
147147
[indent + x.lstrip() for x in lines[1:]])
148+
name = '' if self.name is None else self.name
149+
return temp % (name, values, levheader + levstring)
148150

149-
return temp % ('' if self.name is None else self.name,
150-
repr(values), levheader + levstring)
151151

152152
def __getitem__(self, key):
153153
if isinstance(key, (int, np.integer)):

‎pandas/core/common.py

Lines changed: 4 additions & 4 deletions
@@ -64,10 +64,10 @@ def _isnull_new(obj):
6464
if lib.isscalar(obj):
6565
return lib.checknull(obj)
6666

67-
from pandas.core.generic import PandasObject
67+
from pandas.core.generic import PandasContainer
6868
if isinstance(obj, np.ndarray):
6969
return _isnull_ndarraylike(obj)
70-
elif isinstance(obj, PandasObject):
70+
elif isinstance(obj, PandasContainer):
7171
# TODO: optimize for DataFrame, etc.
7272
return obj.apply(isnull)
7373
elif isinstance(obj, list) or hasattr(obj, '__array__'):
@@ -91,10 +91,10 @@ def _isnull_old(obj):
9191
if lib.isscalar(obj):
9292
return lib.checknull_old(obj)
9393

94-
from pandas.core.generic import PandasObject
94+
from pandas.core.generic import PandasContainer
9595
if isinstance(obj, np.ndarray):
9696
return _isnull_ndarraylike_old(obj)
97-
elif isinstance(obj, PandasObject):
97+
elif isinstance(obj, PandasContainer):
9898
# TODO: optimize for DataFrame, etc.
9999
return obj.apply(_isnull_old)
100100
elif isinstance(obj, list) or hasattr(obj, '__array__'):

‎pandas/core/frame.py

Lines changed: 0 additions & 34 deletions
@@ -584,10 +584,6 @@ def _verbose_info(self, value):
584584
def axes(self):
585585
return [self.index, self.columns]
586586

587-
@property
588-
def _constructor(self):
589-
return self.__class__
590-
591587
@property
592588
def shape(self):
593589
return (len(self.index), len(self.columns))
@@ -653,28 +649,6 @@ def _repr_fits_horizontal_(self,ignore_width=False):
653649

654650
return repr_width < width
655651

656-
def __str__(self):
657-
"""
658-
Return a string representation for a particular DataFrame
659-
660-
Invoked by str(df) in both py2/py3.
661-
Yields Bytestring in Py2, Unicode String in py3.
662-
"""
663-
664-
if py3compat.PY3:
665-
return self.__unicode__()
666-
return self.__bytes__()
667-
668-
def __bytes__(self):
669-
"""
670-
Return a string representation for a particular DataFrame
671-
672-
Invoked by bytes(df) in py3 only.
673-
Yields a bytestring in both py2/py3.
674-
"""
675-
encoding = com.get_option("display.encoding")
676-
return self.__unicode__().encode(encoding, 'replace')
677-
678652
def __unicode__(self):
679653
"""
680654
Return a string representation for a particular DataFrame
@@ -714,14 +688,6 @@ def __unicode__(self):
714688

715689
return value
716690

717-
def __repr__(self):
718-
"""
719-
Return a string representation for a particular DataFrame
720-
721-
Yields Bytestring in Py2, Unicode String in py3.
722-
"""
723-
return str(self)
724-
725691
def _repr_html_(self):
726692
"""
727693
Return a html representation for a particular DataFrame.

‎pandas/core/generic.py

Lines changed: 12 additions & 11 deletions
@@ -1,20 +1,21 @@
11
# pylint: disable=W0231,E1101
22

33
import numpy as np
4+
import pandas.lib as lib
5+
from pandas.core.base import PandasObject
46

57
from pandas.core.index import MultiIndex
68
import pandas.core.indexing as indexing
79
from pandas.core.indexing import _maybe_convert_indices
810
from pandas.tseries.index import DatetimeIndex
911
import pandas.core.common as com
10-
import pandas.lib as lib
1112

1213

1314
class PandasError(Exception):
1415
pass
1516

1617

17-
class PandasObject(object):
18+
class PandasContainer(PandasObject):
1819

1920
_AXIS_NUMBERS = {
2021
'index': 0,
@@ -52,6 +53,12 @@ def __hash__(self):
5253
raise TypeError('{0!r} objects are mutable, thus they cannot be'
5354
' hashed'.format(self.__class__.__name__))
5455

56+
def __unicode__(self):
57+
# unicode representation based upon iterating over self
58+
# (since, by definition, `PandasContainers` are iterable)
59+
prepr = '[%s]' % ','.join(map(com.pprint_thing, self))
60+
return '%s(%s)' % (self.__class__.__name__, prepr)
61+
5562

5663
#----------------------------------------------------------------------
5764
# Axis name business
@@ -578,9 +585,10 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
578585

579586
# install the indexerse
580587
for _name, _indexer in indexing.get_indexers_list():
581-
PandasObject._create_indexer(_name,_indexer)
588+
PandasContainer._create_indexer(_name,_indexer)
589+
582590

583-
class NDFrame(PandasObject):
591+
class NDFrame(PandasContainer):
584592
"""
585593
N-dimensional analogue of DataFrame. Store multi-dimensional in a
586594
size-mutable, labeled data structure
@@ -625,17 +633,10 @@ def astype(self, dtype, copy = True, raise_on_error = True):
625633
mgr = self._data.astype(dtype, copy = copy, raise_on_error = raise_on_error)
626634
return self._constructor(mgr)
627635

628-
@property
629-
def _constructor(self):
630-
return NDFrame
631-
632636
@property
633637
def axes(self):
634638
return self._data.axes
635639

636-
def __repr__(self):
637-
return 'NDFrame'
638-
639640
@property
640641
def values(self):
641642
return self._data.as_matrix()
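
The `__unicode__` fallback added to `PandasContainer` in this file builds its text by iterating over the object. A rough sketch of that approach follows. It assumes a simplified `pprint_thing` that just calls `str` (the real helper in `pandas.core.common` does more), and `Container` and `Labels` are hypothetical classes used only for illustration.

```python
def pprint_thing(obj):
    # Simplified stand-in (assumption) for pandas.core.common.pprint_thing.
    return str(obj)


class Container(object):  # hypothetical stand-in for PandasContainer
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __unicode__(self):
        # Join the text form of each element, then wrap it in the class name.
        prepr = '[%s]' % ','.join(map(pprint_thing, self))
        return '%s(%s)' % (self.__class__.__name__, prepr)


class Labels(Container):  # subclasses inherit the generic representation
    pass


text = Labels(['a', 'b', 3]).__unicode__()
print(text)
```

Because the class name comes from `self.__class__.__name__`, a subclass gets a sensible default without overriding anything.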

‎pandas/core/groupby.py

Lines changed: 6 additions & 1 deletion
@@ -2,6 +2,7 @@
22
import types
33
import numpy as np
44

5+
from pandas.core.base import PandasObject
56
from pandas.core.categorical import Categorical
67
from pandas.core.frame import DataFrame
78
from pandas.core.generic import NDFrame
@@ -100,7 +101,7 @@ def _last(x):
100101
return _last(x)
101102

102103

103-
class GroupBy(object):
104+
class GroupBy(PandasObject):
104105
"""
105106
Class for grouping and aggregating relational data. See aggregate,
106107
transform, and apply functions on this object.
@@ -201,6 +202,10 @@ def __init__(self, obj, keys=None, axis=0, level=None,
201202
def __len__(self):
202203
return len(self.indices)
203204

205+
def __unicode__(self):
206+
# TODO: Better unicode/repr for GroupBy object
207+
return object.__repr__(self)
208+
204209
@property
205210
def groups(self):
206211
return self.grouper.groups

‎pandas/core/index.py

Lines changed: 2 additions & 69 deletions
@@ -9,6 +9,7 @@
99
import pandas.algos as _algos
1010
import pandas.index as _index
1111
from pandas.lib import Timestamp
12+
from pandas.core.base import PandasObject
1213

1314
from pandas.util.decorators import cache_readonly
1415
from pandas.core.common import isnull
@@ -47,7 +48,7 @@ def _shouldbe_timestamp(obj):
4748
or tslib.is_timestamp_array(obj))
4849

4950

50-
class Index(np.ndarray):
51+
class Index(PandasObject, np.ndarray):
5152
"""
5253
Immutable ndarray implementing an ordered, sliceable set. The basic object
5354
storing axis labels for all pandas objects
@@ -142,28 +143,6 @@ def __array_finalize__(self, obj):
142143
def _shallow_copy(self):
143144
return self.view()
144145

145-
def __str__(self):
146-
"""
147-
Return a string representation for a particular Index
148-
149-
Invoked by str(df) in both py2/py3.
150-
Yields Bytestring in Py2, Unicode String in py3.
151-
"""
152-
153-
if py3compat.PY3:
154-
return self.__unicode__()
155-
return self.__bytes__()
156-
157-
def __bytes__(self):
158-
"""
159-
Return a string representation for a particular Index
160-
161-
Invoked by bytes(df) in py3 only.
162-
Yields a bytestring in both py2/py3.
163-
"""
164-
encoding = com.get_option("display.encoding")
165-
return self.__unicode__().encode(encoding, 'replace')
166-
167146
def __unicode__(self):
168147
"""
169148
Return a string representation for a particular Index
@@ -173,14 +152,6 @@ def __unicode__(self):
173152
prepr = com.pprint_thing(self, escape_chars=('\t', '\r', '\n'),quote_strings=True)
174153
return '%s(%s, dtype=%s)' % (type(self).__name__, prepr, self.dtype)
175154

176-
def __repr__(self):
177-
"""
178-
Return a string representation for a particular Index
179-
180-
Yields Bytestring in Py2, Unicode String in py3.
181-
"""
182-
return str(self)
183-
184155
def to_series(self):
185156
"""
186157
return a series with both index and values equal to the index keys
@@ -237,10 +208,6 @@ def _set_names(self, values):
237208

238209
names = property(fset=_set_names, fget=_get_names)
239210

240-
@property
241-
def _constructor(self):
242-
return Index
243-
244211
@property
245212
def _has_complex_internals(self):
246213
# to disable groupby tricks in MultiIndex
@@ -1408,10 +1375,6 @@ def __new__(cls, data, dtype=None, copy=False, name=None):
14081375
def inferred_type(self):
14091376
return 'integer'
14101377

1411-
@property
1412-
def _constructor(self):
1413-
return Int64Index
1414-
14151378
@property
14161379
def asi8(self):
14171380
# do not cache or you'll create a memory leak
@@ -1531,28 +1494,6 @@ def _array_values(self):
15311494
def dtype(self):
15321495
return np.dtype('O')
15331496

1534-
def __str__(self):
1535-
"""
1536-
Return a string representation for a particular Index
1537-
1538-
Invoked by str(df) in both py2/py3.
1539-
Yields Bytestring in Py2, Unicode String in py3.
1540-
"""
1541-
1542-
if py3compat.PY3:
1543-
return self.__unicode__()
1544-
return self.__bytes__()
1545-
1546-
def __bytes__(self):
1547-
"""
1548-
Return a string representation for a particular Index
1549-
1550-
Invoked by bytes(df) in py3 only.
1551-
Yields a bytestring in both py2/py3.
1552-
"""
1553-
encoding = com.get_option("display.encoding")
1554-
return self.__unicode__().encode(encoding, 'replace')
1555-
15561497
def __unicode__(self):
15571498
"""
15581499
Return a string representation for a particular Index
@@ -1566,14 +1507,6 @@ def __unicode__(self):
15661507

15671508
return output % summary
15681509

1569-
def __repr__(self):
1570-
"""
1571-
Return a string representation for a particular Index
1572-
1573-
Yields Bytestring in Py2, Unicode String in py3.
1574-
"""
1575-
return str(self)
1576-
15771510
def __len__(self):
15781511
return len(self.labels[0])
15791512

‎pandas/core/internals.py

Lines changed: 7 additions & 8 deletions
@@ -4,6 +4,7 @@
44

55
from numpy import nan
66
import numpy as np
7+
from pandas.core.base import PandasObject
78

89
from pandas.core.common import (_possibly_downcast_to_dtype, isnull, _NS_DTYPE,
910
_TD_DTYPE)
@@ -19,7 +20,7 @@
1920
from pandas.util import py3compat
2021

2122

22-
class Block(object):
23+
class Block(PandasObject):
2324
"""
2425
Canonical n-dimensional unit of homogeneous dtype contained in a pandas
2526
data structure
@@ -91,14 +92,12 @@ def set_ref_items(self, ref_items, maybe_rename=True):
9192
self.items = ref_items.take(self.ref_locs)
9293
self.ref_items = ref_items
9394

94-
def __repr__(self):
95+
def __unicode__(self):
9596
shape = ' x '.join([com.pprint_thing(s) for s in self.shape])
9697
name = type(self).__name__
9798
result = '%s: %s, %s, dtype %s' % (
9899
name, com.pprint_thing(self.items), shape, self.dtype)
99-
if py3compat.PY3:
100-
return unicode(result)
101-
return com.console_encode(result)
100+
return result
102101

103102
def __contains__(self, item):
104103
return item in self.items
@@ -969,7 +968,7 @@ def make_block(values, items, ref_items, klass=None, fastpath=False, placement=N
969968
# TODO: flexible with index=None and/or items=None
970969

971970

972-
class BlockManager(object):
971+
class BlockManager(PandasObject):
973972
"""
974973
Core internal data structure to implement DataFrame
975974
@@ -1213,7 +1212,7 @@ def __setstate__(self, state):
12131212
def __len__(self):
12141213
return len(self.items)
12151214

1216-
def __repr__(self):
1215+
def __unicode__(self):
12171216
output = 'BlockManager'
12181217
for i, ax in enumerate(self.axes):
12191218
if i == 0:
@@ -1222,7 +1221,7 @@ def __repr__(self):
12221221
output += '\nAxis %d: %s' % (i, ax)
12231222

12241223
for block in self.blocks:
1225-
output += '\n%s' % repr(block)
1224+
output += '\n%s' % com.pprint_thing(block)
12261225
return output
12271226

12281227
@property

‎pandas/core/panel.py

Lines changed: 0 additions & 34 deletions
@@ -186,10 +186,6 @@ class Panel(NDFrame):
186186
major_axis = lib.AxisProperty(1)
187187
minor_axis = lib.AxisProperty(2)
188188

189-
@property
190-
def _constructor(self):
191-
return type(self)
192-
193189
# return the type of the slice constructor
194190
_constructor_sliced = DataFrame
195191

@@ -466,28 +462,6 @@ def __invert__(self):
466462
#----------------------------------------------------------------------
467463
# Magic methods
468464

469-
def __str__(self):
470-
"""
471-
Return a string representation for a particular Panel
472-
473-
Invoked by str(df) in both py2/py3.
474-
Yields Bytestring in Py2, Unicode String in py3.
475-
"""
476-
477-
if py3compat.PY3:
478-
return self.__unicode__()
479-
return self.__bytes__()
480-
481-
def __bytes__(self):
482-
"""
483-
Return a string representation for a particular Panel
484-
485-
Invoked by bytes(df) in py3 only.
486-
Yields a bytestring in both py2/py3.
487-
"""
488-
encoding = com.get_option("display.encoding")
489-
return self.__unicode__().encode(encoding, 'replace')
490-
491465
def __unicode__(self):
492466
"""
493467
Return a string representation for a particular Panel
@@ -515,14 +489,6 @@ def axis_pretty(a):
515489
[class_name, dims] + [axis_pretty(a) for a in self._AXIS_ORDERS])
516490
return output
517491

518-
def __repr__(self):
519-
"""
520-
Return a string representation for a particular Panel
521-
522-
Yields Bytestring in Py2, Unicode String in py3.
523-
"""
524-
return str(self)
525-
526492
def __iter__(self):
527493
return iter(getattr(self, self._info_axis))
528494

‎pandas/core/series.py

Lines changed: 1 addition & 36 deletions
@@ -394,8 +394,7 @@ def f(self, axis=0, dtype=None, out=None, skipna=True, level=None):
394394
#----------------------------------------------------------------------
395395
# Series class
396396

397-
398-
class Series(pa.Array, generic.PandasObject):
397+
class Series(generic.PandasContainer, pa.Array):
399398
"""
400399
One-dimensional ndarray with axis labels (including time series).
401400
Labels need not be unique but must be any hashable type. The object
@@ -520,10 +519,6 @@ def __init__(self, data=None, index=None, dtype=None, name=None,
520519
copy=False):
521520
pass
522521

523-
@property
524-
def _constructor(self):
525-
return Series
526-
527522
@property
528523
def _can_hold_na(self):
529524
return not is_integer_dtype(self.dtype)
@@ -1096,28 +1091,6 @@ def reset_index(self, level=None, drop=False, name=None, inplace=False):
10961091

10971092
return df.reset_index(level=level, drop=drop)
10981093

1099-
def __str__(self):
1100-
"""
1101-
Return a string representation for a particular DataFrame
1102-
1103-
Invoked by str(df) in both py2/py3.
1104-
Yields Bytestring in Py2, Unicode String in py3.
1105-
"""
1106-
1107-
if py3compat.PY3:
1108-
return self.__unicode__()
1109-
return self.__bytes__()
1110-
1111-
def __bytes__(self):
1112-
"""
1113-
Return a string representation for a particular DataFrame
1114-
1115-
Invoked by bytes(df) in py3 only.
1116-
Yields a bytestring in both py2/py3.
1117-
"""
1118-
encoding = com.get_option("display.encoding")
1119-
return self.__unicode__().encode(encoding, 'replace')
1120-
11211094
def __unicode__(self):
11221095
"""
11231096
Return a string representation for a particular DataFrame
@@ -1142,14 +1115,6 @@ def __unicode__(self):
11421115
raise AssertionError()
11431116
return result
11441117

1145-
def __repr__(self):
1146-
"""
1147-
Return a string representation for a particular Series
1148-
1149-
Yields Bytestring in Py2, Unicode String in py3.
1150-
"""
1151-
return str(self)
1152-
11531118
def _tidy_repr(self, max_vals=20):
11541119
"""
11551120

‎pandas/io/excel.py

Lines changed: 0 additions & 3 deletions
@@ -73,9 +73,6 @@ def __init__(self, path_or_buf, kind=None, **kwds):
7373
data = path_or_buf.read()
7474
self.book = xlrd.open_workbook(file_contents=data)
7575

76-
def __repr__(self):
77-
return object.__repr__(self)
78-
7976
def parse(self, sheetname, header=0, skiprows=None, skip_footer=0,
8077
index_col=None, parse_cols=None, parse_dates=False,
8178
date_parser=None, na_values=None, thousands=None, chunksize=None,

‎pandas/io/pytables.py

Lines changed: 21 additions & 25 deletions
@@ -17,7 +17,8 @@
1717
from pandas.sparse.api import SparseSeries, SparseDataFrame, SparsePanel
1818
from pandas.sparse.array import BlockIndex, IntIndex
1919
from pandas.tseries.api import PeriodIndex, DatetimeIndex
20-
from pandas.core.common import adjoin, is_list_like
20+
from pandas.core.base import StringMixin
21+
from pandas.core.common import adjoin, is_list_like, pprint_thing
2122
from pandas.core.algorithms import match, unique
2223
from pandas.core.categorical import Categorical
2324
from pandas.core.common import _asarray_tuplesafe
@@ -218,7 +219,7 @@ def read_hdf(path_or_buf, key, **kwargs):
218219
# a passed store; user controls open/close
219220
f(path_or_buf, False)
220221

221-
class HDFStore(object):
222+
class HDFStore(StringMixin):
222223
"""
223224
dict-like IO interface for storing pandas objects in PyTables
224225
format.
@@ -315,8 +316,8 @@ def __contains__(self, key):
315316
def __len__(self):
316317
return len(self.groups())
317318

318-
def __repr__(self):
319-
output = '%s\nFile path: %s\n' % (type(self), self._path)
319+
def __unicode__(self):
320+
output = '%s\nFile path: %s\n' % (type(self), pprint_thing(self._path))
320321

321322
if len(self.keys()):
322323
keys = []
@@ -326,11 +327,11 @@ def __repr__(self):
326327
try:
327328
s = self.get_storer(k)
328329
if s is not None:
329-
keys.append(str(s.pathname or k))
330-
values.append(str(s or 'invalid_HDFStore node'))
331-
except (Exception), detail:
330+
keys.append(pprint_thing(s.pathname or k))
331+
values.append(pprint_thing(s or 'invalid_HDFStore node'))
332+
except Exception as detail:
332333
keys.append(k)
333-
values.append("[invalid_HDFStore node: %s]" % str(detail))
334+
values.append("[invalid_HDFStore node: %s]" % pprint_thing(detail))
334335

335336
output += adjoin(12, keys, values)
336337
else:
@@ -984,7 +985,7 @@ def get_values(self):
984985
self.close()
985986
return results
986987

987-
class IndexCol(object):
988+
class IndexCol(StringMixin):
988989
""" an index column description class
989990
990991
Parameters
@@ -1050,10 +1051,9 @@ def set_table(self, table):
10501051
self.table = table
10511052
return self
10521053

1053-
def __repr__(self):
1054-
return "name->%s,cname->%s,axis->%s,pos->%s,kind->%s" % (self.name, self.cname, self.axis, self.pos, self.kind)
1055-
1056-
__str__ = __repr__
1054+
def __unicode__(self):
1055+
temp = tuple(map(pprint_thing, (self.name, self.cname, self.axis, self.pos, self.kind)))
1056+
return "name->%s,cname->%s,axis->%s,pos->%s,kind->%s" % temp
10571057

10581058
def __eq__(self, other):
10591059
""" compare 2 col items """
@@ -1570,7 +1570,7 @@ class GenericDataIndexableCol(DataIndexableCol):
15701570
def get_attr(self):
15711571
pass
15721572

1573-
class Storer(object):
1573+
class Storer(StringMixin):
15741574
""" represent an object in my store
15751575
facilitate read/write of various types of objects
15761576
this is an abstract base class
@@ -1610,19 +1610,16 @@ def set_version(self):
16101610
def pandas_type(self):
16111611
return _ensure_decoded(getattr(self.group._v_attrs, 'pandas_type', None))
16121612

1613-
def __repr__(self):
1614-
""" return a pretty representatgion of myself """
1613+
def __unicode__(self):
1614+
""" return a pretty representation of myself """
16151615
self.infer_axes()
16161616
s = self.shape
16171617
if s is not None:
16181618
if isinstance(s, (list,tuple)):
1619-
s = "[%s]" % ','.join([ str(x) for x in s ])
1619+
s = "[%s]" % ','.join([pprint_thing(x) for x in s])
16201620
return "%-12.12s (shape->%s)" % (self.pandas_type,s)
16211621
return self.pandas_type
16221622

1623-
def __str__(self):
1624-
return self.__repr__()
1625-
16261623
def set_object_info(self):
16271624
""" set my pandas type & version """
16281625
self.attrs.pandas_type = self.pandas_kind
@@ -3435,7 +3432,7 @@ def _need_convert(kind):
34353432
return True
34363433
return False
34373434

3438-
class Term(object):
3435+
class Term(StringMixin):
34393436
"""create a term object that holds a field, op, and value
34403437
34413438
Parameters
@@ -3540,10 +3537,9 @@ def __init__(self, field, op=None, value=None, queryables=None, encoding=None):
35403537
if len(self.q):
35413538
self.eval()
35423539

3543-
def __str__(self):
3544-
return "field->%s,op->%s,value->%s" % (self.field, self.op, self.value)
3545-
3546-
__repr__ = __str__
3540+
def __unicode__(self):
3541+
attrs = map(pprint_thing, (self.field, self.op, self.value))
3542+
return "field->%s,op->%s,value->%s" % tuple(attrs)
35473543

35483544
@property
35493545
def is_valid(self):

‎pandas/io/stata.py

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515

1616
import sys
1717
import struct
18+
from pandas.core.base import StringMixin
1819
from pandas.core.frame import DataFrame
1920
from pandas.core.series import Series
2021
from pandas.core.categorical import Categorical
@@ -163,7 +164,7 @@ def _datetime_to_stata_elapsed(date, fmt):
163164
raise ValueError("fmt %s not understood" % fmt)
164165

165166

166-
class StataMissingValue(object):
167+
class StataMissingValue(StringMixin):
167168
"""
168169
An observation's missing value.
169170
@@ -192,10 +193,12 @@ def __init__(self, offset, value):
192193
string = property(lambda self: self._str, doc="The Stata representation of the missing value: '.', '.a'..'.z'")
193194
value = property(lambda self: self._value, doc='The binary representation of the missing value.')
194195

195-
def __str__(self):
196-
return self._str
196+
def __unicode__(self):
197+
return self.string
197198

198-
__str__.__doc__ = string.__doc__
199+
def __repr__(self):
200+
# not perfect :-/
201+
return "%s(%s)" % (self.__class__, self)
199202

200203

201204
class StataParser(object):

‎pandas/io/tests/test_json/test_pandas.py

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@
3333

3434
_mixed_frame = _frame.copy()
3535

36-
class TestPandasObjects(unittest.TestCase):
36+
class TestPandasContainer(unittest.TestCase):
3737

3838
def setUp(self):
3939
self.ts = tm.makeTimeSeries()
@@ -68,7 +68,7 @@ def _check_orient(df, orient, dtype=None, numpy=False, convert_axes=True, check_
6868
if type(detail) == raise_ok:
6969
return
7070
raise
71-
71+
7272
unser = unser.sort()
7373

7474
if dtype is False:
@@ -104,7 +104,7 @@ def _check_all_orients(df, dtype=None, convert_axes=True, raise_ok=None):
104104
_check_orient(df, "split", dtype=dtype)
105105
_check_orient(df, "index", dtype=dtype)
106106
_check_orient(df, "values", dtype=dtype)
107-
107+
108108
_check_orient(df, "columns", dtype=dtype, convert_axes=False)
109109
_check_orient(df, "records", dtype=dtype, convert_axes=False)
110110
_check_orient(df, "split", dtype=dtype, convert_axes=False)
@@ -347,7 +347,7 @@ def test_convert_dates(self):
347347
assert_series_equal(result,ts)
348348

349349
def test_date_format(self):
350-
350+
351351
df = self.tsframe.copy()
352352
df['date'] = Timestamp('20130101')
353353
df_orig = df.copy()
@@ -412,7 +412,7 @@ def test_misc_example(self):
412412
@network
413413
@slow
414414
def test_round_trip_exception_(self):
415-
# GH 3867
415+
# GH 3867
416416

417417
df = pd.read_csv('https://raw.github.com/hayd/lahman2012/master/csvs/Teams.csv')
418418
s = df.to_json()
@@ -429,9 +429,9 @@ def test_url(self):
429429
result = read_json(url,convert_dates=True)
430430
for c in ['created_at','closed_at','updated_at']:
431431
self.assert_(result[c].dtype == 'datetime64[ns]')
432-
432+
433433
url = 'http://search.twitter.com/search.json?q=pandas%20python'
434434
result = read_json(url)
435-
435+
436436
except urllib2.URLError:
437437
raise nose.SkipTest

‎pandas/sparse/array.py

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
import numpy as np
99

1010
import operator
11+
from pandas.core.base import PandasObject
1112
import pandas.core.common as com
1213

1314
from pandas.util import py3compat
@@ -86,8 +87,7 @@ def _sparse_fillop(this, other, name):
8687

8788
return result, result_index
8889

89-
90-
class SparseArray(np.ndarray):
90+
class SparseArray(PandasObject, np.ndarray):
9191
"""Data structure for labeled, sparse floating point data
9292
9393
Parameters
@@ -184,9 +184,9 @@ def __setstate__(self, state):
184184
def __len__(self):
185185
return self.sp_index.length
186186

187-
def __repr__(self):
188-
return '%s\n%s' % (np.ndarray.__repr__(self),
189-
repr(self.sp_index))
187+
def __unicode__(self):
188+
return '%s\n%s' % (com.pprint_thing(self),
189+
com.pprint_thing(self.sp_index))
190190

191191
# Arithmetic operators
192192

‎pandas/sparse/list.py

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,12 @@
11
import numpy as np
2+
from pandas.core.base import PandasObject
3+
from pandas.core.common import pprint_thing
24

35
from pandas.sparse.array import SparseArray
46
import pandas._sparse as splib
57

68

7-
class SparseList(object):
9+
class SparseList(PandasObject):
810
"""
911
Data structure for accumulating data to be converted into a
1012
SparseArray. Has similar API to the standard Python list
@@ -21,9 +23,9 @@ def __init__(self, data=None, fill_value=np.nan):
2123
if data is not None:
2224
self.append(data)
2325

24-
def __repr__(self):
26+
def __unicode__(self):
2527
contents = '\n'.join(repr(c) for c in self._chunks)
26-
return '%s\n%s' % (object.__repr__(self), contents)
28+
return '%s\n%s' % (object.__repr__(self), pprint_thing(contents))
2729

2830
def __len__(self):
2931
return sum(len(c) for c in self._chunks)

‎pandas/sparse/series.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -241,8 +241,9 @@ def __setstate__(self, state):
241241
def __len__(self):
242242
return self.sp_index.length
243243

244-
def __repr__(self):
245-
series_rep = Series.__repr__(self)
244+
def __unicode__(self):
245+
# currently, unicode is same as repr...fixes infinite loop
246+
series_rep = Series.__unicode__(self)
246247
rep = '%s\n%s' % (series_rep, repr(self.sp_index))
247248
return rep
248249

‎pandas/stats/fama_macbeth.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
from pandas.core.base import StringMixin
12
from pandas.util.py3compat import StringIO
23

34
import numpy as np
@@ -26,7 +27,7 @@ def fama_macbeth(**kwargs):
2627
return klass(**kwargs)
2728

2829

29-
class FamaMacBeth(object):
30+
class FamaMacBeth(StringMixin):
3031
def __init__(self, y, x, intercept=True, nw_lags=None,
3132
nw_lags_beta=None,
3233
entity_effects=False, time_effects=False, x_effects=None,
@@ -114,7 +115,7 @@ def _coef_table(self):
114115

115116
return buffer.getvalue()
116117

117-
def __repr__(self):
118+
def __unicode__(self):
118119
return self.summary
119120

120121
@cache_readonly

‎pandas/stats/ols.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
import numpy as np
1111

1212
from pandas.core.api import DataFrame, Series, isnull
13+
from pandas.core.base import StringMixin
1314
from pandas.core.common import _ensure_float64
1415
from pandas.core.index import MultiIndex
1516
from pandas.core.panel import Panel
@@ -22,7 +23,7 @@
2223
_FP_ERR = 1e-8
2324

2425

25-
class OLS(object):
26+
class OLS(StringMixin):
2627
"""
2728
Runs a full sample ordinary least squares regression.
2829
@@ -581,7 +582,7 @@ def summary(self):
581582

582583
return template % params
583584

584-
def __repr__(self):
585+
def __unicode__(self):
585586
return self.summary
586587

587588
@cache_readonly

‎pandas/stats/var.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
from __future__ import division
22

33
import numpy as np
4-
4+
from pandas.core.base import StringMixin
55
from pandas.util.decorators import cache_readonly
66
from pandas.core.frame import DataFrame
77
from pandas.core.panel import Panel
@@ -11,7 +11,7 @@
1111
from pandas.stats.ols import _combine_rhs
1212

1313

14-
class VAR(object):
14+
class VAR(StringMixin):
1515
"""
1616
Estimates VAR(p) regression on multivariate time series data
1717
presented in pandas data structures.
@@ -477,7 +477,7 @@ def _sigma(self):
477477

478478
return np.dot(resid, resid.T) / (n - k)
479479

480-
def __repr__(self):
480+
def __unicode__(self):
481481
return self.summary
482482

483483

‎pandas/tseries/index.py

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -488,7 +488,7 @@ def _mpl_repr(self):
488488
# how to represent ourselves to matplotlib
489489
return tslib.ints_to_pydatetime(self.asi8, self.tz)
490490

491-
def __repr__(self):
491+
def __unicode__(self):
492492
from pandas.core.format import _format_datetime64
493493
values = self.values
494494

@@ -514,8 +514,6 @@ def __repr__(self):
514514

515515
return summary
516516

517-
__str__ = __repr__
518-
519517
def __reduce__(self):
520518
"""Necessary for making this object picklable"""
521519
object_state = list(np.ndarray.__reduce__(self))

‎pandas/tseries/period.py

Lines changed: 3 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33

44
from datetime import datetime, date
55
import numpy as np
6+
from pandas.core.base import PandasObject
67

78
import pandas.tseries.offsets as offsets
89
from pandas.tseries.frequencies import (get_freq_code as _gfc,
@@ -40,7 +41,7 @@ def f(self):
4041
return property(f)
4142

4243

43-
class Period(object):
44+
class Period(PandasObject):
4445
"""
4546
Represents an period of time
4647
@@ -272,28 +273,6 @@ def __repr__(self):
272273

273274
return "Period('%s', '%s')" % (formatted, freqstr)
274275

275-
def __str__(self):
276-
"""
277-
Return a string representation for a particular DataFrame
278-
279-
Invoked by str(df) in both py2/py3.
280-
Yields Bytestring in Py2, Unicode String in py3.
281-
"""
282-
283-
if py3compat.PY3:
284-
return self.__unicode__()
285-
return self.__bytes__()
286-
287-
def __bytes__(self):
288-
"""
289-
Return a string representation for a particular DataFrame
290-
291-
Invoked by bytes(df) in py3 only.
292-
Yields a bytestring in both py2/py3.
293-
"""
294-
encoding = com.get_option("display.encoding")
295-
return self.__unicode__().encode(encoding, 'replace')
296-
297276
def __unicode__(self):
298277
"""
299278
Return a string representation for a particular DataFrame
@@ -303,9 +282,7 @@ def __unicode__(self):
303282
"""
304283
base, mult = _gfc(self.freq)
305284
formatted = tslib.period_format(self.ordinal, base)
306-
value = (u"%s" % formatted)
307-
assert type(value) == unicode
308-
285+
value = ("%s" % formatted)
309286
return value
310287

311288

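Note on the pattern applied throughout this diff: each class drops its hand-written ``__repr__``/``__str__`` pair, defines only ``__unicode__``, and inherits the rest from ``StringMixin``. The following is a minimal sketch of that idea, not the actual code in ``pandas.core.base``. It assumes a fixed ``utf-8`` encoding where the removed ``Period.__bytes__`` read ``display.encoding`` from the options, and the names ``StringMixinSketch`` and ``DemoTerm`` are hypothetical.

```python
import sys

PY3 = sys.version_info[0] >= 3


class StringMixinSketch(object):
    """Derive __str__, __bytes__ and __repr__ from a single __unicode__."""

    def __unicode__(self):
        # Subclasses supply the text representation here.
        raise NotImplementedError

    def __str__(self):
        # Text on Python 3, bytestring on Python 2.
        if PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        # Assumption: hard-coded encoding instead of a display option.
        return self.__unicode__().encode("utf-8", "replace")

    def __repr__(self):
        return str(self)


class DemoTerm(StringMixinSketch):
    """Hypothetical stand-in shaped like the Term class in the diff."""

    def __init__(self, field, op, value):
        self.field = field
        self.op = op
        self.value = value

    def __unicode__(self):
        return "field->%s,op->%s,value->%s" % (self.field, self.op, self.value)


term = DemoTerm("index", ">", 5)
print(str(term))
print(repr(term))
```

With this arrangement a subclass cannot end up with ``__str__`` and ``__repr__`` that disagree, which is what the removed ``__str__ = __repr__`` aliases were guarding against by hand.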