Description
With more complex dataframes, I often stumble over this:
In [1]: x = pandas.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
In [2]: x.ix[0].a
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-8358f19a1e70> in <module>()
----> 1 x.ix[0].a
AttributeError: 'Series' object has no attribute 'a'
In [3]: x.a.ix[0]
Out[3]: 1
This is just an example; at other times I end up converting the Series back to a DataFrame because that's what the rest of the code expects.
I know that there have been attempts to solve this issue by adding attribute lookup to Series
(e.g. #1904) but they seem to come with a performance penalty.
Often, however, I care more about expressiveness than performance. I thus propose the addition of an option like:
x = DataFrame(data, slicing_returns_df=True)
This would cause x.ix[0] or x.a to return a DataFrame rather than a Series and make the above work. The default would be False so that there are no backward-compatibility issues. Alternatively, there could be a new class that inherits from DataFrame and has the desired behavior.
I'm happy to give this a crack; however, I wanted to first make sure that I'm not the only one who thinks this is a good idea, and that there is no obvious reason X why it can't work.
Activity
jreback commented on Apr 2, 2013
are you looking for something like this? (this is 0.11-dev)
equiv in 0.10.1 is
x.get_value(0,'a')
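The 0.11-dev snippet that accompanied this comment did not survive in the thread as captured here. As a rough sketch only, assuming the `at` accessor named later in the discussion is what was meant, fast scalar access on the example frame might look like the following. The `iat` line is an added illustration of the position-based counterpart, not something from the original comment.

```python
import pandas as pd

x = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Label-based scalar access: row label 0, column label 'a'.
# Assumption: this is the accessor the missing snippet used.
value = x.at[0, 'a']

# Position-based counterpart: first row, first column.
value_by_position = x.iat[0, 0]

print(value, value_by_position)
```

Both calls return a single scalar, never a Series or DataFrame.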
jreback commented on Apr 2, 2013
And if you ALWAYS want to force a df to return (the above is ALWAYS a scalar)
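The code for this comment is also missing from the captured thread. A minimal sketch, assuming the point was that `loc` with list-valued selectors on both axes keeps the result a DataFrame:

```python
import pandas as pd

x = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Lists for both the row and the column selector keep the result
# two-dimensional, even when only one cell is selected.
cell_as_frame = x.loc[[0], ['a']]

# Attribute access works on the result because it is still a DataFrame.
column = cell_as_frame.a

print(type(cell_as_frame).__name__, cell_as_frame.shape)
```

The result here is a 1x1 DataFrame, so code downstream that expects a frame does not need a conversion step.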
twiecki commented on Apr 2, 2013
.get_value() does not seem to support multiple indices, e.g. x.get_value([0,1], 'a').
However, x.loc seems to do exactly what I need. Thanks!
jreback commented on Apr 2, 2013
@twiecki great!
yes, by definition `at` (and `get_value`) are for fast scalar access, see: http://pandas.pydata.org/pandas-docs/dev/indexing.html#fast-scalar-value-getting-and-setting
while `loc` (and `ix`) are more flexible and label-based (and have a small performance penalty in order to figure out what you are after)
twiecki commented on May 13, 2013
I tried this again just now but df.loc[0,:] returns a series again. Was this changed by any chance? This is with current master.
jreback commented on May 13, 2013
You asked for a Series, and this will always return a Series; enclose the list of rows with a `[]` to always get a frame
and FYI, the `0` is a label here (and not a location!)
twiecki commented on May 13, 2013
Perfect. Thanks!
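To make the last exchange concrete, here is a small sketch of the two points raised: a scalar row selector yields a Series while a list selector yields a DataFrame, and `loc` treats `0` as a label rather than a position. The reordered index and the variable names are assumptions chosen for illustration so that labels and positions differ; they do not come from the thread.

```python
import pandas as pd

# Hypothetical frame whose index labels are deliberately out of order.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}, index=[2, 1, 0])

# Scalar row selector: returns a Series for the row labelled 0.
row_as_series = df.loc[0, :]

# List row selector: returns a one-row DataFrame for the same label.
row_as_frame = df.loc[[0], :]

# Position-based selection: the first row, whose label is 2.
first_by_position = df.iloc[[0], :]

print(type(row_as_series).__name__, type(row_as_frame).__name__)
```

With this index, the row labelled `0` is the last row of the frame, while position `0` is the row labelled `2`.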