Open
Labels
- API Design
- Indexing: Related to indexing on series/frames, not to indexes themselves
- Needs Discussion: Requires discussion from core team before further action
Description
Some examples (on Series only) are in #12890.
I started making an overview of the indexing semantics at http://nbviewer.ipython.org/gist/jorisvandenbossche/7889b389a21b41bc1063 (only for Series/DataFrame, not for Panel).
Conclusion: it is a mess :-)
Summary for slicing
- Slicing with integer labels is:
- always integer location based
- except on a float index, where it is label based
- Slicing with other types of labels is always label based, provided the label is of an appropriate type for the index.
So the behaviour is equivalent to .ix, except that for integer labels on an integer index it is swapped: [] slices by integer location, whereas .ix on an integer axis is always label based, with no fallback to integer location.
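To make the slicing summary concrete, here is a minimal sketch on a string-indexed Series (the data and variable names are made up for illustration). It deliberately avoids integer and float indexes, since those are exactly the cases discussed above and their behaviour may differ between pandas versions.

```python
import pandas as pd

# Hypothetical example data: a Series with a string index.
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Integer slice: integer location based, end point excluded.
by_position = s[1:3]

# Label slice: label based, both end points included.
by_label = s["b":"c"]

# Both select the rows labelled "b" and "c", matching the
# explicit positional and label-based accessors.
print(by_position.equals(s.iloc[1:3]))
print(by_label.equals(s.loc["b":"c"]))
```

Note the different end-point handling: the positional slice stops before position 3, while the label slice includes "c".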
Summary for single label
- Indexing with a single label is always label based
- But there is a fallback to integer location, except on integer and float indexes
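A small sketch of the "no fallback on an integer index" case, using made-up data. The positional fallback on non-numeric indexes is not exercised here, because, as far as I know, later pandas releases deprecated it, so its result depends on the installed version.

```python
import pandas as pd

# Hypothetical example data: integer labels in a non-sorted order,
# so that label 0 and position 0 refer to different rows.
s = pd.Series([10, 20, 30], index=[2, 0, 1])

# On an integer index, a single integer is treated as a label.
by_label = s[0]

# Positional access has to be requested explicitly.
by_position = s.iloc[0]

print(by_label, by_position)
```

Here `s[0]` finds the row whose label is 0 (the second row), while `s.iloc[0]` returns the first row.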
Summary for indexing with list of labels
- It is primarily label based, but:
- There is a fallback to integer location, except on an integer or float axis
- It is a pure reindex: even if no label in the list is found, you just get an all-NaN Series (in contrast to loc, where at least one label must be found)
- String parsing for a datetime index does not seem to work
This mainly follows ix, apart from points 2 and 3
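The "pure reindex" behaviour can be illustrated with an explicit `reindex` call, which is what the list case is being compared to. This is only a sketch on made-up data. It does not pass missing labels to [] directly, since the result of that depends on the pandas version (newer releases raise a KeyError instead of returning NaN).

```python
import pandas as pd

# Hypothetical example data.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A list of labels that all exist: label based, in the order given.
found = s[["c", "a"]]

# What a "pure reindex" looks like: labels that are not in the
# index come back as NaN rather than raising an error.
all_missing = s.reindex(["x", "y"])

print(found)
print(all_missing)
```

The second result is an all-NaN Series, which is the outcome described above for a list in which no label is found.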
Summary for boolean indexing
- This is simple, it just works as expected
Summary for DataFrames
- It uses the 'information' axis (axis 1) for:
- single labels
- list of labels
- It uses the rows (axis 0) for:
- slicing
- boolean indexing
This is as documented (only the boolean case is not explicitly documented, I think).
For the rest (on the chosen axis), it follows the same semantics as [] on a Series, but:
- for a list of labels, all labels must now be present (no pure reindex as with Series)
- for single labels: no fallback to integer location for a non-numeric index (but there is a fallback for a list of labels ...)
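The axis choice for DataFrames can be sketched as follows, again with made-up data and names. It only covers the cases that are documented or that I am confident about: columns for single labels and lists, rows for slices and boolean masks, and an error when a listed label is missing.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

# Single label and list of labels: columns (axis 1).
single_column = df["x"]          # a Series
column_list = df[["y", "x"]]     # a DataFrame, columns in the order given

# Slice and boolean mask: rows (axis 0).
row_slice = df[0:2]              # first two rows, by position
row_mask = df[df["x"] > 1]       # rows where the condition holds

# A list with a missing label is not a reindex: it raises.
try:
    df[["x", "z"]]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

print(missing_label_raised)
```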
The questions are:
- Are there things we can change (that would not be too disruptive ... maybe not)? And do we want to change them?
- How do we document this best?
- Right now there is the "basics" section (http://pandas.pydata.org/pandas-docs/stable/indexing.html#basics) and the slicing section (http://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges), but these do not come close to covering all cases.