Using .loc.__setitem__ with integer slices does not raise #26412

@lumbric

Description

Code Sample

>>> d = pd.DataFrame({'a': range(5), 'b': range(2,7), 'c': range(3,8)})
>>> d
   a  b  c
0  0  2  3
1  1  3  4
2  2  4  5
3  3  5  6
4  4  6  7
>>> d.loc[0:2, 0:2] = 42   # note that loc is used with an integer slice! should this raise?
>>> d  # this is somehow the expected result
    a   b  c
0  42  42  3
1  42  42  4
2  42  42  5
3   3   5  6
4   4   6  7
>>> d.iloc[0:2, 0:2]        # expected result
    a   b
0  42  42
1  42  42
>>> d.loc[0:2, 0:2]        # this is the same syntax as above, but raises
...
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>

Problem description

I'd expect symmetry between d.loc.__getitem__ and d.loc.__setitem__. At the moment only __getitem__ raises if integer slices are passed. Using .loc here was probably a mistake; .iloc should have been used instead. Even if the assignment works correctly, the behaviour is confusing for users.

Probably this is simply a missing exception in the special case of d.loc.__setitem__ being called with integer slices.

See #13831 for a very similar problem with pd.Series.
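
For reference, here is a minimal sketch of the two unambiguous spellings, using the same frame as in the code sample. It only relies on documented behaviour: .loc slices are label-based and include both endpoints, while .iloc slices are positional and half-open. The variable names by_label and by_position are illustrative only.

```python
import pandas as pd

d = pd.DataFrame({'a': range(5), 'b': range(2, 7), 'c': range(3, 8)})

# Label-based: slice endpoints are labels, and both ends are included.
by_label = d.loc[0:2, 'a':'b']      # rows labelled 0, 1, 2; columns 'a' and 'b'

# Position-based: half-open slices, as with Python lists.
by_position = d.iloc[0:2, 0:2]      # first two rows, first two columns

print(by_label.shape)     # (3, 2)
print(by_position.shape)  # (2, 2)
```

The differing row counts explain why the assignment in the code sample touched rows 0 to 2 (label semantics) while only columns a and b were changed (positional semantics).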

Expected Output

Assuming d.loc.__getitem__ should raise for integer slices (as it does), then also d.loc.__setitem__ should raise:

>>> d.loc[0:2, 0:2] = 42
...
SomeException: int & loc don't go together

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-20-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 3.6.4
pip: 9.0.1
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: 0.11.3
IPython: 5.5.0
sphinx: 1.7.9
patsy: 0.4.1+dev
dateutil: 2.8.0
pytz: 2019.1
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: 4.5.0
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Activity

jorisvandenbossche commented on May 18, 2019

Member

@lumbric Thanks a lot for the report. That's certainly a bug. There should indeed be symmetry between setitem and getitem (although not always: in certain cases you can set with new labels to expand a dataframe). In any case, it is interpreting the 0:2 as positional, and the guarantee of loc is precisely that this should not happen.

It only seems to be the case for slicing, as a list of integers correctly raises:

In [9]: d.loc[0:2, [0, 1]] = 42
...
KeyError: "None of [Int64Index([0, 1], dtype='int64')] are in the [columns]"
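
A small probe along these lines may help compare the code paths on a given installation. This is only a sketch: the helper name probe_loc_setitem is hypothetical, and what it reports for the integer slice and the integer list depends on the pandas version, so no output is assumed for those two cases.

```python
import pandas as pd

def probe_loc_setitem(column_key):
    """Hypothetical helper: try d.loc[0:2, column_key] = 42 on a fresh frame.

    Returns the name of the exception type that was raised, or None if the
    assignment went through without raising.
    """
    d = pd.DataFrame({'a': range(5), 'b': range(2, 7), 'c': range(3, 8)})
    try:
        d.loc[0:2, column_key] = 42
    except Exception as exc:  # deliberately broad: we only report the type
        return type(exc).__name__
    return None

results = {
    "integer slice 0:2": probe_loc_setitem(slice(0, 2)),
    "integer list [0, 1]": probe_loc_setitem([0, 1]),
    "label slice 'a':'b'": probe_loc_setitem(slice('a', 'b')),
}

for description, outcome in results.items():
    print(f"{description}: {outcome or 'no exception'}")
```

The label slice is included as a control: it is a valid .loc key, so it should not raise. If the integer slice also reports no exception, the installation shows the behaviour described in this issue.
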
added this to the Contributions Welcome milestone on May 18, 2019
jorisvandenbossche commented on May 18, 2019

Member

Very welcome to try to look into the loc setitem code path where this may be coming from!

modified the milestones: Contributions Welcome, 1.1 on Feb 22, 2020

Metadata

    Labels

    Bug, Indexing (Related to indexing on series/frames, not to indexes themselves)
      Participants

      @lumbric, @jreback, @jorisvandenbossche