Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np

A = [100, 100, 200, 200, 300, 300]
B = [10, 10, 20, 21, 31, 33]
C = np.random.randint(0, 99, 6)
test_df = pd.DataFrame({'A': A, 'B': B, 'C': C})
test_df = test_df.set_index(['A', 'B'])
print(test_df)

# Both levels present in the MultiIndex.
try:
    print(test_df.loc[(100, 10)])
except:
    pass

# Neither level present.
try:
    print(test_df.loc[(0, 1)])
except KeyError:
    print('test_df.loc[(0,1)] raises a KeyError')

# First level present, second level missing.
try:
    print(test_df.loc[(100, 1)])
except KeyError as e:
    print(e)
    print('test_df.loc[(100,1)] raises a KeyError')
except TypeError as e:
    print(e)
    print('test_df.loc[(100,1)] raises a TypeError')
Problem description
If the value is not present in the index, I believe a KeyError should be raised consistently, so that you can write code like:
try:
    df.loc[tuple]
except KeyError:
    # do something if the value isn't present
    pass
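Until the lookup raises KeyError consistently, one possible workaround is to treat both exception types as "key not present". The helper below is only a sketch: the name loc_or_default is hypothetical and not part of pandas, and it assumes that a TypeError from .loc on an affected version (0.22.0 in this report) really does mean a missing key.

```python
def loc_or_default(frame, key, default=None):
    """Return frame.loc[key], or `default` when the lookup fails.

    Assumption: a missing MultiIndex key may surface as either KeyError or
    TypeError, so both are interpreted as "key not present".
    """
    try:
        return frame.loc[key]
    except (KeyError, TypeError):
        # Caveat: this also hides TypeErrors caused by genuinely invalid keys.
        return default
```

With test_df from the code sample, loc_or_default(test_df, (100, 1)) would return None whichever of the two exceptions is raised. The trade-off is that a real type mistake in the key is silenced as well, so this is a stopgap rather than a fix.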
Expected Output
If df.loc[tuple] does not have a match in the multiindex, a KeyError should be raised.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.22.0
pytest: None
pip: 9.0.3
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Activity
jreback commented on May 4, 2018
cc @toobaz this closed by your recent MI warning?
toobaz commented on May 4, 2018
Unfortunately not - my PR should have only affected list(-like)s of keys, not single keys. This is rather related to #19110 and #17024 (and possibly more). Basically, since 100 is found in the index (partial indexing), 1 is looked up in the columns rather than in the second level of the index. Which is good (or at least, too late to break it) - except that if 1 is not found in the columns, we should retrieve the original exception.
MasterAir commented on May 9, 2018
I'm happy to have a go fixing this behaviour - not sure how successful I'll be or when I'll have time. Is the expected behaviour that I've specified in the issue report correct?
Should missing keys always raise a KeyError?
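To make the fallback described earlier in the thread concrete, here is a toy model in plain Python. It is not pandas code: nested dicts stand in for a two-level row index plus columns, and ROWS and lookup are made-up names. It only illustrates the idea of trying the pair as a full row key, falling back to (row, column), and re-raising the first failure if the fallback fails too.

```python
# Toy data: outer key = first index level, inner key = second index level,
# innermost dict = the row's column values. Not a pandas structure.
ROWS = {
    100: {10: {"C": 5}, 11: {"C": 7}},
    200: {20: {"C": 1}},
}


def lookup(first, second):
    """Resolve (first, second) as a full row key, else as (row, column)."""
    try:
        # Attempt 1: the pair is a complete two-level row key.
        return ROWS[first][second]
    except KeyError as original:
        try:
            # Attempt 2 (partial indexing): `first` selects the rows and
            # `second` is read as a column label.
            return {inner: row[second] for inner, row in ROWS[first].items()}
        except KeyError:
            # The fallback failed as well: report the first failure instead
            # of whatever the fallback happened to raise.
            raise original from None
```

Here lookup(100, 10) returns the row, lookup(100, "C") returns column C for every row under 100, and lookup(100, 1) raises the KeyError from the first attempt.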
toobaz commented on May 9, 2018
The basic idea (real code is more complicated) is to replace something like
with something like:

pd.Series(index=list('abc')).loc[1] raises TypeError, and (although I don't like it) it is unrelated to the present issue. But yes, print(test_df.loc[(100,1)]) above should result in a KeyError.
toobaz commented on Jun 20, 2018
It's actually not so simple, because we still want to raise the current error when the index is not a MultiIndex (and maybe even when it is, but the key is not a plausible MultiIndex key/indexer).
phofl commented on Nov 10, 2020
This works now, the last statement raises
lklamt commented on Feb 15, 2021
After doing some testing, I agree, the concrete error above seems to be solved in pandas version 1.2.2. Can the Issue be closed?