Closed
Description
When trying to compare a DataFrame to a column/Series (I know this is not useful in the following case, because the Series is aligned with the columns of the DataFrame and not the rows, but it is something typical users will try), I get the correct results if the DataFrame and Series contain strings, but a TypeError when the DataFrame contains datetime values:
In [1]: import pandas as pd
   ...: from io import StringIO
In [2]: s = """id date birth_date_1 birth_date_2
...: 1 2000-01-01 2000-01-03 2000-01-05
...: 1 2000-01-07 2000-01-03 2000-01-05
...: 2 2000-01-02 2000-01-10 2000-01-01
...: 2 2000-01-05 2000-01-10 2000-01-01"""
In [3]: df = pd.read_csv(StringIO(s), sep=r'\s+')
In [5]: df[['birth_date_1','birth_date_2']] > df['date']
Out[5]:
       0      1      2      3 birth_date_1 birth_date_2
0  False  False  False  False         True         True
1  False  False  False  False         True         True
2  False  False  False  False         True         True
3  False  False  False  False         True         True
In [7]: df = pd.read_csv(StringIO(s), sep=r'\s+', parse_dates=[1,2,3])
In [8]: df[['birth_date_1','birth_date_2']] > df['date']
...
c:\users\vdbosscj\scipy\pandas-joris\pandas\core\internals.pyc in handle_error()
954 if raise_on_error:
955 raise TypeError('Could not operate %s with block values %s'
--> 956 % (repr(other), str(detail)))
957 else:
958 # return the values
TypeError: Could not operate array(['2000-01-01T01:00:00.000000000+0100',
'2000-01-07T01:00:00.000000000+0100',
'2000-01-02T01:00:00.000000000+0100',
'2000-01-05T01:00:00.000000000+0100'], dtype='datetime64[ns]') with block values invalid type promotion
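For readers who want the row-wise comparison that the parenthetical above alludes to, here is a minimal sketch. It assumes a recent pandas version and rebuilds the sample data directly instead of parsing the CSV; the names `frame`, `dates`, and `result` are illustrative and do not come from the report. `DataFrame.gt` with `axis=0` aligns the Series with the row index instead of the column labels.

```python
import pandas as pd

# Same values as the sample data in the report, built directly as datetime64.
frame = pd.DataFrame({
    "birth_date_1": pd.to_datetime(
        ["2000-01-03", "2000-01-03", "2000-01-10", "2000-01-10"]),
    "birth_date_2": pd.to_datetime(
        ["2000-01-05", "2000-01-05", "2000-01-01", "2000-01-01"]),
})
dates = pd.Series(
    pd.to_datetime(["2000-01-01", "2000-01-07", "2000-01-02", "2000-01-05"]),
    name="date",
)

# axis=0 matches the Series index against the DataFrame's row index,
# so every column is compared element-wise with `dates`.
result = frame.gt(dates, axis=0)
print(result)
```

This sidesteps the column-label alignment that the `>` operator performs between a DataFrame and a Series; it does not change how `>` itself behaves.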
Title changed from "BUG:" to "BUG: comparing dataframe with datetime64 values to series gives TypeError"
jorisvandenbossche commented on Dec 4, 2014
Although I am not sure this is the correct result:
There are no overlapping elements between the dataframe and series, but why then sometimes True and sometimes False?
jreback commented on Dec 4, 2014
This is quite tricky; datetimes are not handled correctly in a multi-column vectorized way.
xref to #8554. I think I can fix this, but it's a bit tricky.
jbrockmendel commented on Oct 24, 2018
@jorisvandenbossche I'm not entirely clear on what the issue is here. Is it about broadcasting? Maybe it has been resolved in the interim?
mroeschke commented on Mar 31, 2020
I think the first case (without date parsing) raises a sensible error now.
The second case doesn't seem to raise a sensible error, as there is no float column being compared.
Title changed from "BUG: comparing dataframe with datetime64 values to series gives TypeError" to "BUG: comparing multicolumn dataframe with datetime64 values to series gives TypeError"
jbrockmendel commented on Dec 19, 2021
IIUC reindexing is introducing float (all-NaN) columns, which then raise on comparison. That automatic reindexing was deprecated in #36795. We could try to get something in for 1.4 to give a better exception message, but I don't think it's worth the trouble.
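A minimal sketch of the reindexing effect described here, under stated assumptions: it mimics column-wise alignment with an explicit `DataFrame.reindex` call instead of going through the comparison operator, and the names `frame`, `dates`, `union_labels`, and `added` are illustrative only. The exact dtype of the added columns may vary across pandas versions, so the sketch only checks that they are entirely missing.

```python
import pandas as pd

frame = pd.DataFrame({
    "birth_date_1": pd.to_datetime(
        ["2000-01-03", "2000-01-03", "2000-01-10", "2000-01-10"]),
    "birth_date_2": pd.to_datetime(
        ["2000-01-05", "2000-01-05", "2000-01-01", "2000-01-01"]),
})
dates = pd.Series(
    pd.to_datetime(["2000-01-01", "2000-01-07", "2000-01-02", "2000-01-05"]),
    name="date",
)

# Column-wise alignment matches the Series index (0..3) against the column
# labels. Nothing overlaps, so the union holds all six labels.
union_labels = list(dates.index) + list(frame.columns)
reindexed = frame.reindex(columns=union_labels)

# The columns created for labels 0..3 contain only missing values.
added = reindexed[list(dates.index)]
print(reindexed.dtypes)
print(added.isna().all().all())
```

The original datetime columns are carried over unchanged; only the newly created labels are filled with missing values, which is where the non-datetime columns in the comparison come from.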
jbrockmendel commented on Mar 28, 2023
This now correctly raises because the automatic alignment deprecation has been enforced. Is there another bug that surfaces after that if we manually align before the comparison?
mroeschke commented on May 10, 2025
Yeah, it appears this raises consistently due to alignment. Closing.