regression in 0.23.0 on rolling max with DatetimeIndex. #21096

Description

@lexual
Contributor

The behaviour of .rolling() with a DatetimeIndex appears to have changed, or regressed, between pandas 0.22 and 0.23.

A cut-down test case exhibiting the behaviour is below.

It is followed by the different output that the same code produces on the same data with the two pandas versions.

import numpy as np
import pandas as pd


def main():
    print('pandas', pd.__version__)
    n = 10
    index = pd.date_range(
        start='2018-1-1 01:00:00',
        freq='1min',
        periods=n,
    )
    s = pd.Series(
        data=0,
        index=index,
    )
    s.iloc[1] = np.nan
    s.iloc[-1] = 2
    print(s)
    maxes = s.rolling(window=f'{n}min').max()
    print(maxes.value_counts(dropna=False))


if __name__ == '__main__':
    main()

ACTUAL OUTPUT

pandas 0.22.0
2018-01-01 01:00:00    0.0
2018-01-01 01:01:00    NaN
2018-01-01 01:02:00    0.0
2018-01-01 01:03:00    0.0
2018-01-01 01:04:00    0.0
2018-01-01 01:05:00    0.0
2018-01-01 01:06:00    0.0
2018-01-01 01:07:00    0.0
2018-01-01 01:08:00    0.0
2018-01-01 01:09:00    2.0
Freq: T, dtype: float64
0.0    9
2.0    1
dtype: int64

pandas 0.23.0
2018-01-01 01:00:00    0.0
2018-01-01 01:01:00    NaN
2018-01-01 01:02:00    0.0
2018-01-01 01:03:00    0.0
2018-01-01 01:04:00    0.0
2018-01-01 01:05:00    0.0
2018-01-01 01:06:00    0.0
2018-01-01 01:07:00    0.0
2018-01-01 01:08:00    0.0
2018-01-01 01:09:00    2.0
Freq: T, dtype: float64
0.0    10
dtype: int64
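
To make clear which of the two outputs is the expected one, here is a minimal sketch of a brute-force reference computation. It is not how pandas implements rolling windows; the helper name naive_rolling_max is hypothetical, and it assumes the default semantics for offset windows (window closed on the right, NaN skipped, at least one valid observation required). The series is built with 0.0 rather than 0 so that assigning NaN does not depend on implicit upcasting.

```python
import numpy as np
import pandas as pd

n = 10
index = pd.date_range(start="2018-01-01 01:00:00", freq="1min", periods=n)
s = pd.Series(0.0, index=index)  # float from the start, so NaN fits
s.iloc[1] = np.nan
s.iloc[-1] = 2.0


def naive_rolling_max(series, window):
    """Hypothetical helper: max over (t - window, t] for each timestamp t."""
    out = []
    for t in series.index:
        in_window = series[(series.index > t - window) & (series.index <= t)]
        out.append(in_window.max())  # Series.max() skips NaN by default
    return pd.Series(out, index=series.index)


expected = naive_rolling_max(s, pd.Timedelta(minutes=n))
actual = s.rolling(window=f"{n}min").max()

print(expected.value_counts(dropna=False))
print("rolling() agrees with the naive reference:", expected.equals(actual))
```

Under these assumptions the reference gives 0.0 nine times and 2.0 once, which is what 0.22.0 prints. The 0.23.0 output never reports the 2.0 in the last window.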

jreback commented on May 17, 2018

@jreback
Contributor

Can you create the input programmatically here (and update the top comment)?

lexual commented on May 17, 2018

@lexual
Contributor, Author

Done, @jreback. I've updated the first comment with a simpler test case that is written entirely in code.

lexual commented on May 19, 2018

@lexual
Contributor, Author

$ git bisect bad 
39e7b6916b07982240bac87132848fb2665806a2 is the first bad commit
commit 39e7b6916b07982240bac87132848fb2665806a2
Author: Matt Kirk <matt@matthewkirk.com>
Date:   Wed Feb 14 18:13:19 2018 +0700

    Performance increase rolling min max (#19549)

:040000 040000 6543b93964c04b7f71c14a9992c8f70606e02c77 a7f4578db8843fdb0f9dbe02322584ced5628dd9 M      asv_bench
:040000 040000 3f68b43b26722b2f7a2874433e88965ca5d8bfda 2eaff46202589c76a8d85251cfdb052dd38660f1 M      doc
:040000 040000 e28eb2f13f13846afe64b82ce492a5a877b98be7 18784432eb0d9655cca3de34a02f5f514c44eaff M      pandas
:100755 100755 2332503e558ed7ee9d57e3b3153ae674c1d292ad c66979dd19ef039cd4b5172c65276d1340507631 M      setup.py

mroeschke commented on Nov 5, 2018

@mroeschke
Member

This looks fixed on master, but it could still use a regression test.

In [9]: pd.__version__
Out[9]: '0.24.0.dev0+911.g24ab22f75'

In [10]: s
Out[10]:
2018-01-01 01:00:00    0.0
2018-01-01 01:01:00    NaN
2018-01-01 01:02:00    0.0
2018-01-01 01:03:00    0.0
2018-01-01 01:04:00    0.0
2018-01-01 01:05:00    0.0
2018-01-01 01:06:00    0.0
2018-01-01 01:07:00    0.0
2018-01-01 01:08:00    0.0
2018-01-01 01:09:00    2.0
Freq: T, dtype: float64

In [11]: maxes = s.rolling(window=f'{n}min').max()

In [12]: maxes.value_counts(dropna=False)
Out[12]:
0.0    9
2.0    1
dtype: int64
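
A regression test along these lines might look like the sketch below. The test name is hypothetical and this is not the test that was added to pandas; it simply pins the output shown above (0.0 nine times, then 2.0) for the series from the original report, built with 0.0 so that the NaN assignment does not rely on upcasting.

```python
import numpy as np
import pandas as pd
import pandas.testing as tm


def test_rolling_max_time_window_with_nan():
    # Hypothetical test name; data mirrors the report in this issue.
    n = 10
    index = pd.date_range(start="2018-01-01 01:00:00", freq="1min", periods=n)
    s = pd.Series(0.0, index=index)
    s.iloc[1] = np.nan
    s.iloc[-1] = 2.0

    result = s.rolling(window=f"{n}min").max()

    # Every window holds at least one valid 0.0; only the last one
    # also contains the 2.0.
    expected = pd.Series([0.0] * (n - 1) + [2.0], index=index)
    tm.assert_series_equal(result, expected)
```

Run under pytest, this should fail on a build with the behaviour reported for 0.23.0 and pass where the rolling max matches the 0.22.0 output.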

Added to the 1.0 milestone on Nov 22, 2019.
