CI/BUG?: test_pct_max_many_rows crashes on travis-27. #23726

Description

@h-vetinari
Contributor

Following Ian Fleming ("Once is happenstance. Twice is coincidence. Three times is enemy action"), here are the three most recent builds from #23582 -- which only changes tests -- wherein there is a single failure/crash in the travis-27.yaml build for pandas/tests/series/test_rank.py::test_pct_max_many_rows, which looks like this:

[...]
........................................................................ [ 87%]
..........................................................[gw0] node down: Not properly terminated
fReplacing crashed worker gw0
gw2 [39907] / gw1 [39907]. [ 87%]
........................................................................ [ 87%]
........................................................................ [ 87%]
[...]
........................................................................ [ 99%]
.....
=================================== FAILURES ===================================
_______________________ pandas/tests/series/test_rank.py _______________________
[gw0] linux2 -- Python 2.7.15 /home/travis/miniconda3/envs/pandas/bin/python
Worker 'gw0' crashed while running 'pandas/tests/series/test_rank.py::test_pct_max_many_rows'

https://travis-ci.org/pandas-dev/pandas/jobs/455175414
https://travis-ci.org/pandas-dev/pandas/jobs/455259318
https://travis-ci.org/pandas-dev/pandas/jobs/455642129

Interestingly, the Azure 2.7 builds pass on both Windows and Linux. Any ideas?

Activity

jbrockmendel (Member) commented on Nov 15, 2018

I’ve noticed this too. I’m guessing it's related to #23688. @jschendel, any ideas?

h-vetinari (Contributor, Author) commented on Nov 15, 2018

@jbrockmendel
Definitely something weird going on.

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.arange(2**24 + 1))
>>>
>>> s.astype(int).rank(pct=True).max()
1.0
>>> s.astype(np.int32).rank(pct=True).max()
1.0
>>> s.astype(np.int64).rank(pct=True).max()
1.0
>>> s.astype(np.uint32).rank(pct=True).max()
1.0
>>> s.astype(np.uint64).rank(pct=True).max()
1.0
>>> s.astype(float).rank(pct=True).max()
1.0
>>> s.astype(np.float64).rank(pct=True).max()
1.0
>>> s.astype(np.float32).rank(pct=True).max()
MemoryError
>>> # all subsequent calls to s.astype(<whatever>).rank(pct=True) now fail as well, e.g.
>>> s.astype(int).rank(pct=True).max()
MemoryError
>>> # Interestingly, even after the MemoryError, pct=False works
>>> s.astype(int).rank().max()
16777217.0

I can't really see how anything in that test should cast to float32, but maybe there's a resource leak somewhere?
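To make the session above easier to re-run on other machines, here is a minimal sketch that tries each dtype in turn and records either the maximum of `rank(pct=True)` or the name of the exception raised. The helper name `rank_pct_max_by_dtype` and its `n_rows` parameter are made up for illustration and are not part of pandas; the sketch only catches `MemoryError`, so it cannot capture a hard crash such as the one seen on travis-27.

```python
import numpy as np
import pandas as pd

# dtypes exercised in the interactive session above
DTYPES = [np.int32, np.int64, np.uint32, np.uint64, np.float64, np.float32]


def rank_pct_max_by_dtype(n_rows, dtypes=DTYPES):
    """Map each dtype name to max(rank(pct=True)) or to the exception name.

    Hypothetical helper for illustration only; not a pandas API.
    """
    s = pd.Series(np.arange(n_rows))
    results = {}
    for dtype in dtypes:
        name = np.dtype(dtype).name
        try:
            results[name] = s.astype(dtype).rank(pct=True).max()
        except MemoryError as exc:
            # Record the failure and keep going, so later dtypes are still tried.
            results[name] = type(exc).__name__
    return results


# Small input for a quick check; pass 2**24 + 1 to match the session above.
for name, outcome in rank_pct_max_by_dtype(1000).items():
    print(name, outcome)
```

Running it in a fresh interpreter for each dtype order would help tell apart "float32 triggers the failure" from "whichever call comes late enough fails".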

TomAugspurger (Contributor) commented on Nov 16, 2018

Is this crash 2.7 only, or have we seen it on python 3 builds?

h-vetinari (Contributor, Author) commented on Nov 16, 2018

I've only ever seen it in the travis-27 build.

TomAugspurger (Contributor) commented on Nov 16, 2018

Unfortunately I haven't been able to reproduce the segfault locally on my Mac.

Labels added on Nov 16, 2018:
Needs Tests: Unit test(s) needed to prevent regressions
Algos: Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff

h-vetinari (Contributor, Author) commented on Nov 19, 2018

I can reproduce the failure locally (on Windows), but I only get a MemoryError, not a segfault. If the underlying problem is a MemoryError, that could explain why the test only fails sometimes: it would depend on how busy the worker is with other code at that moment.
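As a rough back-of-the-envelope check on the memory hypothesis, the sketch below computes the size of a single array with `2**24 + 1` elements. How many such temporaries `rank` allocates is not established in this thread, so the figure is per array only, and the helper name `array_mib` is made up for illustration.

```python
N_ROWS = 2**24 + 1  # number of rows used in the reproduction above


def array_mib(n_rows, itemsize):
    """Size in MiB of one contiguous array of n_rows items of itemsize bytes."""
    return n_rows * itemsize / 2.0**20


# Item sizes in bytes for the dtypes tried in the reproduction.
for label, itemsize in [("float32", 4), ("int64 / float64", 8)]:
    print("%-16s %.1f MiB per array" % (label, array_mib(N_ROWS, itemsize)))
```

A few arrays of this size living at once would add up quickly on a CI worker that is also running other tests in parallel.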

