
TestDataFrame.test_from_records_sequencelike: segfault on armel #4473

Closed
@yarikoptic

Description

@yarikoptic
Contributor

gory details

Starting program: /usr/bin/python2.7-dbg /usr/bin/nosetests -s -v -a \!network pandas/tests/test_frame.py:TestDataFrame.test_from_records_sequencelike
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.

Program received signal SIGILL, Illegal instruction.

test_from_records_sequencelike (pandas.tests.test_frame.TestDataFrame) ...
Program received signal SIGSEGV, Segmentation fault.
0xb59b5478 in __pyx_pf_6pandas_3lib_122is_string_array (__pyx_self=0x0, __pyx_v_values=0x1727668) at pandas/lib.c:32348
32348         __Pyx_INCREF((PyObject*)__pyx_t_3);

(gdb) bt
#0  0xb59b5478 in __pyx_pf_6pandas_3lib_122is_string_array (__pyx_self=0x0, __pyx_v_values=0x1727668) at pandas/lib.c:32348
#1  0xb59b4248 in __pyx_pw_6pandas_3lib_123is_string_array (__pyx_self=0x0, __pyx_v_values=<numpy.ndarray at remote 0x1727668>) at pandas/lib.c:32136
#2  0x0009e498 in PyCFunction_Call (func=<built-in function is_string_array>, arg=(<numpy.ndarray at remote 0x1727668>,), kw=0x0) at ../Objects/methodobject.c:101
#3  0x0002e178 in PyObject_Call (func=<built-in function is_string_array>, arg=(<numpy.ndarray at remote 0x1727668>,), kw=0x0) at ../Objects/abstract.c:2529
#4  0xb59abe48 in __pyx_pf_6pandas_3lib_108infer_dtype (__pyx_self=0x0, __pyx_v__values=<numpy.ndarray at remote 0x1727668>) at pandas/lib.c:30465
#5  0xb59a81d8 in __pyx_pw_6pandas_3lib_109infer_dtype (__pyx_self=0x0, __pyx_v__values=<numpy.ndarray at remote 0x1727668>) at pandas/lib.c:29761
#6  0x00152aa4 in call_function (pp_stack=0xbeffaa74, oparg=1) at ../Python/ceval.c:4009
#7  0x0014d524 in PyEval_EvalFrameEx (
    f=Frame 0x12bf850, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/common.py, line 1220, in _possibly_cast_to_datetime (value=<numpy.ndarray at remote 0x1727668>, dtype=None, coerce=False, v=<numpy.ndarray at remote 0x1727668>), throwflag=0) at ../Python/ceval.c:2666

python backtrace

(gdb) py-bt
#7 Frame 0x12bf850, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/common.py, line 1220, in _possibly_cast_to_datetime (value=<numpy.ndarray at remote 0x1727668>, dtype=None, coerce=False, v=<numpy.ndarray at remote 0x1727668>)
    inferred_type = lib.infer_dtype(v)
#11 Frame 0x12a37b0, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/series.py, line 3318, in _try_cast (arr=<numpy.ndarray at remote 0x1727668>, take_fast_path=True)
    arr = com._possibly_cast_to_datetime(arr, dtype)
#15 Frame 0x115d810, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/series.py, line 3349, in _sanitize_array (data=<numpy.ndarray at remote 0x1727668>, index=<Int64Index(name=None) at remote 0x17289d8>, dtype=None, copy=False, raise_cast_failure=False, _try_cast=<function at remote 0x1728a30>, subarr=<numpy.ndarray at remote 0x1727668>)
    subarr = _try_cast(data, True)
#19 Frame 0x17579c0, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/frame.py, line 5936, in _homogenize (data=[<numpy.ndarray at remote 0x1723ea8>, <numpy.ndarray at remote 0x1723f68>, <numpy.ndarray at remote 0x17273e8>, <numpy.ndarray at remote 0x1727668>, <numpy.ndarray at remote 0x17271a8>, <numpy.ndarray at remote 0x17274e8>, <numpy.ndarray at remote 0x1727228>, <numpy.ndarray at remote 0x17270e8>], index=<Int64Index(name=None) at remote 0x17289d8>, dtype=None, _sanitize_array=<function at remote 0x10f1b90>, oindex=None, homogenized=[<numpy.ndarray at remote 0x1723ea8>, <numpy.ndarray at remote 0x1723f68>, <numpy.ndarray at remote 0x17273e8>], v=<numpy.ndarray at remote 0x1727668>)
    raise_cast_failure=False)
#23 Frame 0x17571e0, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/frame.py, line 5675, in _arrays_to_mgr (arrays=[<numpy.ndarray at remote 0x1723ea8>, <numpy.ndarray at remote 0x1723f68>, <numpy.ndarray at remote 0x17273e8>, <numpy.ndarray at remote 0x1727668>, <numpy.ndarray at remote 0x17271a8>, <numpy.ndarray at remote 0x17274e8>, <numpy.ndarray at remote 0x1727228>, <numpy.ndarray at remote 0x17270e8>], arr_names=<Index(name=None) at remote 0x17286c0>, index=<Int64Index(name=None) at remote 0x17289d8>, columns=<...>, dtype=None)
    arrays = _homogenize(arrays, index, dtype)
#27 Frame 0x17ce8a0, for file /home/yoh/pandas/pandas-0.12.0/pandas/core/frame.py, line 1124, in from_records (cls=<type at remote 0x129f680>, data=<recarray at remote 0x165dc40>, index=None, exclude=set([]), columns=<Index(name=None) at remote 0x17286c0>, coerce_float=False, nrows=None, arr_columns=<...>, arrays=[<numpy.ndarray at remote 0x1723ea8>, <numpy.ndarray at remote 0x1723f68>, <numpy.ndarray at remote 0x17273e8>, <numpy.ndarray at remote 0x1727668>, <numpy.ndarray at remote 0x17271a8>, <numpy.ndarray at remote 0x17274e8>, <numpy.ndarray at remote 0x1727228>, <numpy.ndarray at remote 0x17270e8>], result_index=None)
    columns)

and here is what I think is the original/generated code

(gdb) bt 1
#0  0xb59b5478 in __pyx_pf_6pandas_3lib_122is_string_array (__pyx_self=0x0, __pyx_v_values=0x1727668) at pandas/lib.c:32348
(More stack frames follow...)
(gdb) l 32335
32330         /* "pandas/src/inference.pyx":230
32331    * 
32332    *         for i in range(n):
32333    *             if not PyString_Check(objbuf[i]):             # <<<<<<<<<<<<<<
32334    *                 return False
32335    *         return True
32336    */
32337         __pyx_t_13 = __pyx_v_i;
32338         __pyx_t_8 = -1;
32339         if (__pyx_t_13 < 0) {
(gdb) 
32340           __pyx_t_13 += __pyx_pybuffernd_objbuf.diminfo[0].shape;
32341           if (unlikely(__pyx_t_13 < 0)) __pyx_t_8 = 0;
32342         } else if (unlikely(__pyx_t_13 >= __pyx_pybuffernd_objbuf.diminfo[0].shape)) __pyx_t_8 = 0;
32343         if (unlikely(__pyx_t_8 != -1)) {
32344           __Pyx_RaiseBufferIndexError(__pyx_t_8);
32345           {__pyx_filename = __pyx_f[1]; __pyx_lineno = 230; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
32346         }
32347         __pyx_t_3 = (PyObject *) *__Pyx_BufPtrStrided1d(PyObject **, __pyx_pybuffernd_objbuf.rcbuffer->pybuffer.buf, __pyx_t_13, __pyx_pybuffernd_objbuf.diminfo[0].strides);
32348         __Pyx_INCREF((PyObject*)__pyx_t_3);
32349         __pyx_t_7 = ((!(PyString_Check(__pyx_t_3) != 0)) != 0);

any obvious clues? ;)
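
For readers following the listing, the Cython loop quoted in the generated comment (pandas/src/inference.pyx, line 230) amounts to roughly the pure-Python sketch below. It only illustrates the logic visible in the gdb listing and is not the actual pandas source: the name `is_string_array_sketch` is made up, and `str` stands in for whatever `PyString_Check` accepts on Python 2.

```python
def is_string_array_sketch(objbuf):
    """Return True only if every element of objbuf is a str.

    Hypothetical pure-Python rendering of the loop shown in the gdb listing;
    the real code is Cython and reads raw PyObject pointers from a buffer.
    """
    n = len(objbuf)
    for i in range(n):
        # In the generated C, loading this element is followed by the
        # __Pyx_INCREF at lib.c:32348, which is where the crash is reported.
        if not isinstance(objbuf[i], str):
            return False
    return True


print(is_string_array_sketch(["foo", "foo", "foo"]))
print(is_string_array_sketch(["foo", 1, "foo"]))
```

Since the reported fault is on the INCREF of the loaded element, before the type check runs, the pointer read from the buffer looks like the more likely suspect than the check itself.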

Activity

jreback

jreback commented on Oct 3, 2013

@jreback
Contributor

@yarikoptic this still occurring?

yarikoptic

yarikoptic commented on Oct 3, 2013

@yarikoptic
Contributor, Author

I would/will need to check... recent build of 0.12 still had it though: test_from_records_sequencelike (pandas.tests.test_frame.TestDataFrame) ... Segmentation fault

yarikoptic

yarikoptic commented on Oct 8, 2013

@yarikoptic
Contributor, Author

unfortunately current master (v0.12.0-761-gaa9d9b9) is still susceptible:

test_from_records_sequencelike (pandas.tests.test_frame.TestDataFrame) ...
Program received signal SIGSEGV, Segmentation fault.
0xb5a85898 in __pyx_pf_6pandas_3lib_118is_string_array (__pyx_self=0x0, __pyx_v_values=0x1671068) at pandas/lib.c:33183
33183         __Pyx_INCREF((PyObject*)__pyx_t_3);
(gdb) bt 10
#0  0xb5a85898 in __pyx_pf_6pandas_3lib_118is_string_array (__pyx_self=0x0, __pyx_v_values=0x1671068) at pandas/lib.c:33183
#1  0xb5a84664 in __pyx_pw_6pandas_3lib_119is_string_array (__pyx_self=0x0, __pyx_v_values=<numpy.ndarray at remote 0x1671068>) at pandas/lib.c:32971
#2  0x0009e498 in PyCFunction_Call (func=<built-in function is_string_array>, arg=(<numpy.ndarray at remote 0x1671068>,), kw=0x0) at ../Objects/methodobject.c:101
#3  0x0002e178 in PyObject_Call (func=<built-in function is_string_array>, arg=(<numpy.ndarray at remote 0x1671068>,), kw=0x0) at ../Objects/abstract.c:2529
#4  0xb5a7c258 in __pyx_pf_6pandas_3lib_104infer_dtype (__pyx_self=0x0, __pyx_v__values=<numpy.ndarray at remote 0x1671068>) at pandas/lib.c:31301
#5  0xb5a785dc in __pyx_pw_6pandas_3lib_105infer_dtype (__pyx_self=0x0, __pyx_v__values=<numpy.ndarray at remote 0x1671068>) at pandas/lib.c:30599
#6  0x00152acc in call_function (pp_stack=0xbeffaa84, oparg=1) at ../Python/ceval.c:4009
#7  0x0014d54c in PyEval_EvalFrameEx (
    f=Frame 0x10b1e40, for file /home/yoh/pandas/pandas/pandas/core/common.py, line 1423, in _possibly_cast_to_datetime (value=<numpy.ndarray at remote 0x1671068>, dtype=None, coerce=False, v=<numpy.ndarray at remote 0x1671068>), throwflag=0) at ../Python/ceval.c:2666
#8  0x0014fdc4 in PyEval_EvalCodeEx (co=0xc30da8, 
    globals={'_mut_exclusive': <function at remote 0xc92878>, '_values_from_object': <function at remote 0xc925b8>, '_any_none': <function at remote 0xc928d0>, '_to_pydatetime': <function at remote 0xc93be8>, 'LooseVersion': <classobj at remote 0xb05b78>, '_pad_2d_datetime': <function at remote 0xc921f0>, '_where_compat': <function at remote 0xc93c40>, '_take_nd_generic': <function at remote 0xc8f7c8>, 'intersection': <function at remote 0xc92ea8>, '_pprint_dict': <function at remote 0xc93ea8>, '_maybe_promote': <function at remote 0xc8fea8>, '_INT64_DTYPE': <numpy.dtype at remote 0xb6744768>, 'UTF8Recoder': <classobj at remote 0xc8e088>, 'interpolate_2d': <function at remote 0xc92458>, '_maybe_upcast': <function at remote 0xc8ff58>, 'collections': <module at remote 0xb6c69460>, '_default_index': <function at remote 0xc927c8>, 'rands': <function at remote 0xc92a30>, 'save': <function at remote 0xc95090>, 'BytesIO': <classobj at remote 0xb6b84628>, '_ensure_object': <built-in function ensure_object>, '_lcd_dtypes': <funct...(truncated), locals=0x0, args=0x11c229c, argcount=2, kws=0x11c22a4, 
    kwcount=0, defs=0xc91d6c, defcount=1, closure=0x0) at ../Python/ceval.c:3253
#9  0x00153390 in fast_function (func=<function at remote 0xc92718>, pp_stack=0xbeffad54, n=2, na=2, nk=0) at ../Python/ceval.c:4117
(More stack frames follow...)

and here is the code of the wrapper function that calls it

32960   /* Python wrapper */
32961   static PyObject *__pyx_pw_6pandas_3lib_119is_string_array(PyObject *__pyx_self, PyObject *__pyx_v_values); /*proto*/
32962   static PyMethodDef __pyx_mdef_6pandas_3lib_119is_string_array = {__Pyx_NAMESTR("is_string_array"), (PyCFunction)__pyx_pw_6pandas_3lib_119is_string_array, METH_O, __Pyx_DOCSTR(0)};
32963   static PyObject *__pyx_pw_6pandas_3lib_119is_string_array(PyObject *__pyx_self, PyObject *__pyx_v_values) {
32964     CYTHON_UNUSED int __pyx_lineno = 0;
(gdb) 
32965     CYTHON_UNUSED const char *__pyx_filename = NULL;
32966     CYTHON_UNUSED int __pyx_clineno = 0;
32967     PyObject *__pyx_r = 0;
32968     __Pyx_RefNannyDeclarations
32969     __Pyx_RefNannySetupContext("is_string_array (wrapper)", 0);
32970     if (unlikely(!__Pyx_ArgTypeTest(((PyObject *)__pyx_v_values), __pyx_ptype_5numpy_ndarray, 1, "values", 0))) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 215; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
32971     __pyx_r = __pyx_pf_6pandas_3lib_118is_string_array(__pyx_self, ((PyArrayObject *)__pyx_v_values));
32972     goto __pyx_L0;
32973     __pyx_L1_error:;
32974     __pyx_r = NULL;

in python it is at

(gdb) py-bt
#7 Frame 0x10b1e40, for file /home/yoh/pandas/pandas/pandas/core/common.py, line 1423, in _possibly_cast_to_datetime (value=<numpy.ndarray at remote 0x1671068>, dtype=None, coerce=False, v=<numpy.ndarray at remote 0x1671068>)
    inferred_type = lib.infer_dtype(v)
#11 Frame 0x11c2140, for file /home/yoh/pandas/pandas/pandas/core/series.py, line 2461, in _try_cast (arr=<numpy.ndarray at remote 0x1671068>, take_fast_path=True)
    arr = _possibly_cast_to_datetime(arr, dtype)

This seems to happen when the v passed to 'infer_dtype' is array(['foo', 'foo', 'foo', 'foo', 'foo', 'foo'], dtype=object) ... and I do not see anything obviously wrong in the code there in pandas/src/inference.pyx, so I tend to blame cython (0.19.1+git34-gac3e3a2-1) or even python... ? ;)
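
One way to probe a suspect input like this without taking down the whole test run is to execute the call in a child interpreter and look at how the child exited. The sketch below assumes a POSIX system, where a process killed by a signal is reported with a negative return code; `run_isolated` is a hypothetical helper, not something from pandas or nose.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and report how it ended.

    Returns (returncode, signal_number). signal_number is None unless the
    child was killed by a signal (POSIX reports that as a negative code).
    """
    proc = subprocess.run([sys.executable, "-c", snippet])
    killed_by = -proc.returncode if proc.returncode < 0 else None
    return proc.returncode, killed_by


# Hypothetical usage: replace the snippet with the suspect call, for example
# building the object array of 'foo' strings and passing it to infer_dtype.
code, sig = run_isolated("values = ['foo'] * 6")
print(code, sig)
```

If the child is killed by a segmentation fault, `sig` would carry the signal number, and the parent process stays alive to report it.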

yarikoptic

yarikoptic commented on Oct 8, 2013

@yarikoptic
Contributor, Author

weird thing (as commented in 5150) -- it doesn't segfault on my home armel box ... versions of everything are identical if I see it right... only a few choices of packages are different... weird...

jreback

jreback commented on Oct 8, 2013

@jreback
Contributor

do you and/or the other armel have a locale set? (in ci/print_versions.py)....?
could be a unicode object passed to cython code expecting a string

yarikoptic

yarikoptic commented on Oct 8, 2013

@yarikoptic
Contributor, Author

ah -- interesting idea!

on both boxes:

locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

and print_versions reports

byteorder: little
LC_ALL: None
LANG: None
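
To compare the two boxes from inside the interpreter, the same kind of fields can be gathered with the standard library alone. This is a minimal sketch rather than the actual print_versions script; `locale_report` is a hypothetical name, and the preferred-encoding entry is an extra that does not appear in the output above.

```python
import locale
import os
import sys


def locale_report():
    """Collect the kind of fields shown above (byteorder, LC_ALL, LANG)."""
    return {
        "byteorder": sys.byteorder,
        # os.environ.get returns None for unset variables.
        "LC_ALL": os.environ.get("LC_ALL"),
        "LANG": os.environ.get("LANG"),
        # Not part of the output above; added here as an extra data point.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


for key, value in locale_report().items():
    print("%s: %s" % (key, value))
```

Running this on both machines and diffing the output would show whether the interpreter sees any locale difference that the shell `locale` command does not.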

I guess I should just disable that test on armel for now and be done with it!


jreback

jreback commented on Oct 8, 2013

@jreback
Contributor

great... the locale thing is causing all kinds of weird havoc... I think most of those issues have been nailed...

jreback

jreback commented on Oct 11, 2013

@jreback
Contributor

closing as a known failure caused by locale issues on armel

jtratner

jtratner commented on Oct 11, 2013

@jtratner
Contributor

Adding a skip test?

jreback

jreback commented on Oct 11, 2013

@jreback
Contributor

I think @yarikoptic is skipping locally (on armel only)

jtratner

jtratner commented on Oct 11, 2013

@jtratner
Contributor

Wouldn't you want it to pass on armel without tweaks?


jreback

jreback commented on Oct 11, 2013

@jreback
Contributor

in an ideal world, sure, but it's a locale issue and I don't have any easy way to debug it. If @yarikoptic wants to, great.

jreback

jreback commented on Oct 11, 2013

@jreback
Contributor

ok... reopening so we can add a knownfail test
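
A platform-conditional skip along these lines could be sketched with the standard library as follows. This is an illustration under stated assumptions, not the change that went into pandas: `looks_like_arm` and the test class are hypothetical, and `platform.machine()` cannot tell armel from armhf (a Debian build could consult `dpkg --print-architecture` instead).

```python
import platform
import unittest


def looks_like_arm(machine=None):
    """Rough guess at an ARM host based on platform.machine().

    Assumption: any machine string starting with "arm" counts; this does
    not distinguish armel from armhf.
    """
    machine = platform.machine() if machine is None else machine
    return machine.lower().startswith("arm")


class TestFromRecordsSketch(unittest.TestCase):
    @unittest.skipIf(looks_like_arm(), "known segfault on armel, see this issue")
    def test_from_records_sequencelike(self):
        # Placeholder body; the real test lives in pandas/tests/test_frame.py.
        self.assertTrue(True)
```

A skip is used rather than an expected-failure marker because a segfault kills the interpreter before the test framework can record any result.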

16 remaining items


Metadata

Labels: Testing (pandas testing functions or related to the test suite)
Participants: @yarikoptic, @jreback, @jtratner