BUG: pd.api.extensions.take does not accept NumpyExtensionArray #59177

@andrewgsavage

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

arr = pd.arrays.NumpyExtensionArray(np.array([1, 2, 3]))
pd.api.extensions.take(arr, [2])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\mambaforge\envs\test_py300\Lib\site-packages\pandas\core\algorithms.py", line 1166, in take
    raise TypeError(
TypeError: pd.api.extensions.take requires a numpy.ndarray, ExtensionArray, Index, or Series, got NumpyExtensionArray.

Issue Description

I'd expect NumpyExtensionArray to be considered an ExtensionArray, and so to be compatible with pd.api.extensions.take.

Expected Behavior

The call should return the element at position 2:

<NumpyExtensionArray>
[3]
Length: 1, dtype: int32

Installed Versions

Replace this line with the output of pd.show_versions()

Aloqeely commented on Jul 4, 2024

Member

Thanks for the report! I'm surprised that NumpyExtensionArray fails the isinstance check. cc @jbrockmendel

On a separate matter, is there a specific reason you're using NumpyExtensionArray? I don't think there are many benefits to using it.

mutricyl commented on Jul 4, 2024

Contributor

For the record, this issue is linked to pint-pandas development, in particular hgrecco/pint-pandas#239.

The new version of the pandas.core.algorithms.take function introduced in #52981 is very restrictive, and I wonder why NumpyExtensionArray is not in the list of accepted types.

NumpyExtensionArray comes into play because pd.array is used in the PintArray constructor. Depending on the provided dtype, pd.array will return an ExtensionArray subclass or fall back to NumpyExtensionArray.

For example:

>>> pd.array([1+1j,2,3]).dtype  
NumpyEADtype('complex128') # NumpyExtensionArray
>>> pd.array([1,2,3]).dtype    
Int64Dtype() # ExtensionArray

There is no built-in pandas ExtensionArray for complex numbers, so pd.array falls back to NumpyExtensionArray.

rhshadrach commented on Jul 4, 2024

Member

If the check is disabled, the take operation appears to work without issue. Agreed this should work - PRs to fix are welcome.
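Until a fix lands, downstream code could sidestep the check with a small wrapper. The sketch below is only an illustration: take_with_fallback is a hypothetical helper name, not part of pandas or pint-pandas, and it assumes the TypeError comes from the type check discussed in this thread. It tries pd.api.extensions.take first and, if that raises TypeError, calls the array's own take method.

```python
import pandas as pd


def take_with_fallback(arr, indices, allow_fill=False, fill_value=None):
    """Hypothetical helper: use pd.api.extensions.take, else arr.take."""
    try:
        return pd.api.extensions.take(
            arr, indices, allow_fill=allow_fill, fill_value=fill_value
        )
    except TypeError:
        # Affected pandas versions reject NumpyExtensionArray here;
        # the array's own take method performs the same selection.
        return arr.take(indices, allow_fill=allow_fill, fill_value=fill_value)


# pd.array falls back to NumpyExtensionArray for complex input.
values = pd.array([1 + 1j, 2, 3])
taken = take_with_fallback(values, [2, 0])
print(taken.to_numpy().tolist())
```

On versions where the check already accepts NumpyExtensionArray, the wrapper simply forwards to pd.api.extensions.take, so it should be safe to leave in place across upgrades.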

Labels on Jul 4, 2024: added Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff); removed Needs Triage (issue that has not been reviewed by a pandas team member).

mutricyl commented on Jul 4, 2024

Contributor

take

A commit that references this issue was added on Jul 6, 2024.