API: integer Extension Array #20700

Description

@jreback
Contributor

xref #8640

I could easily imagine an ExtensionArray implemented as a numpy array of the appropriate dtype plus a bitmask, which would fully support integer NA across the board. I don't think this would be too hard. As a bonus, it would be zero-copy compatible with the pyarrow implementation (for the future).
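
As a rough illustration of the values-plus-mask idea, here is a minimal sketch. The class name `MaskedIntArray` and its methods are hypothetical and are not pandas API. It also assumes a one-byte-per-element boolean mask rather than a packed bitmask, purely to keep the example short.

```python
import numpy as np


class MaskedIntArray:
    """Hypothetical illustration only, not the pandas implementation."""

    def __init__(self, values, mask):
        self.values = np.asarray(values)          # integer payload
        self.mask = np.asarray(mask, dtype=bool)  # True marks a missing slot
        if self.values.shape != self.mask.shape:
            raise ValueError("values and mask must have the same shape")

    @classmethod
    def from_sequence(cls, seq, dtype="int8"):
        # Treat None and float NaN as missing.
        mask = np.array(
            [x is None or (isinstance(x, float) and np.isnan(x)) for x in seq],
            dtype=bool,
        )
        # Store a placeholder (0) under the mask so the payload stays integer.
        values = np.array(
            [0 if missing else x for x, missing in zip(seq, mask)], dtype=dtype
        )
        return cls(values, mask)

    def to_list(self):
        # Missing slots come back as None; the placeholder is never exposed.
        return [None if m else int(v) for v, m in zip(self.values, self.mask)]


arr = MaskedIntArray.from_sequence([1, 2, 3, None])
print(arr.values.dtype, arr.to_list())  # int8 [1, 2, 3, None]
```

The point of the design is that the payload never has to be upcast to float to hold a missing value; the mask carries that information separately.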

Making these the actual default (e.g. when integers are inferred with or without nulls) might be non-trivial, but let's implement them first. These would give rise to a hierarchy of dtypes, e.g. IntegerDtype, Int8Dtype.
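
To make the dtype hierarchy concrete, here is a small sketch. Only the names `IntegerDtype` and `Int8Dtype` come from the text above; `Int64Dtype`, the `numpy_dtype` attribute, and the `itemsize` property are assumptions added for illustration.

```python
import numpy as np


class IntegerDtype:
    """Hypothetical base class shared by all nullable integer dtypes."""

    name = None         # string alias, e.g. "Int8"
    numpy_dtype = None  # dtype of the underlying numpy payload

    @property
    def itemsize(self):
        # Bytes per element of the underlying numpy array.
        return np.dtype(self.numpy_dtype).itemsize


class Int8Dtype(IntegerDtype):
    name = "Int8"
    numpy_dtype = "int8"


class Int64Dtype(IntegerDtype):  # assumed sibling, not named in the issue
    name = "Int64"
    numpy_dtype = "int64"


print(isinstance(Int8Dtype(), IntegerDtype), Int8Dtype().itemsize)  # True 1
```

A shared base class lets code check "is this any nullable integer?" with a single `isinstance` test while each subclass pins down the width.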

Activity

added this to the 0.24.0 milestone on Apr 15, 2018
jreback (Contributor, Author) commented on Apr 15, 2018
added a commit that references this issue on May 13, 2018
3b75e85
jreback (Contributor, Author) commented on May 13, 2018

Here is a fully functional (extension-wise) integer NA implementation: https://github.com/jreback/pandas/tree/intna
It doesn't break anything and coexists with what is already there.

I have enabled dtype inference to accept the new types via a Registry, e.g.

In [1]: pd.Series([1,2,3, np.nan], dtype='Int8')
Out[1]: 
0      1
1      2
2      3
3    NaN
dtype: Int8

So construction is pretty flexible now.
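
One way a registry like this could resolve a string such as 'Int8' to a dtype is sketched below. The `Registry` class, its `register` and `find` methods, and the stub `Int8Dtype` are all hypothetical names for illustration; the actual code in the branch may look quite different.

```python
class Registry:
    """Hypothetical lookup table from string aliases to dtype classes."""

    def __init__(self):
        self._dtypes = {}

    def register(self, dtype_cls):
        # Usable as a class decorator: records the class under its alias.
        self._dtypes[dtype_cls.name] = dtype_cls
        return dtype_cls

    def find(self, spec):
        # Return a dtype instance for a known alias, otherwise None so the
        # caller can fall back to the regular numpy dtype handling.
        if isinstance(spec, str) and spec in self._dtypes:
            return self._dtypes[spec]()
        return None


registry = Registry()


@registry.register
class Int8Dtype:
    name = "Int8"


print(type(registry.find("Int8")).__name__)  # Int8Dtype
print(registry.find("int8"))                 # None
```

Note that the lookup is case-sensitive in this sketch, so the capitalized 'Int8' used in the example above stays distinct from numpy's lowercase 'int8'.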

Next up is ops.

cc @TomAugspurger @jorisvandenbossche

jorisvandenbossche (Member) commented on May 14, 2018

Cool!

Is your intention to do a PR to add this to pandas, or to have it as a separate package for now?

added a commit that references this issue on May 14, 2018
6fc19f9
jreback (Contributor, Author) commented on May 14, 2018

It still needs quite a bit more testing and work (arithmetic ops are done, but comparison ops and more indexing tests are still needed).
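
For readers unfamiliar with masked ops, here is a sketch of how arithmetic and comparison could work on a values-plus-mask pair. It assumes that a missing value on either side makes the result missing; that rule, and the helper names `masked_binop` and `to_list`, are assumptions for illustration rather than anything stated in this thread.

```python
import numpy as np


def masked_binop(op, left, left_mask, right, right_mask):
    # Compute on the raw payloads, then mark a slot missing if either
    # input was missing (assumed propagation rule).
    result = op(left, right)
    mask = left_mask | right_mask
    return result, mask


def to_list(values, mask):
    # Render masked slots as None for display.
    return [None if m else v.item() for v, m in zip(values, mask)]


a = np.array([1, 2, 3], dtype="int64")
a_mask = np.array([False, False, True])
b = np.array([10, 2, 30], dtype="int64")
b_mask = np.array([False, True, False])

total, total_mask = masked_binop(np.add, a, a_mask, b, b_mask)
equal, equal_mask = masked_binop(np.equal, a, a_mask, b, b_mask)

print(to_list(total, total_mask))  # [11, None, None]
print(to_list(equal, equal_mask))  # [False, None, None]
```

Because the payload under a masked slot is arbitrary, the result there is computed but never shown; only the mask decides what is visible.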

But I think it should go directly into pandas. Note that this does not actually switch the base inference (e.g. [1, 2, 3] still resolves to int64; we can do that at a later point). I suspect we will have to change quite a lot of tests, as we assume float conversions in a myriad of ways.
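
The inference behaviour described here can be checked with a short snippet. This assumes a pandas release that ships the nullable integer dtypes (the issue is on the 0.24.0 milestone); the variable names are just for illustration, and how missing values are displayed varies between versions.

```python
import pandas as pd

default = pd.Series([1, 2, 3])                    # plain integers
with_missing = pd.Series([1, 2, None])            # default inference with a null
nullable = pd.Series([1, 2, None], dtype="Int8")  # explicit opt-in

print(default.dtype)       # int64
print(with_missing.dtype)  # float64
print(nullable.dtype)      # Int8
```

The middle case is the float conversion that many existing tests rely on: a single missing value silently turns an integer column into floats unless the extension dtype is requested explicitly.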

added a commit that references this issue on May 21, 2018
0758f1d

25 remaining items


Labels

Dtype Conversions (unexpected or buggy dtype conversions); ExtensionArray (extending pandas with custom dtypes or arrays); Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); Numeric Operations (arithmetic, comparison, and logical operations)