form_blocks vs make_block inconsistency #19179

Closed
@jbrockmendel

Description

Taking a cue from #19174 to revisit some logic in core.internals. form_blocks and make_block have some very similar logic. The question here is: are the discrepancies between them intentional?

Taking some liberties to make it more obvious how the logic is shared, the current code looks like:

def make_block(values, placement, klass=None, ndim=None, dtype=None,
               fastpath=False):
    [...]
        dtype = dtype or values.dtype
        vtype = dtype.type

        if isinstance(values, SparseArray):
            block_type = 'sparse'
        elif issubclass(vtype, np.floating):
            block_type = 'float'
        elif (issubclass(vtype, np.integer) and
              issubclass(vtype, np.timedelta64)):
            block_type = 'timedelta'
        elif (issubclass(vtype, np.integer) and
              not issubclass(vtype, np.datetime64)):
            block_type = 'int'
        elif dtype == np.bool_:
            block_type = 'bool'
        elif issubclass(vtype, np.datetime64):
            assert not hasattr(values, 'tz')
            block_type = 'datetime'
        elif is_datetimetz(values):
            block_type = 'datetime_tz'
        elif issubclass(vtype, np.complexfloating):
            block_type = 'complex'
        elif is_categorical(values):
            block_type = 'cat'
        else:
            block_type = 'object'
[...]

def form_blocks(arrays, names, axes):
    [...]
        if is_sparse(v):
            block_type = 'sparse'
        elif issubclass(vtype, np.floating):
            block_type = 'float'
        elif issubclass(vtype, np.complexfloating):
            block_type = 'complex'
        elif issubclass(vtype, np.datetime64):
            assert not is_datetimetz(v)
            block_type = 'datetime'
        elif is_datetimetz(v):
            block_type = 'datetime_tz'
        elif issubclass(vtype, np.integer):
            block_type = 'int'
        elif dtype == np.bool_:
            block_type = 'bool'
        elif is_categorical(v):
            block_type = 'cat'
        else:
            block_type = 'object'

[...]

The two main differences here are 1) is_sparse encompasses slightly more than isinstance(values, SparseArray) and 2) the timedelta case is missing from form_blocks. Anyone know why?
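
To make the second difference concrete, here is a minimal NumPy-only sketch. The helper names classify_like_make_block and classify_like_form_blocks are hypothetical; they only mirror the dtype branches quoted above and leave out the sparse, tz-aware and categorical branches, which need pandas internals. Since np.timedelta64 is a subclass of np.integer, the form_blocks ordering as quoted appears to route timedelta data into the 'int' branch:

```python
import numpy as np


def classify_like_make_block(dtype):
    """Hypothetical helper: dtype branches in the order quoted for make_block."""
    vtype = dtype.type
    if issubclass(vtype, np.floating):
        return 'float'
    elif issubclass(vtype, np.integer) and issubclass(vtype, np.timedelta64):
        return 'timedelta'
    elif issubclass(vtype, np.integer) and not issubclass(vtype, np.datetime64):
        return 'int'
    elif dtype == np.bool_:
        return 'bool'
    elif issubclass(vtype, np.datetime64):
        return 'datetime'
    elif issubclass(vtype, np.complexfloating):
        return 'complex'
    return 'object'


def classify_like_form_blocks(dtype):
    """Hypothetical helper: dtype branches in the order quoted for form_blocks."""
    vtype = dtype.type
    if issubclass(vtype, np.floating):
        return 'float'
    elif issubclass(vtype, np.complexfloating):
        return 'complex'
    elif issubclass(vtype, np.datetime64):
        return 'datetime'
    elif issubclass(vtype, np.integer):
        # No timedelta branch here, so timedelta64 lands in this case.
        return 'int'
    elif dtype == np.bool_:
        return 'bool'
    return 'object'


td = np.dtype('timedelta64[ns]')
print(issubclass(td.type, np.integer))   # True
print(classify_like_make_block(td))      # timedelta
print(classify_like_form_blocks(td))     # int
```

For the other plain NumPy dtypes (float, int, bool, datetime64, complex, object) the two orderings give the same answer in this sketch, so timedelta is the only dtype-level disagreement it exposes.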

Activity

jreback commented on Jan 11, 2018

the make_block logic is more correct. you could remove form_blocks in favor of this I think.
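
One way to read this suggestion, sketched under assumptions: keep a single classifier that follows the make_block ordering and have the grouping step call it, so the two code paths cannot drift apart. The names get_block_type_name and group_by_block_type are hypothetical, not pandas API, and only plain NumPy dtypes are handled (the bool check is simplified to a subclass test):

```python
from collections import defaultdict

import numpy as np

# Ordered checks following the make_block ordering quoted in the issue.
# timedelta must be tested before int because np.timedelta64 is a
# subclass of np.integer.
_CHECKS = [
    ('float', lambda t: issubclass(t, np.floating)),
    ('timedelta', lambda t: issubclass(t, np.timedelta64)),
    ('int', lambda t: issubclass(t, np.integer)),
    ('bool', lambda t: issubclass(t, np.bool_)),
    ('datetime', lambda t: issubclass(t, np.datetime64)),
    ('complex', lambda t: issubclass(t, np.complexfloating)),
]


def get_block_type_name(dtype):
    """Hypothetical shared classifier: return the first matching block type."""
    for name, check in _CHECKS:
        if check(dtype.type):
            return name
    return 'object'


def group_by_block_type(arrays):
    """Hypothetical stand-in for the grouping step: map block type to positions."""
    groups = defaultdict(list)
    for i, arr in enumerate(arrays):
        groups[get_block_type_name(arr.dtype)].append(i)
    return dict(groups)


arrays = [
    np.array([1.0, 2.0]),
    np.array([1, 2]),
    np.array([1, 2], dtype='timedelta64[ns]'),
    np.array([True, False]),
]
print(group_by_block_type(arrays))
# {'float': [0], 'int': [1], 'timedelta': [2], 'bool': [3]}
```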

Added the Internals label (related to non-user accessible pandas implementation) on Jan 11, 2018.
added this to the 0.23.0 milestone on Jan 12, 2018
Labels: Clean, Internals (related to non-user accessible pandas implementation)
