
ENH: Set numeric_only=True in aggregations #34521

@WillAyd

I've noticed that I make this mistake quite often:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"a": ["1"] * 3, "b": np.ones(3)})
>>> df.sum()
a    111
b      3
dtype: object

Getting 111 for column a is harmless in this example, but it is quite annoying in most real-life use cases: summing a string column concatenates every value, which can produce exceedingly large strings that exhaust memory or tie up the interpreter.

The numeric_only argument gets around this issue:

>>> df.sum(numeric_only=True)
b    3.0
dtype: float64

Though I'm curious whether this should really be the default.
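
For anyone who wants to guard against this today, here is a minimal sketch of two ways to keep string columns out of the aggregation. The explicit `dtype=object` on column a is my own assumption, added so that the snippet behaves the same regardless of how a given pandas version infers string dtypes. The variable names are just for illustration.

```python
import numpy as np
import pandas as pd

# Column "a" is forced to object dtype (an assumption for reproducibility).
df = pd.DataFrame({"a": pd.Series(["1"] * 3, dtype=object), "b": np.ones(3)})

# Default behaviour: the string column is "summed" by concatenation.
naive = df.sum()

# Option 1: ask the aggregation to skip non-numeric columns.
explicit = df.sum(numeric_only=True)

# Option 2: drop non-numeric columns first, then aggregate.
selected = df.select_dtypes(include="number").sum()

print(naive)
print(explicit)
print(selected)
```

Both options should return only column b; the second is handy when several aggregations are run on the same numeric subset.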

Metadata

Labels: Deprecate (Functionality to remove in pandas), Duplicate Report (Duplicate issue or pull request)
