Feature request: add median, mode & number of unique entries to pandas.DataFrame.describe() #7014

Closed
@tantrev

Description

This would come in handy for a number of different applications, from basic statistics, to understanding one's data, to estimating machine learning algorithmic load.

dupe of #2749

Activity

changed the title from "Feature request: add number of unique entries to pandas.DataFrame.describe()" to "Feature request: add median & number of unique entries to pandas.DataFrame.describe()" on Apr 30, 2014
changed the title from "Feature request: add median & number of unique entries to pandas.DataFrame.describe()" to "Feature request: add median, mode & number of unique entries to pandas.DataFrame.describe()" on Apr 30, 2014
jreback (Contributor) commented on Apr 30, 2014

FYI, you can easily write your own version and patch it in. Just put it in your startup code or application.

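from pandas import Series

def describe(self):
    """Describe a numeric Series with extra summary statistics."""
    # (label, value) pairs; the list order fixes the row order of the result
    stats = [('nobs',  len(self.index)),      # all rows, including NaN
             ('valid', self.count()),         # non-NaN rows only
             ('mean',  self.mean()),
             ('min',   self.min()),
             ('max',   self.max()),
             ('std',   self.std()),
             ('10%',   self.quantile(0.10)),
             ('25%',   self.quantile(0.25)),
             ('50%',   self.median()),
             ('75%',   self.quantile(0.75)),
             ('90%',   self.quantile(0.90)),
             ('skew',  self.skew()),
             ('kurt',  self.kurt())]
    s = Series(dict(stats), index=[k for k, _ in stats])
    # Flush values that are numerically indistinguishable from zero
    s[s.abs() < 0.000001] = 0.0
    return s

# Monkey-patch: replaces the built-in Series.describe for the whole session
Series.describe = describe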
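
For readers who would rather not replace Series.describe globally, a non-patching variant aimed at the three statistics named in the issue title might look like the sketch below. The helper name describe_plus and the sample frame are hypothetical and not part of pandas. The sketch assumes a non-empty frame, looks only at numeric columns, and reports the smallest value when several modes tie.

```python
import pandas as pd


def describe_plus(df):
    """Hypothetical helper: describe() plus median, mode and unique counts."""
    numeric = df.select_dtypes(include="number")
    base = numeric.describe()
    extra = pd.DataFrame({
        "median": numeric.median(),
        # mode() returns one row per tied mode; keep only the first (smallest)
        "mode": numeric.mode().iloc[0],
        "nunique": numeric.nunique(),
    }).T  # transpose so the statistics become rows, like describe()
    return pd.concat([base, extra])


# Made-up sample data, for illustration only
df = pd.DataFrame({"a": [1, 2, 2, 3, 10], "b": [0.5, 0.5, 1.5, 2.5, 2.5]})
summary = describe_plus(df)
print(summary)
```

The result keeps the usual describe() rows and appends median, mode and nunique below them, so existing code that reads rows such as "mean" or "50%" keeps working.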
added this to the Someday milestone on Apr 30, 2014
soupault (Contributor) commented on Feb 10, 2016

Having median out of the box would really be a great thing!
Not just for Series but also for GroupBy objects.

jorisvandenbossche (Member) commented on Feb 10, 2016

Note that the median is already in describe, as the 50% row.
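
A quick way to check this is to compare the "50%" entry of describe() with median() directly. The sketch below uses made-up data and also covers the GroupBy case raised earlier in the thread; it assumes a pandas version where describe() on a grouped column returns one row per group.

```python
import pandas as pd

# Made-up sample data, for illustration only
s = pd.Series([1, 3, 4, 10, 20, 50])
summary = s.describe()

# The "50%" row is the 0.5 quantile, i.e. the median
print(summary["50%"], s.median())

# Same idea for a grouped column: describe() yields a "50%" column per group
df = pd.DataFrame({"g": ["a", "a", "b", "b", "b"], "x": [1, 3, 2, 4, 9]})
grouped = df.groupby("g")["x"].describe()
print(grouped["50%"])
```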

soupault (Contributor) commented on Feb 10, 2016

@jorisvandenbossche indeed! Sorry, my bad. 👶

TomAugspurger (Contributor) commented on Feb 10, 2016

Hijacking this issue to propose a variant (and steal more from dplyr)

Why not have a DataFrame.agg that is identical to DataFrame.groupby.agg, but with a "single group"? This also goes along with the new .resample.agg, yay for synergy. Basically it takes a function or string, a list of functions or strings, or a dict mapping column names to functions or strings, and aggregates accordingly.
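
Later pandas releases do ship a DataFrame.agg method with this shape (from 0.20 onward, as far as I can tell). A minimal sketch of the three call styles described above, using a made-up frame and assuming such a release is installed:

```python
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame({"a": [1, 2, 2, 3, 10], "b": [0.5, 0.5, 1.5, 2.5, 2.5]})

# A single function name: returns a Series indexed by column
single = df.agg("median")

# A list of function names: returns a DataFrame with one row per function
by_list = df.agg(["median", "nunique"])

# A dict of column name -> functions: cells with no requested function are NaN
by_dict = df.agg({"a": ["median", "max"], "b": ["nunique"]})

print(single, by_list, by_dict, sep="\n\n")
```

The list form comes closest to the original request, since it yields a describe-like table restricted to exactly the statistics asked for.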

jorisvandenbossche (Member) commented on Feb 10, 2016

Related to #1623

added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Numeric Operations (Arithmetic, Comparison, and Logical operations) labels on Oct 24, 2016
added 2 commits that reference this issue on Nov 12, 2016
0bd99bd
b9af226

