ENH: other plotting tools via plot accessor #11978

@rhiever

  • scatter_matrix (deprecate, redirect to seaborn)
  • andrews_curves
  • parallel_coordinates
  • lag_plot (maybe rename to lag)
  • autocorrelation_plot (maybe rename to autocorrelation?)
  • bootstrap_plot (maybe rename to bootstrap?)
  • radviz

I think it would be nice to allow scatter_matrix to be called directly on a DataFrame. Currently, scatter_matrix is a separate function that takes a DataFrame as a parameter, but it seems like it would be easy enough to rework to allow it to be called directly on a DataFrame as well. Effectively, the convenience function would look something like:

def scatter_matrix(self, **kwargs):
    # self is the DataFrame; forward all options to the existing function
    return pandas.tools.plotting.scatter_matrix(self, **kwargs)

Is this feature feasible?
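
For reference, a minimal sketch of how such a method can be attached without modifying pandas itself. It assumes a pandas version that provides pd.api.extensions.register_dataframe_accessor (an API that postdates this issue) and that matplotlib is installed. The accessor name extra_plot and the class ExtraPlotAccessor are hypothetical, chosen only for illustration; the sketch calls pandas.plotting.scatter_matrix, the current import path of the function referred to above as pandas.tools.plotting.scatter_matrix.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix


# "extra_plot" is a hypothetical accessor name, not part of pandas.
@pd.api.extensions.register_dataframe_accessor("extra_plot")
class ExtraPlotAccessor:
    def __init__(self, frame):
        # pandas passes the DataFrame the accessor was reached from
        self._frame = frame

    def scatter_matrix(self, **kwargs):
        # Thin wrapper: forward the frame and all options unchanged
        return scatter_matrix(self._frame, **kwargs)


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# Returns the grid of matplotlib axes, one row and one column per variable
axes = df.extra_plot.scatter_matrix(diagonal="hist")
```

With this in place, df.extra_plot.scatter_matrix(...) behaves like the free-standing function with the frame filled in.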

Activity

TomAugspurger commented on Jan 7, 2016

Contributor

We could probably do this as DataFrame.plot.scatter_matrix. I'm not sure why this wasn't ever folded into the df.plot(kind=) function.

Also, the standard disclaimer that seaborn does this better.
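
To make the two call styles concrete, here is a toy sketch of an accessor that supports both df.plot(kind="scatter_matrix") and df.plot.scatter_matrix(). ToyFrame and ToyPlotAccessor are hypothetical stand-ins written for this illustration; this is not pandas' actual implementation, and nothing is drawn.

```python
class ToyPlotAccessor:
    """Toy model of a plot accessor; not pandas' real code."""

    _kinds = ("line", "scatter_matrix")

    def __init__(self, columns):
        self._columns = columns

    def __call__(self, kind="line", **kwargs):
        if kind not in self._kinds:
            raise ValueError(f"{kind!r} is not a valid plot kind")
        # A real accessor would draw here; the toy only reports the request.
        return {"kind": kind, "n_columns": len(self._columns), "options": kwargs}

    def scatter_matrix(self, **kwargs):
        # The named method is just a shortcut for the kind= form
        return self(kind="scatter_matrix", **kwargs)


class ToyFrame:
    """Hypothetical stand-in for a DataFrame, holding named columns."""

    def __init__(self, columns):
        self._columns = dict(columns)

    @property
    def plot(self):
        return ToyPlotAccessor(self._columns)


frame = ToyFrame({"a": [1, 2], "b": [3, 4]})
by_method = frame.plot.scatter_matrix(alpha=0.5)
by_kind = frame.plot(kind="scatter_matrix", alpha=0.5)
```

In this arrangement, adding a new plot type means adding one entry to the list of kinds and one thin method.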

jreback commented on Jan 7, 2016

Contributor

Actually, probably all of these should be added via the plot accessor: http://pandas.pydata.org/pandas-docs/stable/visualization.html#plotting-tools

added this to the Next Major Release milestone on Jan 7, 2016
changed the title from "Allow scatter_matrix to be called directly on a DataFrame" to "ENH: other plotting tools via plot accessor" on Jan 7, 2016

shoyer commented on Jan 7, 2016

Member

Something to consider: Maybe we want to deprecate some of these more complex plot types? For example, seaborn does a better job of scatter matrix than we do...

jreback commented on Jan 7, 2016

Contributor

Absolutely, let's do that. Any others?

bootstrap, autocorrelation, lag for statsmodels?

cc @jseabold
cc @josef-pkt

rhiever commented on Jan 7, 2016

Author

@shoyer: IMO the scatter matrix is a basic plot type that people would want to see for a DataFrame.

jorisvandenbossche commented on Jan 7, 2016

Member

I agree with @shoyer. These plotting methods are rather 'neglected' lately. I think we should either choose to give them more attention, or deprecate them. And personally I think it should not be the focus of pandas.

But maybe scatter_matrix, as IMO the most generic of the plotting methods listed, can be the exception?

jreback commented on Jan 7, 2016

Contributor

Why don't we defer scatter_matrix to seaborn anyhow? (So seaborn becomes an optional dep.)

josef-pkt commented on Jan 7, 2016

@jreback how well are optional circular dependencies now supported by the various packaging tools?
I haven't kept up with it, but we would get a circular dependency: pandas - seaborn - statsmodels.
I'm in favor of tight integration, in general.

I'm not a heavy pandas user directly, but I also find some quick plot methods convenient, instead of having to look for the appropriate function in another package.

About plots in statsmodels: plots are mostly in a minimal maintenance state. Because of a lack of developers, there is not much effort going into a style update to make them look better, e.g. compared to seaborn.
However, we need them in support of the models and other stats functions and still keep adding more.
For example, acf and pacf are some of our oldest plot functions and won't go away.

Trying to use seaborn as an optional statsmodels dependency to upgrade plots is an idea, but it doesn't have a champion to look into it and work on it. For many plots the focus of seaborn is exploratory analysis, which is more similar to the pandas use. For us, having a model inside the plot is inside out, because we need plots inside the models, or after having estimated a model.

jreback commented on Jan 7, 2016

Contributor

Using seaborn (or statsmodels using pandas) is no problem here:
you can simply import inside the method itself.
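
A minimal sketch of that pattern, assuming seaborn as the optional dependency. The helper import_optional and the class PlotMethods are hypothetical names used only for illustration; seaborn.pairplot is seaborn's scatter-matrix style plot. Because the import happens inside the method, loading this module never requires seaborn, which is what avoids an import-time cycle between the packages.

```python
import importlib


def import_optional(name):
    """Import a module by name, with a clearer error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"Missing optional dependency '{name}'; install it to use this plot."
        ) from err


class PlotMethods:
    """Hypothetical stand-in for a plotting accessor."""

    def __init__(self, data):
        self._data = data

    def scatter_matrix(self, **kwargs):
        # Imported here, not at module top, so seaborn is only needed
        # when this method is actually called.
        sns = import_optional("seaborn")
        return sns.pairplot(self._data, **kwargs)
```

Creating a PlotMethods object works without seaborn installed; only calling scatter_matrix triggers the import, and a missing package then produces the ImportError raised in import_optional.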

max-sixty commented on Jan 7, 2016

Contributor

Ideally we would remove all plotting code from pandas (except for say resampling callback and such). And just defer to various engines.

👍

modified the milestones: Contributions Welcome, 0.24.0 on Dec 23, 2018
removed this from the 0.24.0 milestone on Jan 25, 2019

Ochirgarid commented on Sep 2, 2019

Received this issue from CodeTriage. It seems to be an already closed issue. @jorisvandenbossche, could you explain why you reopened it?

mroeschke commented on Apr 21, 2021

Member

From the reversion of this feature in #24912 and the discussion within, it appears that there's not much appetite for supporting this directly in an accessor, and having a free-standing function is fine. Closing.

ENH: other plotting tools via plot accessor · Issue #11978 · pandas-dev/pandas