
ENH: add option to suppress scientific notation (for small values?) #12374

@rosnfeld (Contributor)

Description

I frequently run into situations where I don't want small numbers displayed in scientific notation, for example:

In [3]: pd.set_option('display.precision', 2)

In [4]: pd.DataFrame(np.random.randn(5, 5)).corr()
Out[4]: 
      0     1         2         3     4
0  1.00 -0.57  2.15e-02 -3.48e-02 -0.64
1 -0.57  1.00  2.59e-01 -5.56e-01  0.51
2  0.02  0.26  1.00e+00  2.91e-03 -0.06
3 -0.03 -0.56  2.91e-03  1.00e+00  0.36
4 -0.64  0.51 -6.21e-02  3.63e-01  1.00

or

In [16]: pd.Series(np.random.poisson(size=1000)).value_counts(normalize=True)
Out[16]: 
0    3.80e-01
1    3.63e-01
2    1.75e-01
3    5.70e-02
4    1.80e-02
5    5.00e-03
7    1.00e-03
6    1.00e-03
dtype: float64

Scientific notation isn't helpful when you are trying to make quick comparisons across elements and the values are known to lie in a well-defined range such as -1 to 1 or 0 to 1.

I propose adding a display flag that suppresses scientific notation for small numbers and reports them as zeros instead. We could also suppress it for large numbers, but I am not sure how helpful that is. I usually only run into it with small numbers, in exactly the use cases above (correlations, proportions).

Activity

rosnfeld (Contributor, Author) commented on Feb 17, 2016

(and I volunteer to work on this if others are okay with the idea)

jreback (Contributor) commented on Feb 18, 2016

http://pandas.pydata.org/pandas-docs/stable/options.html#number-formatting

There are already 4 related ways to do things like this:
the options display.precision, display.chop_threshold, and display.float_format, plus the function pd.set_eng_float_format(accuracy=3, use_eng_prefix=True).

So what I think we need is some consolidation and maybe some docs.
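As a rough illustration of one of the options listed above, the sketch below uses display.float_format to force fixed-point output for a correlation matrix. The seed, the variable names, and the choice of a two-decimal format are assumptions made for the example, not something taken from this thread.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (the seed is arbitrary).
rng = np.random.default_rng(0)
corr = pd.DataFrame(rng.standard_normal((5, 5))).corr()

# Apply a fixed-point formatter only inside this block; the global
# display options are restored when the context manager exits.
with pd.option_context("display.float_format", "{:.2f}".format):
    text = corr.to_string()

print(text)
```

Because every float goes through the same format string, no column can fall back to scientific notation, at the cost of very small values being shown as 0.00 or -0.00.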

jreback (Contributor) commented on Feb 18, 2016

some related issues:

#9448
#6839

jreback (Contributor) commented on Feb 18, 2016

I'd love for you to take a look and see how this can be done better.

rosnfeld (Contributor, Author) commented on Feb 18, 2016

Hmm, embarrassing that I hadn't seen chop_threshold before. I've made changes to display.precision and edited its docs, and yet never noticed this. That sounds like what I want, though I can still get it to behave poorly:

In [25]: pd.set_option('display.precision', 2)
In [26]: pd.set_option('chop_threshold', 0.01)  # maybe this should be 0.005, not sure of order of operations, but I get issues either way
...
In [30]: pd.DataFrame(np.random.randn(5, 5)).corr()
Out[30]: 
      0         1     2     3         4
0  1.00 -3.14e-01  0.07 -0.28  1.42e-01
1 -0.31  1.00e+00 -0.82 -0.35  0.00e+00
2  0.07 -8.19e-01  1.00  0.54 -4.71e-01
3 -0.28 -3.50e-01  0.54  1.00  1.21e-01
4  0.14  0.00e+00 -0.47  0.12  1.00e+00

Thanks for pointing me to it though. I'll play around with this for a while and see if there's some clean-up that can be done. I would love it if I could change display.precision while working on some data and have the chop_threshold update to match, rather than having to keep them in sync by hand.
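The "keep them in sync" wish could be approximated today with a small wrapper. This is only a sketch: synced_precision is a hypothetical helper, not a pandas function, and tying the threshold to half a unit in the last displayed digit is an assumption about what "match" should mean. It only keeps the two options consistent; it does not by itself guarantee fixed-point output.

```python
import pandas as pd


def synced_precision(digits):
    """Hypothetical helper: set display.precision and a matching chop_threshold.

    The threshold is half a unit in the last displayed decimal place
    (an assumed convention), so anything that would round to zero at
    this precision is chopped to zero.
    """
    threshold = 0.5 * 10.0 ** -digits
    return pd.option_context(
        "display.precision", digits,
        "display.chop_threshold", threshold,
    )


# Both options are set together and restored on exit.
with synced_precision(2):
    current_precision = pd.get_option("display.precision")
    current_threshold = pd.get_option("display.chop_threshold")

print(current_precision, current_threshold)
```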



Participants

@jreback, @rosnfeld, @mroeschke
