Description
There have been previous attempts to reduce the length of the Series/DataFrame repr (pandas.options.display.max_rows), e.g. #20514. Related pandas-dev email: https://mail.python.org/pipermail/pandas-dev/2018-March/000732.html
In that discussion, I once made the following proposal to introduce two thresholds:
- We have 2 thresholds instead of 1 (the current 'max_rows'): a number of rows to show in a truncated repr, and a max number of rows to show without truncating
- For 'big' dataframes, we show a truncated repr. And then I would go even lower than 20 and only show first/last 5 (so like a max_rows of 10)
- For 'small' dataframes, we show the full dataframe without truncating, up to the threshold.
We would still need to define those two thresholds. But for example, using the current max_rows of 60: we could show a full repr up to 60 rows, and once the number of rows > 60, we only show 10 (first/last 5).
You can then still set both thresholds at the same number (like 20, as in the linked PR above) to not get this variable behaviour.
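To make the rule concrete, here is a minimal sketch of the selection logic in plain Python. The helper name rows_to_display and its parameters are hypothetical and only illustrate the proposal; this is not existing pandas code, and the default values simply mirror the example numbers above (60 and 10).

```python
def rows_to_display(n_rows, max_rows=60, min_rows=10):
    """Return the positional indices of the rows a repr would show.

    Hypothetical helper, not pandas API: `max_rows` is the largest frame
    shown in full, `min_rows` the number of rows kept once truncated.
    """
    if n_rows <= max_rows:
        # 'Small' frame: show everything, no truncation.
        return list(range(n_rows))
    # 'Big' frame: keep only the first and last min_rows // 2 rows.
    half = min_rows // 2
    head = list(range(half))
    tail = list(range(n_rows - half, n_rows))
    return head + tail


print(len(rows_to_display(60)))  # at the threshold: all 60 rows
print(rows_to_display(61))       # above it: first 5 and last 5 positions
# Same number for both thresholds: no variable behaviour, always 20 rows.
print(len(rows_to_display(61, max_rows=20, min_rows=20)))
```

The sketch assumes min_rows is not larger than max_rows; how to handle that case would still need to be decided.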
This is actually similar to what numpy arrays do (but with a bigger threshold: e.g. np.random.randn(1000) shows all 1000 elements, np.random.randn(1001) shows the first/last 3).
And it is also very similar to what R tibbles do: they have "print_min" and "print_max" options with exactly this behaviour, only their "print_max" is lower (the defaults are 10 and 20, respectively):
options(tibble.print_max = n, tibble.print_min = m): if there are more than
n rows, print only the first m rows. Use options(tibble.print_max = Inf)
to always show all rows.
Activity
simonjayhawkins commented on Jun 23, 2019
so this would add two display options, pandas.options.display.min_rows and pandas.options.display.min_columns, and two arguments to to_string and to_html(notebook=True): min_rows and min_cols?
personally, since this proposal is display related and not so relevant to IO, I would prefer not to see the additional arguments to to_string and to_html
jorisvandenbossche commented on Jun 23, 2019
I would personally only start with min_rows (we could always add the columns one later if there is demand for it).
And also personally for me, I am fine with only adding it as a general display option for now, and not necessarily to to_string / to_html.
simonjayhawkins commented on Jun 23, 2019
+1 in that case.
TomAugspurger commented on Jun 23, 2019
+1 from me as well.
jorisvandenbossche commented on Jul 3, 2019
cc @pandas-dev/pandas-core we might still include this in 0.25.0. Any concerns about the above proposal?
shoyer commented on Jul 3, 2019
+1 sounds great to me
toobaz commented on Jul 3, 2019
+1 for me too
topper-123 commented on Jul 3, 2019
I'm +1.