I think the description of this method is incorrect. According to the docs, passing "first" for the "keep" parameter takes the first duplicate values, while passing "last" takes the last duplicate values.
However, its real effect is to sort the duplicate values by their index.
I hope this can be corrected soon.
Activity
chris-b1 commented on Dec 13, 2018
I don't think the docstring is wrong, but it could be updated to note that the resulting index is sorted.
Title changed from "The mistake of Series.nlargest" to "DOC: Series.nlargest sorts index with duplicates"
songmoo commented on Dec 14, 2018
Can I try to fix it?
nfreundlich commented on Dec 14, 2018
Did you take the issue?
thoo commented on Dec 24, 2018
@chris-b1 I think the description for keep could be clearer. Currently it says: The index order might not be the same as the order of the index of the Series if the Series is not ordered.
For instance, I would naturally assume the index order for keep='last' would be 'd' and 'e' instead of 'c' and 'd', by thinking of the ordering of the index.
Omnomnominous commented on Jan 15, 2019
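@thoo I think the description is correct insofar as it is the order of the items in the index array. It's possible that it could be reworded to be a bit clearer, like so:
- ``first`` : return the first n occurrences in given index order
- ``last`` : return the last n occurrences in reverse index order
@chris-b1 One thing that I noticed while looking into this is that when you call s.nlargest(2, keep="last") it doesn't return the items sorted by index, but in reverse order by index. For example:
>>> s = pd.Series([3, 1, 3, 2, 3], index=['e', 'a', 'c', 'b', 'd'])
>>> s.nlargest(3, keep='last')
d    3
c    3
e    3
I feel as if this is a bug in its own way and that it should instead be returning the items in the original index order.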
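The tie-breaking discussed in this thread can be modelled in a few lines of plain Python. This is only a sketch of the observed behaviour, not pandas' implementation: the function name nlargest_model and its parameters are hypothetical, and it assumes ties are resolved by a stable sort over the data visited front to back (keep='first') or back to front (keep='last').

```python
def nlargest_model(values, labels, n, keep="first"):
    """Hypothetical model of the tie-breaking discussed here (not pandas code)."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    pairs = list(zip(labels, values))
    if keep == "last":
        # Visit the data back to front so that later duplicates win ties.
        pairs.reverse()
    # sorted() is stable, also with reverse=True, so equal values stay in
    # the order in which they were visited.
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


values = [3, 1, 3, 2, 3]
labels = ["e", "a", "c", "b", "d"]

print(nlargest_model(values, labels, 3, keep="first"))
# [('e', 3), ('c', 3), ('d', 3)]
print(nlargest_model(values, labels, 3, keep="last"))
# [('d', 3), ('c', 3), ('e', 3)]
```

Under this model, keep='last' yields the tied labels in reverse index order (d, c, e), which matches the output reported in the comment rather than the original index order.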
DOC: Fix #24268 by updating description for keep in Series.nlargest (#…
upstream sync (#1)