Closed
Description
groupby(...).rank() currently dispatches to each group individually. It would be better to have a combined group_rank routine that ranks all groups in one pass. That is a fair bit of code, and ideally it would share some with the actual rank algos.
In [7]: ngroups = 1000
In [8]: N = 100000
In [9]: np.random.seed(1234)
In [10]: df = DataFrame({'key': np.random.randint(0, ngroups, size=N), 'value': np.arange(N)})
In [11]: %timeit df.groupby('key').rank()
1 loop, best of 3: 392 ms per loop
# comparison with group_shift_indexer, a transforming operator
In [13]: %timeit df.groupby('key').shift()
100 loops, best of 3: 3.15 ms per loop
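To illustrate what a combined routine might look like, here is a minimal NumPy sketch that ranks all groups in one sorted pass rather than dispatching per group. The function name group_rank_average is hypothetical and this is not the pandas implementation. It assumes ascending order, the 'average' tie method, and no missing values.

```python
import numpy as np


def group_rank_average(keys, values):
    """Average rank of each value within its group (hypothetical helper).

    Assumes ascending order, 'average' tie handling, and no NaNs.
    """
    keys = np.asarray(keys)
    values = np.asarray(values)
    n = len(values)
    if n == 0:
        return np.empty(0, dtype=float)

    # Sort by key first, then by value within each key.
    # np.lexsort treats the LAST array as the primary sort key.
    order = np.lexsort((values, keys))
    sk = keys[order]
    sv = values[order]

    # Mark where a new group begins in the sorted layout.
    new_group = np.concatenate(([True], sk[1:] != sk[:-1]))
    # Mark where a new run of tied values begins (a new group also ends a run).
    new_run = new_group | np.concatenate(([True], sv[1:] != sv[:-1]))

    # Broadcast each group's start position to all of its rows.
    group_starts = np.flatnonzero(new_group)
    group_lengths = np.diff(np.concatenate((group_starts, [n])))
    group_start = np.repeat(group_starts, group_lengths)

    # Broadcast each tie run's first and last position to all of its rows.
    run_starts = np.flatnonzero(new_run)
    run_lengths = np.diff(np.concatenate((run_starts, [n])))
    run_first = np.repeat(run_starts, run_lengths)
    run_last = np.repeat(run_starts + run_lengths - 1, run_lengths)

    # Average of the first and last 1-based within-group positions of the run.
    avg_rank = (run_first + run_last) / 2.0 - group_start + 1.0

    # Scatter the ranks back to the original row order.
    out = np.empty(n, dtype=float)
    out[order] = avg_rank
    return out


ranks = group_rank_average([1, 0, 1, 0, 1], [10, 5, 10, 7, 3])
print(ranks)
```

The cost is dominated by a single sort over all rows, so it does not grow with the number of groups the way a per-group dispatch does. Other tie methods, descending order, and NaN handling would need additional branches.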
Activity
WillAyd commented on Jan 25, 2018
I can take a look at this. Any tips on which methods to explore? I was thinking of adding a rank method to the GroupBy class, similar to the others, and was looking at the rank method in algos.
It wasn't immediately clear to me how best to knit that all together, so I figured I'd get your thoughts if you have any.