-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add groupby(...).agg_index #56992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It was a bit opaque to me in the seaborn example why the index was needed outside a groupby aggregation function being called. A MultiIndex composed of |
No objection, though itd be nice if matt's idea is viable and we can obviate the need for this. |
Converting the groups to a MultiIndex is not performant, and at least at current state does not arrive at the correct dtypes for e.g. EAs.
|
In the Seaborn context appears type preservation isn't important as values are cast to float in the end: https://github.com/mwaskom/seaborn/blob/a3cb0f1b33551eb25fed1907d0abdd0237fac215/seaborn/categorical.py#L640 As of now, I would lean -0 on adding |
No opposition here, will close if no one is in favor of this. |
Seaborn is of course only just one example that triggered my question (and they indeed convert to floats, because for plotting that's all you need in the end). It's always difficult to estimate how much used this would be in practice. But personally, it feels somewhat useful to me that you can get the unique group keys for a groupby object. We also already expose the indices of the groups (and my feeling is that the groups keys itself might be more often useful than the indices). If we add it, I would name maybe name it differently, though. |
I don't think there is any disagreement there. The question is whether
We already have
That's why I think |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Ref: #56521 (comment)
With
groupby(...).grouper
deprecated, we can expose.agg_index
on the groupby object so users can still determine the index that will be on aggregations (when as_index is True).If we do go forward with this, I'd like it in 2.2.1 because of the aforementioned deprecation.
cc @jorisvandenbossche @jbrockmendel @mroeschke