BUG: groupby-transform produces NaN for series keys with as_index=False #37093
Comments
The issue seems to occur with this line in generic.py added in June
Thanks for the report! Agreed that this is an issue, but I don't think the line identified is the cause. This path relies on computing the aggregation and then broadcasting the result to the frame. Tangential to this, the code
gives
whereas I think "X" should be added as a column to the result. Resolving this won't affect this issue, however (and in fact, may make the output worse!). I will open a new issue after a little more investigation.
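To illustrate the "compute the aggregation, then broadcast the result to the frame" path described above, here is a minimal hand-written sketch. The frame, the key, and the use of `Series.map` for the broadcast step are illustrative assumptions; this is not the code in generic.py, only the same idea written out by hand.

```python
import pandas as pd

# Hypothetical data, chosen only to show the two steps.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
key = pd.Series(["x", "y", "x"])  # shares the default index with df

# Step 1: aggregate once per group (one value per distinct key).
agg = df["a"].groupby(key).sum()

# Step 2: broadcast each group's value back onto the original rows.
broadcast = key.map(agg)

# The built-in transform yields the same values, aligned to df's index.
transformed = df["a"].groupby(key).transform("sum")

print(broadcast.tolist())
print(transformed.tolist())
```

If the broadcast step aligns on the wrong labels (for example, on group labels instead of the frame's index), the lookup finds no match and fills with NaN, which is one plausible way a result like the one reported could arise.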
The OP now produces
and this looks like the expected result to me.
First-time contributor here; I'm going to try to write a test case for this.
take
Hi, I have created a PR with the tests. Could I request some help reviewing it? Thank you!
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
With version 1.1.3 this produces a dataframe containing NaN.
Expected Output
Expected is 1.0 and this is also what older versions produce (1.0.5).
It seems the following conditions have to be satisfied to trigger this bug:
- the groupby key is a Series (`pd.Series([0])` works, though it seems accidental)
- `as_index=False`
- `transform`
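The three conditions above can be combined in a small sketch. The data and key values are hypothetical, since the original code sample is not shown here, so this may not reproduce the NaN result exactly as reported on 1.1.3; it only shows the shape of the call being discussed.

```python
import pandas as pd

# Hypothetical frame and key; not the reporter's original sample.
df = pd.DataFrame({"a": [1.0]})
key = pd.Series(["x"])  # condition 1: the groupby key is a Series

# Conditions 2 and 3: as_index=False combined with transform.
result = df.groupby(key, as_index=False).transform("min")
print(result)

# For comparison: the same call with as_index left at its default.
baseline = df.groupby(key).transform("min")
print(baseline)
```

Comparing `result` with `baseline` on a given pandas version is a quick way to check whether `as_index=False` changes the transformed values.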
Output of pd.show_versions()
INSTALLED VERSIONS
commit : db08276
python : 3.6.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.76-linuxkit
Version : #1 SMP Tue May 26 11:42:35 UTC 2020
machine : x86_64
processor :
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.3
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.2.3
setuptools : 41.2.0
Cython : 0.29.15
pytest : 5.4.3
hypothesis : 5.6.0
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : 2.8.5 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.9.0
pandas_datareader: None
bs4 : 4.9.1
bottleneck : 1.3.2
fsspec : 0.6.2
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.17.1.post2
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.15
tables : None
tabulate : 0.8.6
xarray : None
xlrd : 1.2.0
xlwt : None
numba : 0.48.0