BUG: Enlarging DataFrame with 2d indexer converts ints to floats #47503
Comments
This is the actual output:
And this is ctot before I do the sums:
Thanks @FlorinAndrei for the report. Note that the row expansion with a 2d indexer for

# single column DataFrame
df = pd.DataFrame([70, 150], dtype="i8")
df.loc["Total"] = df.sum(axis=0)
print(df.dtypes)
# 0 int64
# dtype: object
df = pd.DataFrame([70, 150], dtype="i8")
df.loc["Total", :] = df.sum(axis=0)
print(df.dtypes)
# 0 float64
# dtype: object
df = pd.DataFrame([70, 150], dtype="i8")
df.loc[:, "Total"] = df.sum(axis=1)
print(df.dtypes)
# 0 int64
# Total int64
# dtype: object
# multi-column DataFrame
df = pd.DataFrame([[70, 150], [66, 81]], dtype="i8")
df.loc["Total"] = df.sum(axis=0)
print(df.dtypes)
# 0 int64
# 1 int64
# dtype: object
df = pd.DataFrame([[70, 150], [66, 81]], dtype="i8")
df.loc["Total", :] = df.sum(axis=0)
print(df.dtypes)
# 0 float64
# 1 float64
# dtype: object
df = pd.DataFrame([[70, 150], [66, 81]], dtype="i8")
df.loc[:, "Total"] = df.sum(axis=1)
print(df.dtypes)
# 0 int64
# 1 int64
# Total int64
# dtype: object

The result dtype from df.sum() is correct, so the title of this issue is misleading and the MRE could be reduced to

df = pd.DataFrame([70, 150], dtype="i8")
df.loc["Total", :] = pd.Series([220], dtype="i8")
df.dtypes
# 0 float64
# dtype: object

so it appears to be an indexing issue, i.e. setting values with

contributions and PRs to fix welcome
this is as expected and nothing to be done here
can you assign this issue to me?
@jreback - why should expanding an integer column with an integer value coerce to float?
In case the examples are not clear,

does not coerce to float with 1d indexer

df = pd.DataFrame([70, 150], dtype="i8")
df.loc["Total"] = df.sum(axis=0)
print(df.dtypes)
# 0 int64
# dtype: object

does coerce to float with 2d indexer

df = pd.DataFrame([70, 150], dtype="i8")
df.loc["Total", :] = df.sum(axis=0)
print(df.dtypes)
# 0 float64
# dtype: object

and only happens with

looks like a bug to me. These should both produce the same output?
Happens because of the reindex we are doing on the DataFrame before adding the values
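The effect of that reindex step can be reproduced in isolation. The snippet below is a minimal sketch of the mechanism, not the pandas internals themselves: it assumes the enlargement behaves like an explicit `DataFrame.reindex` that adds the missing label, which fills the new row with NaN and so upcasts the integer column to float64 before any value is written.

```python
import pandas as pd

df = pd.DataFrame([70, 150], dtype="i8")

# Adding a label that does not exist yet creates a row of NaN,
# and NaN cannot be stored in an int64 column, so the column is upcast.
expanded = df.reindex(list(df.index) + ["Total"])
print(expanded.dtypes)

# Writing an integer into the new row afterwards does not undo the upcast.
expanded.loc["Total", 0] = 220
print(expanded.dtypes)
```

The original `df` is left untouched by `reindex`, so its column stays int64; only the expanded copy carries the float dtype.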
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
I generate a dataframe from an np.array() of integers. I want to calculate all the totals: by row, by column, and the grand total. The code works, but the output is of float type.
Expected Behavior
Since all inputs are integers and I only add the numbers, I expected the output dataframe to also be only integers.
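One way to get integer totals regardless of how `.loc` enlargement behaves is to avoid enlarging through `.loc` altogether. The sketch below is a possible workaround, not a fix endorsed in this thread; the sample data and the label "Total" are illustrative assumptions, since the original reproducible example is not shown here. It builds the totals row as its own integer DataFrame and appends it with `pd.concat`, so no NaN-filled row is ever created.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the reporter's array.
df = pd.DataFrame(np.array([[70, 150], [66, 81]]), dtype="i8")

# Column totals as a one-row integer frame labelled "Total".
col_totals = df.sum(axis=0).to_frame("Total").T
with_totals = pd.concat([df, col_totals])

# Row totals as a new column; the last row then holds the grand total.
with_totals["Total"] = with_totals.sum(axis=1)

print(with_totals.dtypes)
```

Adding the row totals as a plain column assignment matches the `df.loc[:, "Total"]` cases in the thread, which keep int64.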
Installed Versions
INSTALLED VERSIONS
commit : e8093ba
python : 3.9.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.4.3
numpy : 1.23.0
pytz : 2021.3
dateutil : 2.8.2
setuptools : 62.6.0
pip : 22.1.2
Cython : 0.29.30
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : 0.8.1
fsspec : 2021.11.1
gcsfs : None
markupsafe : 2.1.1
matplotlib : 3.5.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 8.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.1
snappy : None
sqlalchemy : None
tables : None
tabulate : 0.8.10
xarray : 0.20.2
xlrd : 2.0.1
xlwt : None
zstandard : None