testing.assert_frame_equal obj parameter not always respected #27023

Closed
ajwood opened this issue Jun 24, 2019 · 5 comments · Fixed by #27068
Labels: Error Reporting (incorrect or improved errors from pandas); Testing (pandas testing functions or related to the test suite)

@ajwood
Contributor

ajwood commented Jun 24, 2019

Code Sample

import pandas as pd

df = pd.DataFrame({'a': [1,2], 'b': [2,3]})

df_coldiff = pd.DataFrame({'a': [1,2], 'ZZZ': [2,3]})
pd.testing.assert_frame_equal(df, df_coldiff, obj='myobj')
# raises: 
#  AssertionError: myobj.columns are different
#  
#  myobj.columns values are different (50.0 %)
#  [left]:  Index(['a', 'b'], dtype='object')
#  [right]: Index(['a', 'ZZZ'], dtype='object')


df_valuediff = pd.DataFrame({'a': [1,2], 'b': [2,999]})
pd.testing.assert_frame_equal(df, df_valuediff, obj='myobj')
# raises:
#  AssertionError: DataFrame.iloc[:, 1] are different
#  
#  DataFrame.iloc[:, 1] values are different (50.0 %)
#  [left]:  [2, 3]
#  [right]: [2, 999]

Problem description

In the first example where the column names differ, the 'myobj' label is used in the assertion message.

In the second example where the values differ, the 'myobj' label does not appear.

It looks to me like these strings should be formatted with obj instead of being hardcoded as DataFrame:

https://github.com/pandas-dev/pandas/blob/master/pandas/util/testing.py#L1252
https://github.com/pandas-dev/pandas/blob/master/pandas/util/testing.py#L1267
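
To make the suggestion concrete, here is a minimal sketch in plain Python of the difference between a hardcoded label and one built from obj. Both helper names are hypothetical and exist only for illustration; this is not the pandas source, just the string formatting idea under the assumption that the per-column label is built from the column position.

```python
def column_label_hardcoded(idx):
    """Hypothetical helper: always says 'DataFrame', whatever the caller passed as obj."""
    return "DataFrame.iloc[:, {idx}]".format(idx=idx)


def column_label(idx, obj="DataFrame"):
    """Hypothetical helper: carries the caller's obj label into the per-column label."""
    return "{obj}.iloc[:, {idx}]".format(obj=obj, idx=idx)


# With the default label the two helpers agree; with a custom label only
# the second one reflects it.
print(column_label_hardcoded(1))
print(column_label(1, obj="myobj"))
```

With the second form, the default behaviour is unchanged while a custom obj shows up in the per-column message as well.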

Expected Output

I expected the assertion message for the second example to be:

#  AssertionError: myobj.iloc[:, 1] are different
#  
#  myobj.iloc[:, 1] values are different (50.0 %)
#  [left]:  [2, 3]
#  [right]: [2, 999]
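
One way to check for this expectation in a test is to capture the assertion message and inspect its first line. The sketch below is only an illustration: assertion_message and uses_label are hypothetical helpers, not part of pandas, and they work with any callable that raises AssertionError.

```python
def assertion_message(check, *args, **kwargs):
    """Run check(*args, **kwargs); return the AssertionError text, or None if it passes."""
    try:
        check(*args, **kwargs)
    except AssertionError as err:
        return str(err)
    return None


def uses_label(message, label):
    """Return True when the first line of the message starts with the given label."""
    if not message:
        return False
    lines = message.splitlines()
    return bool(lines) and lines[0].startswith(label)
```

For the second example above, one would call assertion_message(pd.testing.assert_frame_equal, df, df_valuediff, obj='myobj') and pass the result to uses_label with 'myobj'. On pandas 0.24.2 this would be False, since the message starts with DataFrame; with the expected output it would be True.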

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-52-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.24.2
pytest: 4.4.1
pip: 19.1
setuptools: 41.0.1
Cython: 0.29.7
numpy: 1.16.3
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: 2.0.1
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: 1.2.1
tables: 3.5.1
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.7
lxml.etree: 4.3.3
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.3.5
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@WillAyd
Member

WillAyd commented Jun 25, 2019

I don't think this is really intended as part of the API, but it sure seems like a reasonable change. Want to make a PR?

@WillAyd added the Error Reporting and Testing labels on Jun 25, 2019
@ajwood
Contributor Author

ajwood commented Jun 26, 2019

Sure, I can issue a PR. It's my first time pulling/building/running the pandas project though, so it might take me a while to figure it out.

I don't think this is really intended as part of the API

Out of curiosity, could you elaborate? Am I looking at this the wrong way if I see it as a clear bug in the assertion message?

@WillAyd
Member

WillAyd commented Jun 26, 2019

Out of curiosity, could you elaborate?

I'm just looking at the docstring:

Specify object name being compared, internally used to show appropriate
assertion message.

So it was never really intended for general-purpose usage, but I think it's simple enough to change based off the research you've already done.

Sure, I can issue a PR. It's my first time pulling/building/running the pandas project though, so it might take me a while to figure it out.

Be sure to check out our contributing guide if you have not already:

https://pandas.pydata.org/pandas-docs/stable/development/contributing.html#contributing-to-pandas

@ajwood
Contributor Author

ajwood commented Jun 26, 2019

Thanks, looking at it now. Do you think the commit message prefix for this change should be TST or BUG?

@WillAyd
Member

WillAyd commented Jun 26, 2019

TST works
