Skip to content

fix 'null' in lower case as a missing value in the new Yahoo API, make compatible with the older pandas versions #357

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 6, 2017

Conversation

OlegShteynbuk
Copy link
Contributor

@OlegShteynbuk OlegShteynbuk commented Jul 3, 2017

closes #342

add 'null' in lower case to na_values argument to read_csv

@jreback
Copy link
Contributor

jreback commented Jul 3, 2017

can u add the test from the issue so that we can see that this works

@OlegShteynbuk
Copy link
Contributor Author

add 'null' in lower case to na_values arguments

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented Jul 3, 2017

the test added

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a note in the whatsnew for 0.5

@@ -132,6 +132,11 @@ def test_get_data_multiple_symbols(self):
web.get_data_yahoo(sl, '2012')

@skip_on_exception(RemoteDataError)
def test_get_data_null_as_missing_data(self):
# just test that we succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add both cases and assert the dtypes of the result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@OlegShteynbuk
Copy link
Contributor Author

a note in the whatsnew for 0.5 added

@OlegShteynbuk OlegShteynbuk changed the title add 'null' in lower case to na_values argument to read_csv fix 'null' in lower case as a missing value in a new Yahoo API Jul 3, 2017
@OlegShteynbuk OlegShteynbuk changed the title fix 'null' in lower case as a missing value in a new Yahoo API fix 'null' in lower case as a missing value in the new Yahoo API Jul 3, 2017
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small test change, otherwise lgtm.

@@ -132,6 +132,18 @@ def test_get_data_multiple_symbols(self):
web.get_data_yahoo(sl, '2012')

@skip_on_exception(RemoteDataError)
def test_get_data_null_as_missing_data_no_adjust(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parametrize this with adjust_prices=True & False (and remove the test below)

@jreback
Copy link
Contributor

jreback commented Jul 4, 2017

FYI we are testing on older pandas versions and such so no special handling, everything should work. null should be fine in newer (as its default) and older (as you added it).

@jreback jreback added this to the 0.5.0 milestone Jul 4, 2017
@jreback jreback added the bug label Jul 4, 2017
@OlegShteynbuk
Copy link
Contributor Author

Added parametrize. In order to use parametrize created an inner function(added a level of indirection) because skip_on_exception decorator changes the signature. The other solution would be to make skip_on_exception decorator a signature-preserving decorator. But this is outside of the scope of this PR as it will affect other tests; my first impression is that it will affect other tests for the better as will allow using skip_on_exception decorator with parametrize, it could be a little bit slower but the difference is rather small and for testing it should not be an issue. I can go ahead and create a new issue.

@@ -133,7 +133,7 @@ def test_get_data_multiple_symbols(self):

@pytest.mark.parametrize('adj_pr', [True, False])
def test_get_data_null_as_missing_data(self, adj_pr):
#workaround as skip_on_exception decorator changes signature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does using a fixture fix this?

@pytest.fixture(params=[True,False], scope='module')
def adj_pr(request)
    return request.param

@skip_on_exception(RemoteDataError)
def test_get_data_null_as_missing_data(self, adj_pr):
    ...

Copy link
Contributor Author

@OlegShteynbuk OlegShteynbuk Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bashtage thanks for the code sample interesting usage of fixtures, wasn't able to run it successfully, but I am no pytest expert just a casual user. This approach has an additional level of indirection(a new fixture) and probably requires some advanced pytest knowledge, but a signature-preserving decorator hopefully should work just like any other decorator, just put one line in front of a function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @bashtage gave you everything there, so there is nothing else to do but copy and paste. What happens when you do this?

@OlegShteynbuk OlegShteynbuk changed the title fix 'null' in lower case as a missing value in the new Yahoo API fix 'null' in lower case as a missing value in the new Yahoo API, make compatible with the older pandas versions Jul 5, 2017
@pytest.mark.parametrize('adj_pr', [True, False])
def test_get_data_null_as_missing_data(self, adj_pr):
# workaround as skip_on_exception decorator changes signature
# should it be changed to a signature-preserving decorator?
Copy link
Contributor

@gfyoung gfyoung Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decorators generally are difficult to make signature-preserving AFAIK, which is why I didn't bother to try when I created it 😄 . Does pytest and this decorator not work well together? Did you try using skip_on_exception above the pytest.mark one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's true that signature-preserving decorators are difficult to make but there are at least two packages that I was planning on trying.

I have tried all combinations of skip_on_exception'and the pytest.mark above and below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when you try either? (i.e. what stacktrace do you get?)

Copy link
Contributor Author

@OlegShteynbuk OlegShteynbuk Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bellow is the error message

E TypeError: test_null_as_missing_data() takes exactly 2 arguments (1 given)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it will work with global functions not when they are inside the class, not sure

Copy link
Contributor Author

@OlegShteynbuk OlegShteynbuk Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at the decorator package not sure how to use it in our case, but there is another one that I can use too, can create an issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep no hurry at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at the decorator package not sure how to use it in our case,

IMO, using another package seems a bit overkill. I imagine that the fix for this issue shouldn't need to require another package just for this one decorator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I didn't make it clear but it meant instead of a decorator package, one or the other, not an extra one in addition to the decorator package

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that actually clarified what you were saying...but no worries, so long as the PR gets emerged eventually 😄

@@ -134,7 +134,7 @@ def test_get_data_multiple_symbols(self):
@pytest.mark.parametrize('adj_pr', [True, False])
def test_get_data_null_as_missing_data(self, adj_pr):
# workaround as skip_on_exception decorator changes signature
# should it be changed to a signature-preserving decorator?
# TODO: should it be changed to a signature-preserving decorator?
Copy link
Contributor

@gfyoung gfyoung Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that signature preservation is difficult, I think a better message would be this:

def test_..
  # TODO: We can't decorate the test function on top of
  # parametrization because it errors.  Investigate this later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have tried both ways on top and below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def test_..
  # TODO: We can't decorate the test function along with
  # parametrization because it errors.  Investigate this later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks; it's definitely better comments

@bashtage
Copy link
Contributor

bashtage commented Jul 5, 2017

FWIW, this works as expected

Outside the class

@pytest.fixture(params=[True, False])
def adj_pr(request):
    return request.param

And the test function -- obviously one would remove the intentional raise RemoteDataError('This is a test')

    @skip_on_exception(RemoteDataError)
    def test_get_data_null_as_missing_data(self, adj_pr):
        result = web.get_data_yahoo('SRCE', '20160626', '20160705',
                                    adjust_price=adj_pr)
        # sanity checking
        assert result.dtypes.all() == np.floating
        raise RemoteDataError('This is a test')

@jreback
Copy link
Contributor

jreback commented Jul 5, 2017

@OlegShteynbuk if you want to change to use a fixture as @bashtage suggest would be great. ping on green.

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented Jul 5, 2017

I have tried @bashtage suggestion before with the fixture outside the class and inside the class, not sure what was the problem, maybe my environment is different, I have python 2.7.13 and pytest 3.0.7 on LInux, not sure what was his testing environment, maybe python 3; need to take a break...

It could be an interesting psychological experiment to find out if a preference for the solution depends on one's background; for me coming from Java and C++ encapsulation is a big deal and having a nested function makes it all nicely encapsulated in one place instead of having two functions in two different places; but it's always a good idea to look at the same problem from a different point of view, I already learned a lot about pytest from this discussion, thanks.

@gfyoung
Copy link
Contributor

gfyoung commented Jul 6, 2017

I think in this case I would go with what @bashtage implemented. Test fixtures are meant to be used across tests, so it makes sense to share this resource and not encapsulate. We have plenty of instances of encapsulation (functions within functions) in the codebase though.

Also, I'm a little surprised you went for inner functions given your Java and C++ background, since IINM, neither supports explicit declarations of inner functions. Sure, you have lambdas, but they are anonymous functions instead of the explicit one that you defined currently.

@OlegShteynbuk
Copy link
Contributor Author

I have repeated the @bashtage test with pretty much the same results, doesn't run in my environment, below is the output and i have run it beforein different combinations, there are actually two problems as it says collected 21 items but it should be 22, and if you look at the output there is only one record for test_get_data_null_as_missing_data instead of two for True and False
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_null_as_missing_data <- pandas_datareader/_testing.py FAILED

below is the output

========================================================================

platform linux2 -- Python 2.7.13, pytest-3.0.7, py-1.4.31, pluggy-0.4.0 -- /home/oleg/programs/python/anaconda2/envs/pandas-datareader_dev/bin/python
cachedir: .cache
rootdir: /home/oleg/projects/openSource/python/pandas-datareader/pandas-datareader, inifile:
plugins: xdist-1.16.0
collected 21 items

pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_yahoo <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_multiple_symbols <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_adjust_price <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_multiple_symbols_two_dates <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_single_symbol <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_yahoo_DataReader_multi <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_yahoo_reader_class <- pandas_datareader/_testing.py PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_null_as_missing_data <- pandas_datareader/_testing.py FAILED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_yahoo_fails PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_quote_series PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_quote_string PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_quote_stringlist PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_quote_comma_name PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_components_dow_jones SKIPPED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_components_dax SKIPPED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_components_nasdaq_100 SKIPPED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_interval xfail
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_date_ret_index XPASS
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_yahoo_actions xfail
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_get_data_yahoo_actions_invalid_symbol PASSED
pandas_datareader/tests/yahoo/test_yahoo.py::TestYahoo::test_yahoo_DataReader xfail

==================================================================================================== FAILURES ====================================================================================================
__________________________________________________________________________________ TestYahoo.test_get_data_null_as_missing_data __________________________________________________________________________________

args = (<pandas_datareader.tests.yahoo.test_yahoo.TestYahoo object at 0x7f03d1549890>,), kwargs = {}

@wraps(f)
def wrapper(*args, **kwargs):
    try:
      f(*args, **kwargs)

E TypeError: test_get_data_null_as_missing_data() takes exactly 2 arguments (1 given)

pandas_datareader/_testing.py:26: TypeError
===================================================================== 1 failed, 13 passed, 3 skipped, 3 xfailed, 1 xpassed in 41.76 seconds ======================================================================

@OlegShteynbuk
Copy link
Contributor Author

Also, I'm a little surprised you went for inner functions given your Java and C++ background, since IINM, neither supports explicit declarations of inner functions. Sure, you have lambdas, but they are anonymous functions instead of the explicit one that you defined currently.
=======================

there are also local and anonymous classes in Java that can be used to define a member function.
Was thinking more about the programming style that I was using most of the time in Java/C++, and also looked at this task as a standalone task not intended for reuse.

@OlegShteynbuk
Copy link
Contributor Author

on a different note, probably a newbie github question, what does ping on green means, don't see any green buttons available for me to push on.

@gfyoung
Copy link
Contributor

gfyoung commented Jul 6, 2017

there are also local and anonymous classes in Java that can be used to define a member function

But they're anonymous nonetheless 😄

What @jreback means by pinging on green is this: when all of the tests pass (as they have now), just write another comment like this:

@jreback : Everything is green and ready to go!

@OlegShteynbuk
Copy link
Contributor Author

@jreback : Everything is green and ready to go!

@bashtage
Copy link
Contributor

bashtage commented Jul 6, 2017

@OlegShteynbuk is right -- Python 2.7 doesn't correctly preserve the function signature when decorated, and so pytest doesn't realize that it needs pass the fixture in. This results in the wrong number of arguments being passed, and an error occurs.

Here it fails (2.7)

https://travis-ci.org/bashtage/pandas-datareader/builds/250677402

While it works fine on all Python 3.x builds. Yet another reason for 2.7 pass to oblivion.

@jreback jreback merged commit cdd8b4d into pydata:master Jul 6, 2017
@jreback
Copy link
Contributor

jreback commented Jul 6, 2017

thanks!

pls open an issue to change the decorator as well.

@OlegShteynbuk
Copy link
Contributor Author

issue #363

jreback added a commit to jreback/pandas-datareader that referenced this pull request Jul 10, 2017
jreback added a commit to jreback/pandas-datareader that referenced this pull request Jul 10, 2017
jreback added a commit that referenced this pull request Jul 10, 2017
* TST: fix testing for yahoo get_data_null_as_missing comparisons

xref #357

* Revert "CI: use latest pandas"

This reverts commit 3b8b11a.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception downloading yahoo historical data with adjust_price=True on fix-yahoo branch
4 participants