Added webscraper for Thrift Savings Plan #157

Merged: 1 commit, Feb 6, 2016

e2thenegpii
Contributor

I added a web scraper that loads TSP fund data into a DataFrame. To import pandas_datareader on my machine, I also had to add the requirement requests>=2.3.0.

@davidastephens
Member

Thanks!

Looks like the data is somewhat unreliable (see the Travis errors; I re-ran them and got the same result). Maybe add a check to ensure the scraper got data, plus a couple of retries. You can then raise RemoteDataError and add the following to the test:

try:
    # test code here
except RemoteDataError as e:  # pragma: no cover
    raise nose.SkipTest(e)

Also, can you add a note in the what's new?
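
A minimal sketch of the check-and-retry idea suggested above, not the code that was merged. It assumes a hypothetical zero-argument callable `fetch` that returns the scraped rows, and it defines a local stand-in for `RemoteDataError` so the example runs without the library installed.

```python
import time


class RemoteDataError(IOError):
    """Local stand-in for the library's RemoteDataError (assumption)."""


def fetch_with_retries(fetch, retries=3, pause=0.0):
    """Call fetch() up to `retries` times until it returns non-empty data.

    `fetch` is a hypothetical zero-argument callable that returns the
    scraped rows as any sized container.
    """
    last_error = None
    for attempt in range(retries):
        try:
            data = fetch()
        except IOError as exc:
            # Network-style failure: remember it and try again.
            last_error = exc
        else:
            # The "did the scraper get data?" check.
            if len(data) > 0:
                return data
        if attempt < retries - 1:
            time.sleep(pause)
    raise RemoteDataError(
        "no data after {0} attempts (last error: {1})".format(retries, last_error)
    )
```

A test can then wrap the call in the try/except shown in the comment above, so that a persistently empty response skips the test instead of failing it.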

@e2thenegpii
Contributor Author

I looked into why the test was failing. The CSV file that I was downloading
from TSP.gov places whitespace and newlines irregularly, and pandas 0.14.1
and older didn't handle the sloppy whitespace well, which caused the error.

I added a callback in the DataReader class that is called when extracting
the text from the response object. This should let subclasses do things
like fix up sloppy responses when needed, as is the case for TSP. By
default, the callback returns the response object's content unchanged.

I also added a what's new 2.3 document.
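
A rough sketch of the callback pattern described above, under stated assumptions: the class and method names (`BaseReader`, `sanitize_response`, `SloppyCsvReader`) are illustrative and not taken from the library, and the standard-library `csv` module stands in for the pandas parser.

```python
import csv
import io


class BaseReader(object):
    """Hypothetical base class; names are illustrative, not the library's."""

    def sanitize_response(self, text):
        # Default hook: hand the response text back unchanged.
        return text

    def read(self, text):
        # `text` stands in for the body of an HTTP response.
        cleaned = self.sanitize_response(text)
        return list(csv.reader(io.StringIO(cleaned)))


class SloppyCsvReader(BaseReader):
    """Subclass that tidies irregular whitespace and blank lines before parsing."""

    def sanitize_response(self, text):
        lines = []
        for line in text.splitlines():
            fields = [field.strip() for field in line.split(",")]
            if any(fields):  # drop lines that are empty or whitespace only
                lines.append(",".join(fields))
        return "\n".join(lines)


# Placeholder input, not real fund data.
raw = "date, fund_a ,fund_b\n\n 2015-10-01 , 1.0, 2.0 \n   \n"
rows = SloppyCsvReader().read(raw)
```

Keeping the cleanup in an overridable hook means the base class's download and parse path stays the same for every source, and only the readers with messy responses need to override anything.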


.. ipython:: python

    import pandas_datareader.tsp as tsp
    tspreader = tsp.TSPReader(start='2015-10-1',end='2015-12-31')
Contributor

A space is missing after the comma, before end:

tspreader = tsp.TSPReader(start='2015-10-1', end='2015-12-31')

See https://www.python.org/dev/peps/pep-0008/#whitespace-in-expressions-and-statements

@davidastephens
Member

Thanks. Can you squash this so we can merge?

@davidastephens
Member

I think this got squashed before the lint PR. Can you rebase so we can merge?

Thanks.

@e2thenegpii
Contributor Author

Everything should be set to go. Thanks for your help.

davidastephens added a commit that referenced this pull request on Feb 6, 2016:
Added webscraper for Thrift Savings Plan
davidastephens merged commit 7cd1a8d into pydata:master on Feb 6, 2016
@davidastephens
Member

Thanks!
