Skip to content

A better url builder #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
femtotrader opened this issue Aug 24, 2015 · 3 comments
Closed

A better url builder #78

femtotrader opened this issue Aug 24, 2015 · 3 comments
Milestone

Comments

@femtotrader
Copy link
Contributor

Hello,

I noticed that each data reader have it's own way to build URL.

That's sometimes very ugly.

For example there is things like (yahoo/components.py)

idx_mod = idx_sym.replace('^', '@%5E')

the use of urlencode should probably be prefered

pylint doesn't like in yahoo/actions.py (tabulations)

url = (_URL + 's=%s' % symbol + \
            '&a=%s' % (start.month - 1) + \
            '&b=%s' % start.day + \
            '&c=%s' % start.year + \
            '&d=%s' % (end.month - 1) + \
            '&e=%s' % end.day + \
            '&f=%s' % end.year + \
            '&g=v')

and many others... (a global lint would be required)

Maybe we should add in utils a function to build url from a base URL (and endpoint) and a dictionary of parameters and encourage each data reader developer to use it.

Such a function could be

def _encode_url(url, params):
    """
    Return encoded url with parameters
    """
    s_params = urlencode(params)
    if len(s_params)!=0:
        return url + '?' + s_params
    else:
        return url

So in google/daily.py we could have

params = {
    'q': sym,
    'startdate': start.strftime('%b %d, %Y'),
    'enddate': end.strftime('%b %d, %Y'),
    'output': "csv"
}
url = _encode_url(_URL, params)
return _retry_read_url(url, retry_count, pause, 'Google')

instead of

url = "%s%s" % (_URL,
                urlencode({"q": sym,
                           "startdate": start.strftime('%b %d, ' '%Y'),
                           "enddate": end.strftime('%b %d, %Y'),
                           "output": "csv"}))

(I think '%b %d, ' '%Y' is a typo it should be '%b %d, %Y')

or in yahoo/daily.py instead of

url = (_URL + 's=%s' % sym +
       '&a=%s' % (start.month - 1) +
       '&b=%s' % start.day +
       '&c=%s' % start.year +
       '&d=%s' % (end.month - 1) +
       '&e=%s' % end.day +
       '&f=%s' % end.year +
       '&g=%s' % interval +
       '&ignore=.csv')

we could have

params = {
    's': sym,
    'a': start.month - 1,
    'b': start.day,
    'c': start.year,
    'd': end.month - 1,
    'e': end.day,
    'f': end.year,
    'g': interval,
    'ignore': '.csv'
}
url = _encode_url(_URL, params)

WorldBank API url create could also be improved:

in _get_data, instead of

url = ("http://api.worldbank.org/countries/" + countries + "/indicators/" +
       indicator + "?date=" + str(start) + ":" + str(end) +
       "&per_page=25000&format=json")

"http://api.worldbank.org" is often repeat

in such a url there is 3 parts:

base url: "http://api.worldbank.org"
endpoint: "/countries/US/indicators/NY.GNS.ICTR.GN.ZS" (for example)
parameters: date=2012:2015&per_page=25000&format=json

We shouldn't repeat ourself
https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

we could have

global

_BASE_URL = 'http://api.worldbank.org'

in get_data

    endpoint = "/countries/{countries}/indicators/{indicator}"\
        .format(countries=countries, indicator=indicator)
    params = {
        'date': "%s:%s" % (start, end),
        'per_page': 25000,
        'format': 'json'
    }
    url = _encode_url(_BASE_URL + endpoint, params)

Maybe we might define how data reader "should" build URLs.

Kind regards

@davidastephens
Copy link
Member

I don't think we necessarily need standards, but happy to have a pull request that cleans it up..

@davidastephens davidastephens added this to the 0.2.0 milestone Aug 25, 2015
@femtotrader
Copy link
Contributor Author

I'm facing some difficulties to squash commits.

@davidastephens
Copy link
Member

Closed via #83 and #79.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants