Skip to content

bpo-32861: urllib.robotparser fix incomplete __str__ methods. #5711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 14, 2018
Merged

bpo-32861: urllib.robotparser fix incomplete __str__ methods. #5711

merged 3 commits into from
May 14, 2018

Conversation

michael-lazar
Copy link
Contributor

@michael-lazar michael-lazar commented Feb 17, 2018

The RobotFileParser's string representation was incomplete and missing some valid rule lines.

https://bugs.python.org/issue32861

https://bugs.python.org/issue32861

The RobotFileParser's string representation was incomplete and missing
some valid rule lines.
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have added just few suggestions. And the two unnecessary trailing newlines should be kept in maintained releases.

Please add a news entry.

@@ -190,7 +190,10 @@ def request_rate(self, useragent):
return self.default_entry.req_rate

def __str__(self):
return ''.join([str(entry) + "\n" for entry in self.entries])
ret = [str(entry) for entry in self.entries]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be faster:

entries = self.entries
if self.default_entry is not None:
    entries = entries + [self.default_entry]
return '\n\n'.join(map(str, entries))

@@ -222,10 +225,14 @@ def __init__(self):
def __str__(self):
ret = []
for agent in self.useragents:
ret.extend(["User-agent: ", agent, "\n"])
ret.append("User-agent: {0}".format(agent))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f-strings can be used in 3.6+.

for line in self.rulelines:
ret.extend([str(line), "\n"])
return ''.join(ret)
ret.append(str(line))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just

ret.extend(map(str, self.rulelines))

@michael-lazar
Copy link
Contributor Author

michael-lazar commented Apr 3, 2018

All suggestions have been implemented, I have no strong opinions on any of them. I also added a news entry.

Is backporting part of this PR, or do you merge this into 3.8 and then create separate issues for the other python branches? Is that something that I can help with?

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Just add your credits.

@@ -0,0 +1,3 @@
The urllib.robotparser's ``__str__`` representation now includes wildcard
entries and the "Crawl-delay" and "Request-rate" fields. Also removes extra
newlines that were being appended to the end of the string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add "Patch by yourname." and add your name into Misc/ACKS.

@michael-lazar
Copy link
Contributor Author

Cool, I added my name to the news entry. I'm already in the ACKS file so all good there.

@serhiy-storchaka serhiy-storchaka merged commit bd08a0a into python:master May 14, 2018
@miss-islington
Copy link
Contributor

Thanks @michael-lazar for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7, 3.6, 3.7.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-6795 is a backport of this pull request to the 3.7 branch.

@miss-islington
Copy link
Contributor

Sorry, @michael-lazar and @serhiy-storchaka, I could not cleanly backport this to 2.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker bd08a0af2d88c590ede762102bd42da3437e9980 2.7

@bedevere-bot
Copy link

GH-6796 is a backport of this pull request to the 3.6 branch.

@serhiy-storchaka serhiy-storchaka removed their assignment Dec 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants