-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-96035: Make urllib.parse.urlparse reject non-numeric ports #98273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll also need new tests for this behavior.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
I think it makes way more sense to move port validation entirely to parse time, a la #25774, so it might not make sense to implement these fixes until that happens. What's the protocol for PRs that depend on other PRs? Should I make a PR into @gpshead's fork, or would you rather I submit this in isolation and have other adjust for it? Thanks! |
I'd also want to hear @gpshead's opinion, and I haven't looked at #25774 in enough detail to understand the problem it's trying to solve. However, the other PR has been open for a long time and looks like a more invasive change, so we may want to apply it only on main, while this PR is a more self-contained bugfix that we'll be able to backport to 3.10 and 3.11. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a unittest that would trigger this error code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's another case to consider: Unicode digits! For example, your patch still allows a port of ६
, which is the Devanagari character for 6. Your quote from the RFC on the issue says only ASCII digits are allowed.
Great catch! I had no idea that things like |
Misc/NEWS.d/next/Library/2022-10-14-19-57-37.gh-issue-96035.0xcX-p.rst
Outdated
Show resolved
Hide resolved
…cX-p.rst Co-authored-by: Jelle Zijlstra <[email protected]>
Misc/NEWS.d/next/Library/2022-10-14-19-57-37.gh-issue-96035.0xcX-p.rst
Outdated
Show resolved
Hide resolved
Thanks @kenballus for the PR, and @JelleZijlstra for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10. |
Thanks @kenballus for the PR, and @JelleZijlstra for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11. |
…ythonGH-98273) Co-authored-by: Jelle Zijlstra <[email protected]> (cherry picked from commit 6f15ca8) Co-authored-by: Ben Kallus <[email protected]>
GH-98498 is a backport of this pull request to the 3.10 branch. |
GH-98499 is a backport of this pull request to the 3.11 branch. |
…ythonGH-98273) Co-authored-by: Jelle Zijlstra <[email protected]> (cherry picked from commit 6f15ca8) Co-authored-by: Ben Kallus <[email protected]>
Co-authored-by: Jelle Zijlstra <[email protected]> (cherry picked from commit 6f15ca8) Co-authored-by: Ben Kallus <[email protected]>
Co-authored-by: Jelle Zijlstra <[email protected]> (cherry picked from commit 6f15ca8) Co-authored-by: Ben Kallus <[email protected]>
python 3.10.9 introduced python/cpython#98273 which fixes a bug in `urllib.parse.urlparse()` where certain port numbers containing whitespaces or other non-ASCII digits to be erroneously allowed. But it also changed the error message that was expected in one of our unit tests. So update our unit tests to expect the new error message. NOTE: the new error message is slightly confusing for certain port numbers (e.g -1) due to its use of `str.isdigit()` to flag whether the port could be cast as an int. Signed-off-by: Hamzah Qudsi <[email protected]>
python 3.10.9 introduced python/cpython#98273 which fixes a bug in `urllib.parse.urlparse()` where certain port numbers containing whitespaces or other non-ASCII digits to be erroneously allowed. But it also changed the error message that was expected in one of our unit tests. So update our unit tests to expect the new error message. NOTE: the new error message is slightly confusing for certain port numbers (e.g -1) due to its use of `str.isdigit()` to flag whether the port could be cast as an int. Signed-off-by: Hamzah Qudsi <[email protected]>
python 3.10.9 introduced python/cpython#98273 which fixes a bug in `urllib.parse.urlparse()` where certain port numbers containing whitespaces or other non-ASCII digits to be erroneously allowed. But it also changed the error message that was expected in one of our unit tests. So update our unit tests to expect the new error message. NOTE: the new error message is slightly confusing for certain port numbers (e.g -1) due to its use of `str.isdigit()` to flag whether the port could be cast as an int. Signed-off-by: Hamzah Qudsi <[email protected]>
python 3.10.9 introduced python/cpython#98273 which fixes a bug in `urllib.parse.urlparse()` where certain port numbers containing whitespaces or other non-ASCII digits to be erroneously allowed. But it also changed the error message that was expected in one of our unit tests. So update our unit tests to expect the new error message. NOTE: the new error message is slightly confusing for certain port numbers (e.g -1) due to its use of `str.isdigit()` to flag whether the port could be cast as an int. Signed-off-by: Hamzah Qudsi <[email protected]>
urllib.parse.urlparse uses
int
to parse port numbers, which means they can contain signs, underscores, and whitespace. This patch adds a check to ensure that only numeric port numbers parse without error.