Skip to content

urllib.parse.urlparse doesn't check port #88037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
p-alik mannequin opened this issue Apr 16, 2021 · 5 comments
Open

urllib.parse.urlparse doesn't check port #88037

p-alik mannequin opened this issue Apr 16, 2021 · 5 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir

Comments

@p-alik
Copy link
Mannequin

p-alik mannequin commented Apr 16, 2021

BPO 43871
Nosy @orsenthil, @vstinner, @tirkarthi, @p-alik, @miguendes
PRs
  • gh-88037: Move port validation logic to parsing time #25774
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = None
    created_at = <Date 2021-04-16.17:30:08.679>
    labels = ['3.10']
    title = "urllib.parse.urlparse doesn't check port"
    updated_at = <Date 2021-05-01.09:34:26.557>
    user = 'https://github.com/p-alik'

    bugs.python.org fields:

    activity = <Date 2021-05-01.09:34:26.557>
    actor = 'miguendes'
    assignee = 'orsenthil'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2021-04-16.17:30:08.679>
    creator = 'palik'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43871
    keywords = ['patch']
    message_count = 5.0
    messages = ['391238', '391242', '391265', '391282', '392577']
    nosy_count = 5.0
    nosy_names = ['orsenthil', 'vstinner', 'xtreak', 'palik', 'miguendes']
    pr_nums = ['25774']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue43871'
    versions = ['Python 3.10']

    @p-alik
    Copy link
    Mannequin Author

    p-alik mannequin commented Apr 16, 2021

    It is possible to get valid ParseResult from the urlparse function even for a non-numeric port value. Only by requesting the port it fails[1].
    Would it be an improvement if _checknetloc[2] validates the value of port properly?

    // code snippet
    Python 3.8.5 (default, Jan 27 2021, 15:41:15) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from urllib.parse import urlparse
    >>> uri = 'xx://foo:bar'
    >>> uri_parts = urlparse(uri)
    >>> uri_parts.netloc
    'foo:bar'
    >>> uri_parts.port
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.8/urllib/parse.py", line 174, in port
        raise ValueError(message) from None
    ValueError: Port could not be cast to integer value as 'bar'
    // code snippet

    [1]

    port = int(port, 10)

    [2]
    def _checknetloc(netloc):

    @tirkarthi
    Copy link
    Member

    I guess moving port validation logic to parsing time is done as part of #16780

    @orsenthil
    Copy link
    Member

    Treating this as bug in itself might be a better idea than waiting for a ipv6 scope introduction, which had few caveats.

    Would it be an improvement if _checknetloc[2] validates the value of port properly?

    Yes, we could check if it is an int. That should be sufficient.

    @orsenthil orsenthil added the 3.10 only security fixes label Apr 16, 2021
    @orsenthil orsenthil self-assigned this Apr 16, 2021
    @orsenthil orsenthil added the 3.10 only security fixes label Apr 16, 2021
    @orsenthil orsenthil self-assigned this Apr 16, 2021
    @p-alik
    Copy link
    Mannequin Author

    p-alik mannequin commented Apr 17, 2021

    Thank you for your swift response and your willingness to add port validation to _checknetloc.

    I think the validation itself should compound both exceptional branches implemented in port[3]

    • port is an int
    • port is in the range

    [3]

    cpython/Lib/urllib/parse.py

    Lines 173 to 178 in e1903e1

    except ValueError:
    message = f'Port could not be cast to integer value as {port!r}'
    raise ValueError(message) from None
    if not ( 0 <= port <= 65535):
    raise ValueError("Port out of range 0-65535")
    return port

    @miguendes
    Copy link
    Mannequin

    miguendes mannequin commented May 1, 2021

    I also think the validation logic should be ran as early as possible.

    I gave it a shot and implemented it.

    I appreciate any reviews: #25774

    Got some ideas from #16780

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 23, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants