os.path.realpath raises a ValueError when path contains a null byte #117596

Open
stefanhoelzl opened this issue Apr 7, 2024 · 6 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@stefanhoelzl
Contributor

stefanhoelzl commented Apr 7, 2024

Bug report

Bug description:

import os
os.path.realpath("\x00")  # raises unexpected ValueError

The ValueError is initially raised by os.lstat.

This behavior is inconsistent with other path operations catching the ValueError raised by os.lstat:
os.path.lexists: https://github.com/python/cpython/blob/main/Lib/genericpath.py#L30
os.path.exists: https://github.com/python/cpython/blob/main/Lib/genericpath.py#L20
os.path.islink: https://github.com/python/cpython/blob/main/Lib/genericpath.py#L64
os.path.ismount: https://github.com/python/cpython/blob/main/Lib/posixpath.py#L197
os.path.ismount: https://github.com/python/cpython/blob/main/Lib/posixpath.py#L213
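
A minimal sketch of the inconsistency described above. The `probe` helper is hypothetical (it is not part of the report) and only records whether each call returns or raises. The `realpath` outcome may differ by platform and Python version, so no output is asserted here.

```python
import os


def probe(func, path):
    """Call func(path) and report either its result or the exception type raised."""
    try:
        return ("returned", func(path))
    except (ValueError, OSError) as exc:
        return ("raised", type(exc).__name__)


bad_path = "\x00"  # a path consisting of a single null byte

# The first three are predicates that swallow the ValueError from the
# underlying stat call; realpath is the function this report is about.
for func in (os.path.exists, os.path.lexists, os.path.islink, os.path.realpath):
    print(func.__name__, probe(func, bad_path))
```

On the versions listed below, the predicates are expected to return False, while realpath lets the ValueError propagate.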

CPython versions tested on:

3.10, CPython main branch

Operating systems tested on:

Linux

@stefanhoelzl stefanhoelzl added the type-bug An unexpected behavior, bug, or error label Apr 7, 2024
@stefanhoelzl
Contributor Author

I already created a PR #117573 for this issue

@serhiy-storchaka
Member

I do not think there is a bug. Functions like os.path.exists() answer the question "is there a node (of the specified type) with such a path?" Since a valid path cannot contain the null character, the answer is "no". You get the same answer for a path containing non-encodable characters. os.path.realpath() answers a different question -- "what is the real path of the specified node?"

os.path.realpath() ignores some OSError exceptions in non-strict mode. This allows it to process paths to non-existent nodes. Such errors can be fixed by making a change to the file system. But no matter what you do with the file system, an invalid path (a path containing null characters or non-encodable characters) will remain invalid.

@stefanhoelzl
Contributor Author

In case of os.path.exists I can agree.

I guess my selection of examples was not very good.
os.path.basename, os.path.dirname, os.fspath, os.path.abspath, os.path.relpath ... (the list is not complete) also do not raise a ValueError when given a path with a null byte; they just accept it as part of the path.

I would expect consistency across all path operations, but I found only os.path.realpath behaving differently. So I assumed this is the one with the bug.

Python 3.13.0a5+ (heads/fix-realpath-value-error-dirty:8da4ebb996, Apr  7 2024, 17:05:12) [GCC 13.2.1 20230918 (Red Hat 13.2.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.fspath("\x00")
'\x00'
>>> os.path.abspath("\x00")
'/var/home/stefan/Development/repos/cpython/\x00'
>>> os.path.basename("\x00")
'\x00'
>>> os.path.dirname("\x00")
''
>>> os.path.relpath("\x00")
'\x00'
>>> os.path.normpath("\x00")
'\x00'
>>> 

@serhiy-storchaka
Member

os.path.dirname and os.path.relpath do not work with the filesystem. They could check for the null character etc., but nobody is interested in making them intentionally slower. In functions which interact with the OS, the check happens as part of the conversion to the OS-specific representation.
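
To illustrate the split between lexical functions and OS-facing ones, here is a short sketch. The example path is made up, and the ValueError from os.lstat is the behaviour reported above for Linux.

```python
import os

path = "some/dir/na\x00me"  # illustrative path with an embedded null byte

# Purely lexical: these only split the string and never call into the OS,
# so the null byte is carried along like any other character.
parent = os.path.dirname(path)
name = os.path.basename(path)
print(repr(parent), repr(name))

# OS-facing: the path is rejected while it is being converted for the
# system call, before the OS ever sees it.
lstat_error = None
try:
    os.lstat(path)
except ValueError as exc:
    lstat_error = exc
    print("lstat rejected the path:", exc)
```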

@encukou
Member

encukou commented Apr 8, 2024

Serhiy's model sounds good to me. To summarize:

  • exists and is* functions should return False if the path contains NUL
  • functions that deal with the OS should raise ValueError (before handing the path to an OS call)
  • functions that manipulate the path should either raise ValueError (if they can detect it easily), or treat NUL as a normal character (the same way as an invalid character like | is handled in ntpath.dirname)

But, it looks like that's not what we do on Windows.
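
The three rules above could be sketched roughly as follows. All three function names are hypothetical and are not CPython APIs. The sketch assumes str paths and POSIX-style separators, and it is only meant to show the shape of each rule, not how the standard library implements it.

```python
import os


def exists_like(path):
    """Rule 1: a predicate answers "no" for a path that can never name a node."""
    try:
        os.lstat(path)
    except (OSError, ValueError):
        return False
    return True


def os_facing(path):
    """Rule 2: reject NUL with ValueError before handing the path to an OS call."""
    if "\x00" in path:
        raise ValueError("embedded null character in path")
    return os.lstat(path)


def lexical_parent(path):
    """Rule 3, second option: treat NUL as a normal character while splitting."""
    head, _sep, _tail = path.rpartition("/")
    return head


print(exists_like("\x00"))
print(repr(lexical_parent("a/b\x00c")))
```

Under this sketch, realpath would belong to the second group, which matches the position taken earlier in the thread.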

@serhiy-storchaka
Member

I added some tests for realpath() with invalid paths in PR #134147. Then I decided that this issue was a better fit, moved the tests into a separate PR #134189, and added more tests.

Now, whatever we decide, there are already tests that will show changes in behavior.

3 participants