gh-99726: Add 'fast' argument to os.[l]stat for faster calculation #99727

Closed
wants to merge 7 commits into from
3 changes: 3 additions & 0 deletions Doc/library/os.path.rst
@@ -437,6 +437,9 @@ the :mod:`glob` module.)
:func:`os.lstat`, or :func:`os.stat`. This function implements the
underlying comparison used by :func:`samefile` and :func:`sameopenfile`.

+Do not use stat results created with the *fast* argument, as they may be
+missing information necessary to compare the two files.

.. availability:: Unix, Windows.

.. versionchanged:: 3.4
67 changes: 62 additions & 5 deletions Doc/library/os.rst
@@ -1037,17 +1037,27 @@ as internal buffering of data.
.. availability:: Unix.


-.. function:: fstat(fd)
+.. function:: fstat(fd, *, fast=False)

Get the status of the file descriptor *fd*. Return a :class:`stat_result`
object.

-As of Python 3.3, this is equivalent to ``os.stat(fd)``.
+As of Python 3.3, this is equivalent to ``os.stat(fd, fast=fast)``.
Contributor

Suggested change
-As of Python 3.3, this is equivalent to ``os.stat(fd, fast=fast)``.
+As of Python 3.3, this is equivalent to ``os.stat(fd, fast=False)``.

Member Author

😆 whoops

Member Author

Wait, no, this is correct. There's a fast argument to this function, which gets passed through (it just doesn't happen to make any difference right now, but that's why fast doesn't guarantee it'll leave out any information. Sometimes we have to take the slow path regardless, and that's the case here.)

Contributor

@lazka, Nov 24, 2022

oh, right, I missed that :) Never mind then
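
The pass-through discussed in this thread can be sketched from the caller's side. This is a minimal sketch, assuming the *fast* keyword proposed in this PR. Because that keyword exists only in this (closed) PR, the hypothetical helper `stat_fast` falls back to a plain call when the interpreter rejects it. It reads only `st_size`, one of the fields the proposed docs list as guaranteed under *fast*.

```python
import os
import tempfile


def stat_fast(path_or_fd):
    """Hypothetical helper: ask for a fast stat if the keyword is accepted."""
    try:
        # `fast` is the keyword proposed in this PR (an assumption here);
        # interpreters without it raise TypeError for the unknown keyword.
        return os.stat(path_or_fd, fast=True)
    except TypeError:
        # Fall back to the ordinary, fully populated stat result.
        return os.stat(path_or_fd)


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    # st_size is documented as guaranteed even when fast=True.
    print(stat_fast(tmp.fileno()).st_size)
```

As the thread notes, a fast request may still take the slow path and return full information, so callers should treat *fast* as a hint and read only the guaranteed fields.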


+Passing *fast* as ``True`` may omit some information on some platforms
+for the sake of performance. These omissions are not guaranteed (that is,
+the information may be returned anyway), and may change between Python
+releases without a deprecation period or due to operating system updates
+without warning. See :class:`stat_result` documentation for the fields
+that are guaranteed to be present under this option.

.. seealso::

The :func:`.stat` function.

+.. versionchanged:: 3.12
+Added the *fast* parameter.


.. function:: fstatvfs(fd, /)

@@ -2175,7 +2185,7 @@ features:
Accepts a :term:`path-like object`.


-.. function:: lstat(path, *, dir_fd=None)
+.. function:: lstat(path, *, dir_fd=None, fast=False)

Perform the equivalent of an :c:func:`lstat` system call on the given path.
Similar to :func:`~os.stat`, but does not follow symbolic links. Return a
@@ -2184,8 +2194,15 @@ features:
On platforms that do not support symbolic links, this is an alias for
:func:`~os.stat`.

+Passing *fast* as ``True`` may omit some information on some platforms
+for the sake of performance. These omissions are not guaranteed (that is,
+the information may be returned anyway), and may change between Python
+releases without a deprecation period or due to operating system updates
+without warning. See :class:`stat_result` documentation for the fields
+that are guaranteed to be present under this option.

As of Python 3.3, this is equivalent to ``os.stat(path, dir_fd=dir_fd,
-follow_symlinks=False)``.
+follow_symlinks=False, fast=fast)``.

This function can also support :ref:`paths relative to directory descriptors
<dir_fd>`.
@@ -2209,6 +2226,9 @@ features:
Other kinds of reparse points are resolved by the operating system as
for :func:`~os.stat`.

+.. versionchanged:: 3.12
+Added the *fast* parameter.


.. function:: mkdir(path, mode=0o777, *, dir_fd=None)

@@ -2781,7 +2801,7 @@ features:
for :class:`bytes` paths on Windows.


-.. function:: stat(path, *, dir_fd=None, follow_symlinks=True)
+.. function:: stat(path, *, dir_fd=None, follow_symlinks=True, fast=False)

Get the status of a file or a file descriptor. Perform the equivalent of a
:c:func:`stat` system call on the given path. *path* may be specified as
@@ -2806,6 +2826,13 @@ features:
possible and call :func:`lstat` on the result. This does not apply to
dangling symlinks or junction points, which will raise the usual exceptions.

+Passing *fast* as ``True`` may omit some information on some platforms
+for the sake of performance. These omissions are not guaranteed (that is,
+the information may be returned anyway), and may change between Python
+releases without a deprecation period or due to operating system updates
+without warning. See :class:`stat_result` documentation for the fields
+that are guaranteed to be present under this option.

.. index:: module: stat

Example::
@@ -2838,19 +2865,32 @@ features:
returns the information for the original path as if
``follow_symlinks=False`` had been specified instead of raising an error.

+.. versionchanged:: 3.12
+Added the *fast* parameter.


.. class:: stat_result

Object whose attributes correspond roughly to the members of the
:c:type:`stat` structure. It is used for the result of :func:`os.stat`,
:func:`os.fstat` and :func:`os.lstat`.

+When the *fast* argument to these functions is passed ``True``, some
+information may be reduced or omitted. Those attributes that are
+guaranteed to be valid, and those currently known to be omitted, are
+marked in the documentation below. If not specified and you depend on
+that field, explicitly pass *fast* as ``False`` to ensure it is
+calculated.

Attributes:

.. attribute:: st_mode

File mode: file type and file mode bits (permissions).

+When *fast* is ``True``, only the file type bits are guaranteed
+to be valid (the mode bits may be zero).

.. attribute:: st_ino

Platform dependent, but if non-zero, uniquely identifies the
@@ -2865,6 +2905,8 @@ features:

Identifier of the device on which this file resides.

+On Windows, when *fast* is ``True``, this may be zero.

.. attribute:: st_nlink

Number of hard links.
@@ -2883,6 +2925,8 @@ features:
The size of a symbolic link is the length of the pathname it contains,
without a terminating null byte.

+This field is guaranteed to be filled when specifying *fast*.

Timestamps:

.. attribute:: st_atime
@@ -2893,6 +2937,8 @@ features:

Time of most recent content modification expressed in seconds.

+This field is guaranteed to be filled when specifying *fast*.

.. attribute:: st_ctime

Platform dependent:
@@ -2909,6 +2955,9 @@ features:
Time of most recent content modification expressed in nanoseconds as an
integer.

+This field is guaranteed to be filled when specifying *fast*, subject
+to the note below.

.. attribute:: st_ctime_ns

Platform dependent:
@@ -2998,12 +3047,16 @@ features:
:c:func:`GetFileInformationByHandle`. See the ``FILE_ATTRIBUTE_*``
constants in the :mod:`stat` module.

+This field is guaranteed to be filled when specifying *fast*.

.. attribute:: st_reparse_tag

When :attr:`st_file_attributes` has the ``FILE_ATTRIBUTE_REPARSE_POINT``
set, this field contains the tag identifying the type of reparse point.
See the ``IO_REPARSE_TAG_*`` constants in the :mod:`stat` module.

+This field is guaranteed to be filled when specifying *fast*.

The standard module :mod:`stat` defines functions and constants that are
useful for extracting information from a :c:type:`stat` structure. (On
Windows, some items are filled with dummy values.)
@@ -3039,6 +3092,10 @@ features:
files as :const:`S_IFCHR`, :const:`S_IFIFO` or :const:`S_IFBLK`
as appropriate.

+.. versionchanged:: 3.12
+Added the *fast* argument and defined the minimum set of returned
+fields.

.. function:: statvfs(path)

Perform a :c:func:`statvfs` system call on the given path. The return value is
77 changes: 77 additions & 0 deletions Include/internal/pycore_fileutils_windows.h
@@ -0,0 +1,77 @@
+#ifndef Py_INTERNAL_FILEUTILS_WINDOWS_H
+#define Py_INTERNAL_FILEUTILS_WINDOWS_H
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef Py_BUILD_CORE
+#  error "Py_BUILD_CORE must be defined to include this header"
+#endif
+
+#ifdef MS_WINDOWS
+
+#if !defined(NTDDI_WIN10_NI) || !(NTDDI_VERSION >= NTDDI_WIN10_NI)
+typedef struct _FILE_STAT_BASIC_INFORMATION {
+    LARGE_INTEGER FileId;
+    LARGE_INTEGER CreationTime;
+    LARGE_INTEGER LastAccessTime;
+    LARGE_INTEGER LastWriteTime;
+    LARGE_INTEGER ChangeTime;
+    LARGE_INTEGER AllocationSize;
+    LARGE_INTEGER EndOfFile;
+    ULONG FileAttributes;
+    ULONG ReparseTag;
+    ULONG NumberOfLinks;
+    ULONG DeviceType;
+    ULONG DeviceCharacteristics;
+} FILE_STAT_BASIC_INFORMATION;
+
+typedef enum _FILE_INFO_BY_NAME_CLASS {
+    FileStatByNameInfo,
+    FileStatLxByNameInfo,
+    FileCaseSensitiveByNameInfo,
+    FileStatBasicByNameInfo,
+    MaximumFileInfoByNameClass
+} FILE_INFO_BY_NAME_CLASS;
+#endif
+
+typedef BOOL (WINAPI *PGetFileInformationByName)(
+    PCWSTR FileName,
+    FILE_INFO_BY_NAME_CLASS FileInformationClass,
+    PVOID FileInfoBuffer,
+    ULONG FileInfoBufferSize
+);
+
+static inline BOOL GetFileInformationByName(
+    PCWSTR FileName,
+    FILE_INFO_BY_NAME_CLASS FileInformationClass,
+    PVOID FileInfoBuffer,
+    ULONG FileInfoBufferSize
+) {
+    static PGetFileInformationByName GetFileInformationByName = NULL;
+    static int GetFileInformationByName_init = -1;
+
+    if (GetFileInformationByName_init < 0) {
+        HMODULE hMod = LoadLibraryW(L"api-ms-win-core-file-l2-1-4");
+        GetFileInformationByName_init = 0;
+        if (hMod) {
+            GetFileInformationByName = (PGetFileInformationByName)GetProcAddress(
+                hMod, "GetFileInformationByName");
+            if (GetFileInformationByName) {
+                GetFileInformationByName_init = 1;
+            } else {
+                FreeLibrary(hMod);
+            }
+        }
+    }
+
+    if (GetFileInformationByName_init <= 0) {
+        SetLastError(ERROR_NOT_SUPPORTED);
+        return FALSE;
+    }
+    return GetFileInformationByName(FileName, FileInformationClass, FileInfoBuffer, FileInfoBufferSize);
+}
+
+#endif
+
+#endif
1 change: 1 addition & 0 deletions Include/internal/pycore_global_objects_fini_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Include/internal/pycore_global_strings.h
@@ -389,6 +389,7 @@ struct _Py_global_strings {
STRUCT_FOR_ID(false)
STRUCT_FOR_ID(family)
STRUCT_FOR_ID(fanout)
+STRUCT_FOR_ID(fast)
STRUCT_FOR_ID(fd)
STRUCT_FOR_ID(fd2)
STRUCT_FOR_ID(fdel)
1 change: 1 addition & 0 deletions Include/internal/pycore_runtime_init_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions Include/internal/pycore_unicodeobject_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Lib/asyncio/proactor_events.py
@@ -734,7 +734,7 @@ async def _sock_sendfile_native(self, sock, file, offset, count):
except (AttributeError, io.UnsupportedOperation) as err:
raise exceptions.SendfileNotAvailableError("not a regular file")
try:
-fsize = os.fstat(fileno).st_size
+fsize = os.fstat(fileno, fast=True).st_size
except OSError:
raise exceptions.SendfileNotAvailableError("not a regular file")
blocksize = count if count else fsize
8 changes: 4 additions & 4 deletions Lib/asyncio/unix_events.py
@@ -307,7 +307,7 @@ async def create_unix_server(
# Check for abstract socket. `str` and `bytes` paths are supported.
if path[0] not in (0, '\x00'):
try:
-if stat.S_ISSOCK(os.stat(path).st_mode):
+if stat.S_ISSOCK(os.stat(path, fast=True).st_mode):
os.remove(path)
except FileNotFoundError:
pass
@@ -363,7 +363,7 @@ async def _sock_sendfile_native(self, sock, file, offset, count):
except (AttributeError, io.UnsupportedOperation) as err:
raise exceptions.SendfileNotAvailableError("not a regular file")
try:
-fsize = os.fstat(fileno).st_size
+fsize = os.fstat(fileno, fast=True).st_size
except OSError:
raise exceptions.SendfileNotAvailableError("not a regular file")
blocksize = count if count else fsize
@@ -472,7 +472,7 @@ def __init__(self, loop, pipe, protocol, waiter=None, extra=None):
self._closing = False
self._paused = False

-mode = os.fstat(self._fileno).st_mode
+mode = os.fstat(self._fileno, fast=True).st_mode
if not (stat.S_ISFIFO(mode) or
stat.S_ISSOCK(mode) or
stat.S_ISCHR(mode)):
@@ -607,7 +607,7 @@ def __init__(self, loop, pipe, protocol, waiter=None, extra=None):
self._conn_lost = 0
self._closing = False # Set when close() or write_eof() called.

-mode = os.fstat(self._fileno).st_mode
+mode = os.fstat(self._fileno, fast=True).st_mode
is_char = stat.S_ISCHR(mode)
is_fifo = stat.S_ISFIFO(mode)
is_socket = stat.S_ISSOCK(mode)
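
The asyncio call sites above inspect only the file-type portion of `st_mode`, which the proposed `stat_result` documentation lists as the part that stays valid under *fast*. A minimal sketch of that kind of check follows. The helper `file_kind` is hypothetical, and the sketch uses plain `os.stat` because the *fast* keyword is not available outside this PR.

```python
import os
import stat
import tempfile


def file_kind(mode):
    """Classify a file using only the file-type bits of st_mode."""
    kind = stat.S_IFMT(mode)  # mask off the permission bits
    names = {
        stat.S_IFREG: "regular",
        stat.S_IFDIR: "directory",
        stat.S_IFCHR: "character device",
        stat.S_IFIFO: "fifo",
        stat.S_IFSOCK: "socket",
        stat.S_IFLNK: "symlink",
    }
    return names.get(kind, "other")


with tempfile.TemporaryDirectory() as tmpdir:
    # Permission bits are ignored, so a zeroed mode field would not matter.
    print(file_kind(os.stat(tmpdir).st_mode))
```

Code that needs the permission bits (for example via `stat.S_IMODE`) would not be a safe candidate for *fast*, since the proposed docs say those bits may be zero.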
2 changes: 1 addition & 1 deletion Lib/compileall.py
@@ -220,7 +220,7 @@ def compile_file(fullname, ddir=None, force=False, rx=None, quiet=0,
if tail == '.py':
if not force:
try:
-mtime = int(os.stat(fullname).st_mtime)
+mtime = int(os.stat(fullname, fast=True).st_mtime)
expect = struct.pack('<4sLL', importlib.util.MAGIC_NUMBER,
0, mtime & 0xFFFF_FFFF)
for cfile in opt_cfiles.values():
8 changes: 4 additions & 4 deletions Lib/filecmp.py
@@ -50,8 +50,8 @@ def cmp(f1, f2, shallow=True):

"""

-s1 = _sig(os.stat(f1))
-s2 = _sig(os.stat(f2))
+s1 = _sig(os.stat(f1, fast=True))
+s2 = _sig(os.stat(f2, fast=True))
if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
return False
if shallow and s1 == s2:
@@ -159,12 +159,12 @@ def phase2(self): # Distinguish files, directories, funnies

ok = True
try:
-a_stat = os.stat(a_path)
+a_stat = os.stat(a_path, fast=True)
except OSError:
# print('Can\'t stat', a_path, ':', why.args[1])
ok = False
try:
-b_stat = os.stat(b_path)
+b_stat = os.stat(b_path, fast=True)
except OSError:
# print('Can\'t stat', b_path, ':', why.args[1])
ok = False
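
The `filecmp` changes fit the proposed guarantees because the module's shallow comparison reduces each stat result to file type, size and modification time. The sketch below mirrors that signature approach under hypothetical names (`signature`, `shallow_equal`). It calls plain `os.stat`, since the *fast* keyword exists only in this PR.

```python
import os
import stat
import tempfile


def signature(st):
    """Reduce a stat result to (file type, size, mtime)."""
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def shallow_equal(path_a, path_b):
    """True when both paths are regular files with identical signatures."""
    sig_a = signature(os.stat(path_a))
    sig_b = signature(os.stat(path_b))
    if sig_a[0] != stat.S_IFREG or sig_b[0] != stat.S_IFREG:
        return False
    return sig_a == sig_b


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt")
    with open(path, "wb") as handle:
        handle.write(b"data")
    print(shallow_equal(path, path))
```

None of these three values depends on `st_ino` or `st_dev`, which is why this comparison can tolerate a reduced result while `os.path.samestat` cannot.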