time.perf_counter(): compute Greatest Common Divisor (GCD) to reduce risk of integer overflow #112567

Closed
vstinner opened this issue Dec 1, 2023 · 5 comments

Comments

@vstinner
Member

vstinner commented Dec 1, 2023

The time.perf_counter() function is implemented by calling QueryPerformanceCounter() and QueryPerformanceFrequency(). It computes QueryPerformanceCounter() * SEC_TO_NS / QueryPerformanceFrequency() using int64_t integers. The problem is that SEC_TO_NS is big: 10^9.

QueryPerformanceFrequency() usually returns 10 MHz on Windows 10 and newer. The fraction SEC_TO_NS / frequency = 1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

I propose using a fraction internally to convert the QueryPerformanceCounter() value to a number of seconds, and simplifying the fraction using the Greatest Common Divisor (GCD); a minimal sketch is shown after the list below.

There are multiple functions using a fraction:

  • _PyTime_GetClockWithInfo() for clock() -- time.process_time() in Python
  • _PyTime_GetProcessTimeWithInfo() for times() -- time.process_time() in Python
  • py_get_monotonic_clock() for mach_absolute_time() on macOS -- time.monotonic() in Python
  • py_get_win_perf_counter() for QueryPerformanceCounter() on Windows -- time.perf_counter() in Python
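
For illustration, here is a minimal C sketch of the proposed approach: keep the conversion as a numerator/denominator pair and reduce it once with the GCD when the clock frequency is read. The names mirror the _PyTimeFraction API added by the linked PRs, but the bodies are simplified illustrations, not the actual CPython implementation.

#include <stdint.h>

/* Simplified stand-in for _PyTimeFraction: ticks * numer / denom
   converts clock ticks to nanoseconds. */
typedef struct {
    int64_t numer;
    int64_t denom;
} TimeFraction;

/* Euclid's algorithm; inputs are expected to be positive. */
static int64_t
gcd64(int64_t a, int64_t b)
{
    while (b != 0) {
        int64_t t = b;
        b = a % b;
        a = t;
    }
    return a;
}

/* Reduce the fraction once at setup, e.g. SEC_TO_NS / frequency =
   1_000_000_000 / 10_000_000 becomes 100 / 1 for the common 10 MHz
   QueryPerformanceFrequency() on Windows 10 and newer. */
static int
timefraction_set(TimeFraction *frac, int64_t numer, int64_t denom)
{
    if (numer < 1 || denom < 1) {
        return -1;
    }
    int64_t g = gcd64(numer, denom);
    frac->numer = numer / g;
    frac->denom = denom / g;
    return 0;
}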

Linked PRs

vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _Py_GetTicksPerSecond()
  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* Remove _PyTime_Init().
* Remove _PyRuntimeState.time member.
* os.times() now calls _Py_GetTicksPerSecond().
* Rename _PyTime_GetClockWithInfo() to py_clock().
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _Py_GetTicksPerSecond()
  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* Remove _PyTime_Init().
* Remove _PyRuntimeState.time member.
* os.times() now calls _Py_GetTicksPerSecond().
* Rename _PyTime_GetClockWithInfo() to py_clock().
@serhiy-storchaka
Member

QueryPerformanceCounter() * SEC_TO_NS / QueryPerformanceFrequency() can be calculated as

QueryPerformanceCounter() * (SEC_TO_NS / QueryPerformanceFrequency()) +
QueryPerformanceCounter() * (SEC_TO_NS % QueryPerformanceFrequency()) / QueryPerformanceFrequency()

or

SEC_TO_NS * (QueryPerformanceCounter() / QueryPerformanceFrequency()) +
SEC_TO_NS * (QueryPerformanceCounter() % QueryPerformanceFrequency()) / QueryPerformanceFrequency()

or

QueryPerformanceCounter() * (SEC_TO_NS / QueryPerformanceFrequency()) +
(SEC_TO_NS % QueryPerformanceFrequency()) * (QueryPerformanceCounter() / QueryPerformanceFrequency()) +
(SEC_TO_NS % QueryPerformanceFrequency()) * (QueryPerformanceCounter() % QueryPerformanceFrequency()) / QueryPerformanceFrequency()

or

SEC_TO_NS * (QueryPerformanceCounter() / QueryPerformanceFrequency()) +
(QueryPerformanceCounter() % QueryPerformanceFrequency()) * (SEC_TO_NS / QueryPerformanceFrequency()) +
(QueryPerformanceCounter() % QueryPerformanceFrequency()) * (SEC_TO_NS % QueryPerformanceFrequency()) / QueryPerformanceFrequency()
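
In C, the last decomposition above can be sketched as follows (an illustrative function, not CPython code; it assumes non-negative int64_t inputs, with ticks, mul and denom standing in for QueryPerformanceCounter(), SEC_TO_NS and QueryPerformanceFrequency()):

#include <stdint.h>

/* Compute ticks * mul / denom without forming the full ticks * mul
   product. The partial products are much smaller, so overflow is far
   less likely, though still possible for extreme values. */
static int64_t
mul_div(int64_t ticks, int64_t mul, int64_t denom)
{
    int64_t q = ticks / denom;   /* ticks = q * denom + r */
    int64_t r = ticks % denom;
    return mul * q + r * (mul / denom) + r * (mul % denom) / denom;
}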

@vstinner
Member Author

vstinner commented Dec 1, 2023

QueryPerformanceCounter() * SEC_TO_NS / QueryPerformanceFrequency() can be calculated as (...)

_PyTimeFraction_Mul(ticks, frac) is not a naive ticks * frac.numer / frac.denom operation, but already uses modulo to reduce the risk of integer overflow. I wrote an article on that :-) https://vstinner.github.io/python37-perf-counter-nanoseconds.html

But still, we can reduce the risk of integer overflow even further, since in most cases the numerator can be made much smaller. With QueryPerformanceFrequency(), the numerator is reduced from 10^9 to 100, since the common frequency is 10 MHz.

Since the frequency is only read once, IMO it's worth it to quickly compute a GCD.

vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
* Add process_time_times() helper function, called by
  _PyTime_GetProcessTimeWithInfo().
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Remove _PyRuntimeState.time.
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Avoid sysconf("_SC_CLK_TCK") call at Python startup. Only call it
once at the first os.times() or time.process_time() call.

* Add process_time_times() helper function, called by
  _PyTime_GetProcessTimeWithInfo().
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Remove _PyRuntimeState.time.
@serhiy-storchaka
Member

And what if it is not reduced on some weird configuration?

vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Add process_time_times() helper function, called by
  _PyTime_GetProcessTimeWithInfo().
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Add process_time_times() helper function, called by
  _PyTime_GetProcessTimeWithInfo().
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
* os.times() is now always built: no longer rely on HAVE_TIMES.
@vstinner
Member Author

vstinner commented Dec 1, 2023

And what if it is not reduced on some weird configuration?

I only propose to make the current situation better on known configurations. If some platforms require a very large numerator and denominator, we can later consider switching to int128_t for intermediate results in _PyTimeFraction_Mul(). For now, I don't think it's needed.

By the way, _PyTime_Mul() clamps the result to [_PyTime_MIN; _PyTime_MAX] range on overflow.
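
(For reference, a saturating multiply of that kind can be sketched as below; this is an illustrative stand-in, not CPython's _PyTime_Mul(), and it uses hypothetical PYTIME_MIN/PYTIME_MAX limits plus the GCC/Clang overflow builtin.)

#include <stdint.h>

/* Hypothetical stand-ins for _PyTime_MIN and _PyTime_MAX. */
#define PYTIME_MIN INT64_MIN
#define PYTIME_MAX INT64_MAX

/* Multiply two time values; on overflow, clamp to the representable
   range instead of wrapping around. */
static int64_t
saturating_mul(int64_t a, int64_t b)
{
    int64_t result;
    if (!__builtin_mul_overflow(a, b, &result)) {
        return result;
    }
    /* Overflowed: the true product is positive iff the signs match. */
    return ((a < 0) == (b < 0)) ? PYTIME_MAX : PYTIME_MIN;
}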

vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Add process_time_times() helper function, called by
  _PyTime_GetProcessTimeWithInfo().
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
* os.times() is now always built: no longer rely on HAVE_TIMES.
vstinner added a commit that referenced this issue Dec 1, 2023
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
* Add process_time_times() helper function, called by
  py_process_time().
* os.times() is now always built: no longer rely on HAVE_TIMES.
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.
vstinner added a commit to vstinner/cpython that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Denominator (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.
vstinner added a commit that referenced this issue Dec 1, 2023
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Divisor (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.
@vstinner
Member Author

vstinner commented Dec 1, 2023

I merged my changes. @serhiy-storchaka: If you see room to further reduce the risk of integer overflow in _PyTimeFraction_Mul(), go ahead ;-)

@vstinner vstinner closed this as completed Dec 1, 2023
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
* Add process_time_times() helper function, called by
  py_process_time().
* os.times() is now always built: no longer rely on HAVE_TIMES.
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Divisor (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
* Move _PyRuntimeState.time to _posixstate.ticks_per_second and
  time_module_state.ticks_per_second.
* Add time_module_state.clocks_per_second.
* Rename _PyTime_GetClockWithInfo() to py_clock().
* Rename _PyTime_GetProcessTimeWithInfo() to py_process_time().
* Add process_time_times() helper function, called by
  py_process_time().
* os.times() is now always built: no longer rely on HAVE_TIMES.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Use a fraction internally in the _PyTime API to reduce the risk of
integer overflow: simplify the fraction using Greatest Common
Divisor (GCD). The fraction API is used by time functions:
perf_counter(), monotonic() and process_time().

For example, QueryPerformanceFrequency() usually returns 10 MHz on
Windows 10 and newer. The fraction SEC_TO_NS / frequency =
1_000_000_000 / 10_000_000 can be simplified to 100 / 1.

* Add _PyTimeFraction type.
* Add functions:

  * _PyTimeFraction_Set()
  * _PyTimeFraction_Mul()
  * _PyTimeFraction_Resolution()

* No longer check "numer * denom <= _PyTime_MAX" in
  _PyTimeFraction_Set(). _PyTimeFraction_Mul() uses _PyTime_Mul()
  which handles integer overflow.