Commit 0622f3f

picnixz, hugovk, vstinner, and merwok authored and committed
pythongh-89083: add support for UUID version 7 (RFC 9562) (python#121119)
Add support for generating UUIDv7 objects according to RFC 9562, §5.7 [1].

The functionality is provided by the `uuid.uuid7()` function. The implementation is based on a 42-bit counter as described by Method 1, §6.2 [2] and guarantees monotonicity within the same millisecond.

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.7
[2]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.2

---------

Co-authored-by: Hugo van Kemenade <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>
Co-authored-by: Éric <[email protected]>
1 parent 04bbc34 commit 0622f3f

5 files changed: +315 -12 lines changed

Doc/library/uuid.rst

Lines changed: 21 additions & 6 deletions
@@ -11,9 +11,10 @@
 --------------
 
 This module provides immutable :class:`UUID` objects (the :class:`UUID` class)
-and the functions :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, :func:`uuid5`,
-:func:`uuid6`, and :func:`uuid8` for generating version 1, 3, 4, 5, 6,
-and 8 UUIDs as specified in :rfc:`9562` (which supersedes :rfc:`4122`).
+and :ref:`functions <uuid-factory-functions>` for generating UUIDs corresponding
+to a specific UUID version as specified in :rfc:`9562` (which supersedes :rfc:`4122`),
+for example, :func:`uuid1` for UUID version 1, :func:`uuid3` for UUID version 3, and so on.
+Note that UUID version 2 is deliberately omitted as it is outside the scope of the RFC.
 
 If all you want is a unique ID, you should probably call :func:`uuid1` or
 :func:`uuid4`. Note that :func:`uuid1` may compromise privacy since it creates
@@ -154,7 +155,7 @@ which relays any information about the UUID's safety, using this enumeration:
    :const:`RFC_4122`).
 
    .. versionchanged:: next
-      Added UUID versions 6 and 8.
+      Added UUID versions 6, 7 and 8.
 
 
 .. attribute:: UUID.is_safe
@@ -185,6 +186,8 @@ The :mod:`uuid` module defines the following functions:
    globally unique, while the latter are not.
 
 
+.. _uuid-factory-functions:
+
 .. function:: uuid1(node=None, clock_seq=None)
 
    Generate a UUID from a host ID, sequence number, and the current time. If *node*
@@ -228,6 +231,18 @@ The :mod:`uuid` module defines the following functions:
    .. versionadded:: next
 
 
+.. function:: uuid7()
+
+   Generate a time-based UUID according to
+   :rfc:`RFC 9562, §5.7 <9562#section-5.7>`.
+
+   For portability across platforms lacking sub-millisecond precision, UUIDs
+   produced by this function embed a 48-bit timestamp and use a 42-bit counter
+   to guarantee monotonicity within a millisecond.
+
+   .. versionadded:: next
+
+
 .. function:: uuid8(a=None, b=None, c=None)
 
    Generate a pseudo-random UUID according to
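
The field layout described in the `uuid7()` entry above (48-bit timestamp, 42-bit counter split around the variant bits, 32-bit tail) can be sketched as follows. This is only an illustration inferred from the bit offsets checked in the tests of this commit; `pack_uuid7` and `unpack_uuid7` are hypothetical helper names, not part of the `uuid` module.

```python
import uuid


def pack_uuid7(unix_ts_ms, counter, tail):
    """Assemble a version 7 UUID (hypothetical helper, not a stdlib API)."""
    value = (unix_ts_ms & 0xffff_ffff_ffff) << 80    # 48-bit timestamp, bits 127..80
    value |= 0x7 << 76                               # version field, bits 79..76
    value |= ((counter >> 30) & 0xfff) << 64         # upper 12 counter bits
    value |= 0b10 << 62                              # RFC 9562 variant, bits 63..62
    value |= (counter & 0x3fff_ffff) << 32           # lower 30 counter bits
    value |= tail & 0xffff_ffff                      # 32-bit random tail
    return uuid.UUID(int=value)


def unpack_uuid7(u):
    """Split a version 7 UUID back into (timestamp, counter, tail)."""
    unix_ts_ms = u.int >> 80
    counter_hi = (u.int >> 64) & 0xfff
    counter_lo = (u.int >> 32) & 0x3fff_ffff
    tail = u.int & 0xffff_ffff
    return unix_ts_ms, (counter_hi << 30) | counter_lo, tail


u = pack_uuid7(1672533296_123, counter=5, tail=9)
print(u, unpack_uuid7(u))
```

Because the timestamp occupies the most significant bits and the counter follows it, comparing two such UUIDs as integers orders them by time first and by counter second.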
@@ -330,7 +345,7 @@ The :mod:`uuid` module can be executed as a script from the command line.
 
 .. code-block:: sh
 
-   python -m uuid [-h] [-u {uuid1,uuid3,uuid4,uuid5,uuid6,uuid8}] [-n NAMESPACE] [-N NAME]
+   python -m uuid [-h] [-u {uuid1,uuid3,uuid4,uuid5,uuid6,uuid7,uuid8}] [-n NAMESPACE] [-N NAME]
 
 The following options are accepted:
 
@@ -347,7 +362,7 @@ The following options are accepted:
    is used.
 
    .. versionchanged:: next
-      Allow generating UUID versions 6 and 8.
+      Allow generating UUID versions 6, 7 and 8.
 
 .. option:: -n <namespace>
             --namespace <namespace>

Doc/whatsnew/3.14.rst

Lines changed: 3 additions & 2 deletions
@@ -924,8 +924,9 @@ urllib
 uuid
 ----
 
-* Add support for UUID versions 6 and 8 via :func:`uuid.uuid6` and
-  :func:`uuid.uuid8` respectively, as specified in :rfc:`9562`.
+* Add support for UUID versions 6, 7, and 8 via :func:`uuid.uuid6`,
+  :func:`uuid.uuid7`, and :func:`uuid.uuid8` respectively, as specified
+  in :rfc:`9562`.
   (Contributed by Bénédikt Tran in :gh:`89083`.)
 
 * :const:`uuid.NIL` and :const:`uuid.MAX` are now available to represent the
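
Since `uuid.uuid7()` is new in this change, code that must also run on interpreters without it may want a fallback. A minimal sketch, assuming a random UUIDv4 is an acceptable substitute for the caller; `new_id` is a hypothetical helper name:

```python
import uuid


def new_id():
    """Return a UUIDv7 when the interpreter provides it, else a UUIDv4."""
    # getattr() avoids an AttributeError on interpreters without uuid.uuid7().
    factory = getattr(uuid, "uuid7", uuid.uuid4)
    return factory()


ids = [new_id() for _ in range(5)]
print([u.version for u in ids])
```

Note that the fallback gives up time ordering: only the version 7 values sort by creation time.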

Lib/test/test_uuid.py

Lines changed: 200 additions & 0 deletions
@@ -871,6 +871,206 @@ def test_uuid6_test_vectors(self):
         equal((u.int >> 80) & 0xffff, 0x232a)
         equal((u.int >> 96) & 0xffff_ffff, 0x1ec9_414c)
 
+    def test_uuid7(self):
+        equal = self.assertEqual
+        u = self.uuid.uuid7()
+        equal(u.variant, self.uuid.RFC_4122)
+        equal(u.version, 7)
+
+        # 1 Jan 2023 00:34:56.123_456_789 UTC
+        timestamp_ns = 1672533296_123_456_789  # ns precision
+        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)
+
+        for _ in range(100):
+            counter_hi = random.getrandbits(11)
+            counter_lo = random.getrandbits(30)
+            counter = (counter_hi << 30) | counter_lo
+
+            tail = random.getrandbits(32)
+            # effective number of bits is 32 + 30 + 11 = 73
+            random_bits = counter << 32 | tail
+
+            # set all remaining MSB of fake random bits to 1 to ensure that
+            # the implementation correctly removes them
+            random_bits = (((1 << 7) - 1) << 73) | random_bits
+            random_data = random_bits.to_bytes(10)
+
+            with (
+                mock.patch.multiple(
+                    self.uuid,
+                    _last_timestamp_v7=None,
+                    _last_counter_v7=0,
+                ),
+                mock.patch('time.time_ns', return_value=timestamp_ns),
+                mock.patch('os.urandom', return_value=random_data) as urand
+            ):
+                u = self.uuid.uuid7()
+                urand.assert_called_once_with(10)
+                equal(u.variant, self.uuid.RFC_4122)
+                equal(u.version, 7)
+
+                equal(self.uuid._last_timestamp_v7, timestamp_ms)
+                equal(self.uuid._last_counter_v7, counter)
+
+                unix_ts_ms = timestamp_ms & 0xffff_ffff_ffff
+                equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)
+
+                equal((u.int >> 75) & 1, 0)  # check that the MSB is 0
+                equal((u.int >> 64) & 0xfff, counter_hi)
+                equal((u.int >> 32) & 0x3fff_ffff, counter_lo)
+                equal(u.int & 0xffff_ffff, tail)
+
+    def test_uuid7_uniqueness(self):
+        # Test that UUIDv7-generated values are unique.
+        #
+        # While UUIDv8 has an entropy of 122 bits, those 122 bits may not
+        # necessarily be sampled from a PRNG. On the other hand, UUIDv7
+        # uses os.urandom() as a PRNG which features better randomness.
+        N = 1000
+        uuids = {self.uuid.uuid7() for _ in range(N)}
+        self.assertEqual(len(uuids), N)
+
+        versions = {u.version for u in uuids}
+        self.assertSetEqual(versions, {7})
+
+    def test_uuid7_monotonicity(self):
+        equal = self.assertEqual
+
+        us = [self.uuid.uuid7() for _ in range(10_000)]
+        equal(us, sorted(us))
+
+        with mock.patch.multiple(
+            self.uuid,
+            _last_timestamp_v7=0,
+            _last_counter_v7=0,
+        ):
+            # 1 Jan 2023 00:34:56.123_456_789 UTC
+            timestamp_ns = 1672533296_123_456_789  # ns precision
+            timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)
+
+            # counter_{hi,lo} are chosen so that "counter + 1" does not overflow
+            counter_hi = random.getrandbits(11)
+            counter_lo = random.getrandbits(29)
+            counter = (counter_hi << 30) | counter_lo
+            self.assertLess(counter + 1, 0x3ff_ffff_ffff)
+
+            tail = random.getrandbits(32)
+            random_bits = counter << 32 | tail
+            random_data = random_bits.to_bytes(10)
+
+            with (
+                mock.patch('time.time_ns', return_value=timestamp_ns),
+                mock.patch('os.urandom', return_value=random_data) as urand
+            ):
+                u1 = self.uuid.uuid7()
+                urand.assert_called_once_with(10)
+                equal(self.uuid._last_timestamp_v7, timestamp_ms)
+                equal(self.uuid._last_counter_v7, counter)
+                equal((u1.int >> 64) & 0xfff, counter_hi)
+                equal((u1.int >> 32) & 0x3fff_ffff, counter_lo)
+                equal(u1.int & 0xffff_ffff, tail)
+
+            # 1 Jan 2023 00:34:56.123_457_032 UTC (same millisecond but not same ns)
+            next_timestamp_ns = 1672533296_123_457_032
+            next_timestamp_ms, _ = divmod(next_timestamp_ns, 1_000_000)
+            equal(timestamp_ms, next_timestamp_ms)
+
+            next_tail_bytes = os.urandom(4)
+            next_tail = int.from_bytes(next_tail_bytes)
+
+            with (
+                mock.patch('time.time_ns', return_value=next_timestamp_ns),
+                mock.patch('os.urandom', return_value=next_tail_bytes) as urand
+            ):
+                u2 = self.uuid.uuid7()
+                urand.assert_called_once_with(4)
+                # same millisecond
+                equal(self.uuid._last_timestamp_v7, timestamp_ms)
+                # 42-bit counter advanced by 1
+                equal(self.uuid._last_counter_v7, counter + 1)
+                equal((u2.int >> 64) & 0xfff, counter_hi)
+                equal((u2.int >> 32) & 0x3fff_ffff, counter_lo + 1)
+                equal(u2.int & 0xffff_ffff, next_tail)
+
+            self.assertLess(u1, u2)
+
+    def test_uuid7_timestamp_backwards(self):
+        equal = self.assertEqual
+        # 1 Jan 2023 00:34:56.123_456_789 UTC
+        timestamp_ns = 1672533296_123_456_789  # ns precision
+        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)
+        fake_last_timestamp_v7 = timestamp_ms + 1
+
+        # counter_{hi,lo} are chosen so that "counter + 1" does not overflow
+        counter_hi = random.getrandbits(11)
+        counter_lo = random.getrandbits(29)
+        counter = (counter_hi << 30) | counter_lo
+        self.assertLess(counter + 1, 0x3ff_ffff_ffff)
+
+        tail_bytes = os.urandom(4)
+        tail = int.from_bytes(tail_bytes)
+
+        with (
+            mock.patch.multiple(
+                self.uuid,
+                _last_timestamp_v7=fake_last_timestamp_v7,
+                _last_counter_v7=counter,
+            ),
+            mock.patch('time.time_ns', return_value=timestamp_ns),
+            mock.patch('os.urandom', return_value=tail_bytes) as urand
+        ):
+            u = self.uuid.uuid7()
+            urand.assert_called_once_with(4)
+            equal(u.variant, self.uuid.RFC_4122)
+            equal(u.version, 7)
+            equal(self.uuid._last_timestamp_v7, fake_last_timestamp_v7 + 1)
+            unix_ts_ms = (fake_last_timestamp_v7 + 1) & 0xffff_ffff_ffff
+            equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)
+            # 42-bit counter advanced by 1
+            equal(self.uuid._last_counter_v7, counter + 1)
+            equal((u.int >> 64) & 0xfff, counter_hi)
+            # 42-bit counter advanced by 1 (counter_hi is untouched)
+            equal((u.int >> 32) & 0x3fff_ffff, counter_lo + 1)
+            equal(u.int & 0xffff_ffff, tail)
+
+    def test_uuid7_overflow_counter(self):
+        equal = self.assertEqual
+        # 1 Jan 2023 00:34:56.123_456_789 UTC
+        timestamp_ns = 1672533296_123_456_789  # ns precision
+        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)
+
+        new_counter_hi = random.getrandbits(11)
+        new_counter_lo = random.getrandbits(30)
+        new_counter = (new_counter_hi << 30) | new_counter_lo
+
+        tail = random.getrandbits(32)
+        random_bits = (new_counter << 32) | tail
+        random_data = random_bits.to_bytes(10)
+
+        with (
+            mock.patch.multiple(
+                self.uuid,
+                _last_timestamp_v7=timestamp_ms,
+                # same timestamp, but force an overflow on the counter
+                _last_counter_v7=0x3ff_ffff_ffff,
+            ),
+            mock.patch('time.time_ns', return_value=timestamp_ns),
+            mock.patch('os.urandom', return_value=random_data) as urand
+        ):
+            u = self.uuid.uuid7()
+            urand.assert_called_with(10)
+            equal(u.variant, self.uuid.RFC_4122)
+            equal(u.version, 7)
+            # timestamp advanced due to overflow
+            equal(self.uuid._last_timestamp_v7, timestamp_ms + 1)
+            unix_ts_ms = (timestamp_ms + 1) & 0xffff_ffff_ffff
+            equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)
+            # counter overflowed, so we picked a new one
+            equal(self.uuid._last_counter_v7, new_counter)
+            equal((u.int >> 64) & 0xfff, new_counter_hi)
+            equal((u.int >> 32) & 0x3fff_ffff, new_counter_lo)
+            equal(u.int & 0xffff_ffff, tail)
+
     def test_uuid8(self):
         equal = self.assertEqual
         u = self.uuid.uuid8()
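
The three stateful behaviours exercised by the tests above (counter increment within one millisecond, a clock that moves backwards, and counter overflow) can be summarised in a small model. This is a sketch reconstructed from the test expectations only, not the code added to `Lib/uuid.py`; the class `UUID7Sketch`, its attributes, and the injectable `clock_ns`/`urandom` parameters are assumptions made for illustration.

```python
import os
import time
import uuid

MAX_COUNTER = 0x3ff_ffff_ffff  # largest 42-bit value


class UUID7Sketch:
    """Hypothetical model of the state the tests patch (_last_*_v7)."""

    def __init__(self, clock_ns=time.time_ns, urandom=os.urandom):
        self.clock_ns = clock_ns    # injectable so the clock can be faked
        self.urandom = urandom      # injectable so randomness can be faked
        self.last_timestamp = None  # last millisecond timestamp emitted
        self.last_counter = 0       # last 42-bit counter emitted

    def _fresh_counter_and_tail(self):
        rand = int.from_bytes(self.urandom(10), "big")
        # Keep 41 random bits so the 42-bit counter starts with its MSB at 0.
        return (rand >> 32) & 0x1ff_ffff_ffff, rand & 0xffff_ffff

    def generate(self):
        timestamp_ms = self.clock_ns() // 1_000_000
        if self.last_timestamp is None or timestamp_ms > self.last_timestamp:
            # New millisecond: reseed the counter from 10 random bytes.
            counter, tail = self._fresh_counter_and_tail()
        else:
            if timestamp_ms < self.last_timestamp:
                # Clock went backwards: step just past the last timestamp.
                timestamp_ms = self.last_timestamp + 1
            counter = self.last_counter + 1
            if counter > MAX_COUNTER:
                # Counter exhausted: move to the next millisecond and reseed.
                timestamp_ms += 1
                counter, tail = self._fresh_counter_and_tail()
            else:
                tail = int.from_bytes(self.urandom(4), "big")

        value = (timestamp_ms & 0xffff_ffff_ffff) << 80
        value |= 0x7 << 76                        # version 7
        value |= (counter >> 30) << 64            # upper 12 counter bits
        value |= 0b10 << 62                       # RFC 9562 variant
        value |= (counter & 0x3fff_ffff) << 32    # lower 30 counter bits
        value |= tail
        self.last_timestamp, self.last_counter = timestamp_ms, counter
        return uuid.UUID(int=value)


# Frozen clock and all-zero "random" bytes make the behaviour easy to follow.
gen = UUID7Sketch(clock_ns=lambda: 1672533296_123_456_789,
                  urandom=lambda n: bytes(n))
a, b = gen.generate(), gen.generate()
print(a < b, gen.last_counter)
```

Reserving the counter's most significant bit at reseed time leaves headroom for increments before an overflow forces the timestamp forward, which matches the `(u.int >> 75) & 1` check in `test_uuid7`.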
