Support UUIDv6, UUIDv7, and UUIDv8 from RFC 9562 #89083
Three new types of UUIDs have been proposed in the latest draft of the next version of RFC4122. Full text of that draft is in [1] (published 21 April 2021; draft period ends 21 Oct 2021). Support for these should be included in uuid.py for Python 3.11, with backport for 3.9 and 3.10. The timetable for Python 3.11 should fit with the end of the IETF draft period. Implementation should be similar to the existing UUID classes in uuid.py, the prototypes in [2], or even parts of my own uuid6 version [3].
[1] https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format
It is a new feature, and we usually do not backport new features to old Python versions, so it can only be included in Python 3.11 (backports can be provided by third-party libraries). Do you want to create a PR?
Is there anyone currently working on this? If not, I'd like to have a look at implementing this.
Note: the spec for UUIDv6 - UUIDv8 is still a draft, it's still being revised: Therefore, it is too early to add this to the Python standard library.
UUIDv6, UUIDv7, and UUIDv8 are now in a standards-track RFC: |
I'll make a PR for this (I'm interested in those versions).
FYI - there are PyPI packages from people in the community attempting to come up with ways to use UUID v6-8 today:
What we'd be seeking to do within the stdlib is settle upon how these should fit as features into the standard library's existing |
Actually, I first tried an implementation based on those packages, but after reading the RFC again I wondered which course of action is best for the standard library. I therefore picked the (only) possible variant of v6 for which the implementation is RFC-compliant (and then I hit the issue with the fields...). For v7 and v8, I decided to start with the generic variant (and made an alternative for v7 using monotonicity, as specified in the RFC alternatives). I did not decide anything on v8 since that should be discussed first.

Note that oittaa's v7 is more or less like #120650 (non-monotonic sub-second v7) since it follows the basic RFC, whereas Simmons' v7 seems to follow the alternative (Method 3) combined with Method 1, §6.2 (Fixed Bit-Length Dedicated Counter), and #120830 is Method 3 combined with Method 2, §6.2 (Monotonic Random). I say "seems to" because it is not really clear whether the RFC allows mixing Method 1 and Method 3 (Method 1 forces the counter to immediately follow the 48-bit timestamp part, but Method 3 says that the sub-second precision should go in that place, so...). Method 2 explicitly tells me to use the last 62 bits for whatever I need, so it is closer to RFC compliance.

Actually, I found more prototypes last week: https://github.com/uuid6/prototypes, and they tend to differ in their implementations of v7 and v8... For v6, the implementation is decided by the RFC, so we don't need a discussion, only the other issue on the fields. For v7/v8, do you think we need a Discourse thread (different from https://discuss.python.org/t/add-uuid7-in-uuid-module-in-standard-library/44390/7) and perhaps a PEP? There's also https://github.com/uuid-rs/uuid, which uses the same techniques that I presented in the first PR (namely, UUIDv7 has 80-bit security and UUIDv8 has custom chunks).
Co-authored-by: Hugo van Kemenade <[email protected]>
Change 03924b5 added |
improve UUIDv8 uniqueness tests
I've opened a poll on which implementation the community thinks is better: https://discuss.python.org/t/rfc-4122-9562-uuid-version-7-and-8-implementation/56725/3. I would like those interested in UUIDv7 to reply on the thread or vote in the poll if possible. TL;DR: I think it's better to use Method 3 than Method 1 because Method 3 aligns with PostgreSQL, and I suspect UUIDs will mostly be used by microservices (so they would want the same rationale as PostgreSQL).
I would like my implementation to be used, the one offering all the options from the spec as opt-ins, rather than one with non-default presets enabled.
The problem is that the default opt-in is not really what users want. It does not help having only ms precision and 74 bits of randomness. The reason is that it would simply look like UUIDv6 (yet to be merged) with a random clock sequence and a random node address, because that's more or less what it does from an entropy PoV. So the issue is more "what is the best default opt-in" and if we don't have yet the answer to that, adding |
Add support for generating UUIDv6 objects according to RFC 9562, §5.6 [1]. The functionality is provided by the `uuid.uuid6()` function which takes as inputs an optional 48-bit hardware address and an optional 14-bit clock sequence. The UUIDv6 temporal fields are ordered differently than those of UUIDv1, thereby providing improved database locality. [1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.6 --------- Co-authored-by: Hugo van Kemenade <[email protected]> Co-authored-by: Victor Stinner <[email protected]>
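The field reordering described above can be illustrated by rearranging the timestamp of an existing UUIDv1. This is a minimal sketch, not the code that landed in `uuid.uuid6()`; the helper name `uuid1_to_uuid6` is hypothetical. It assumes the v6 layout of 32 high timestamp bits, 16 middle bits, the version nibble, then the 12 low bits, with the variant, clock sequence, and node left untouched.

```python
import uuid


def uuid1_to_uuid6(u1: uuid.UUID) -> uuid.UUID:
    """Hypothetical helper: re-pack a UUIDv1 timestamp in v6 order."""
    ts = u1.time                           # 60-bit count of 100 ns intervals
    time_high = (ts >> 28) & 0xFFFF_FFFF   # most significant 32 bits first
    time_mid = (ts >> 12) & 0xFFFF         # next 16 bits
    time_low = ts & 0x0FFF                 # least significant 12 bits last

    value = time_high << 96
    value |= time_mid << 80
    value |= 6 << 76                       # version 6
    value |= time_low << 64
    # Variant, clock sequence and node occupy the low 64 bits in both layouts.
    value |= u1.int & 0xFFFF_FFFF_FFFF_FFFF
    return uuid.UUID(int=value)


v1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846")
v6 = uuid1_to_uuid6(v1)
```

Because the most significant timestamp bits now come first, v6 values created later compare greater as plain 128-bit integers, which is where the improved database locality comes from.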
Add support for generating UUIDv7 objects according to RFC 9562, §5.7 [1]. The functionality is provided by the `uuid.uuid7()` function. The implementation is based on a 42-bit counter as described by Method 1, §6.2 [2] and guarantees monotonicity within the same millisecond. [1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.7 [2]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.2 --------- Co-authored-by: Hugo van Kemenade <[email protected]> Co-authored-by: Victor Stinner <[email protected]> Co-authored-by: Éric <[email protected]>
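As a rough illustration of the counter approach described above, the following sketch keeps a 42-bit counter that is reseeded on each new millisecond and incremented otherwise. It is a simplified, single-threaded sketch and not the code that landed in `uuid.uuid7()`; the name `sketch_uuid7`, the module-level state, and the choice to reseed with 41 random bits (leaving headroom before overflow) are assumptions made for the example.

```python
import os
import time
import uuid

_last_ts_ms = None   # timestamp used for the previous UUID
_last_counter = 0    # counter used for the previous UUID


def _fresh_counter() -> int:
    # 41 random bits: the top bit of the 42-bit counter starts at zero.
    return int.from_bytes(os.urandom(6), "big") & 0x1FF_FFFF_FFFF


def sketch_uuid7() -> uuid.UUID:
    """Hypothetical helper: counter-based UUIDv7 (not thread-safe)."""
    global _last_ts_ms, _last_counter
    ts_ms = time.time_ns() // 1_000_000
    if _last_ts_ms is None or ts_ms > _last_ts_ms:
        counter = _fresh_counter()
    else:
        # Same millisecond, or the clock moved backwards: keep the last
        # timestamp and advance the counter instead.
        ts_ms = _last_ts_ms
        counter = _last_counter + 1
        if counter > 0x3FF_FFFF_FFFF:      # 42-bit overflow
            ts_ms += 1
            counter = _fresh_counter()
    _last_ts_ms, _last_counter = ts_ms, counter

    tail = int.from_bytes(os.urandom(4), "big")
    value = (ts_ms & 0xFFFF_FFFF_FFFF) << 80
    value |= 7 << 76                           # version 7
    value |= (counter >> 30) << 64             # 12 counter MSBs
    value |= 0b10 << 62                        # RFC 4122 variant bits
    value |= (counter & 0x3FFF_FFFF) << 32     # 30 counter LSBs
    value |= tail                              # 32 random bits
    return uuid.UUID(int=value)
```

Since the counter sits above the random tail, two values produced in the same millisecond always compare in generation order, regardless of the tail.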
The first implementation of UUIDv6/7/8 has landed. We will come back later if people want to have more than just the simple implementation for the standard library, but the next alpha will be useful to gather feedback! I thank everyone who was involved in the discussion as it was definitely helpful.
Oh, noes! 3h too late for an important question! 🙀 😉 One use-case for UUID v7 is using it as PK in databases. Since it is time based, it can also be used as partition key (e.g., to use one partition for each day). In order to calculate the partition range, you need to calculate the "minimal" UUID for a given date (i.e., I'm totally fine with |
What I want is something like this:

    import datetime
    from uuid import UUID

    # Version (7) and variant (RFC 4122) bits. Assumed to match the uuid
    # module's private constant of the same name; redefined here so that
    # the snippet is self-contained.
    _RFC_4122_VERSION_7_FLAGS = (7 << 76) | (0x8000 << 48)

    def min_uuid7(date: datetime.datetime | None) -> UUID:
        # This is just for convenience and could be left out
        if date is None:
            today = datetime.date.today()
            date = datetime.datetime(
                today.year, today.month, today.day, tzinfo=datetime.UTC
            )
        # Provide a custom timestamp and a custom counter and tail
        timestamp_ms = int(date.timestamp() * 1_000)
        counter, tail = 0, 0
        # The remainder is the same as in uuid7():
        unix_ts_ms = timestamp_ms & 0xFFFF_FFFF_FFFF
        counter_msbs = counter >> 30
        # keep 12 counter's MSBs and clear version bits
        counter_hi = counter_msbs & 0x0FFF
        # keep 30 counter's LSBs and clear variant bits
        counter_lo = counter & 0x3FFF_FFFF
        # ensure that the tail is always a 32-bit integer (by construction,
        # it is already the case, but future interfaces may allow the user
        # to specify the random tail)
        tail &= 0xFFFF_FFFF
        int_uuid_7 = unix_ts_ms << 80
        int_uuid_7 |= counter_hi << 64
        int_uuid_7 |= counter_lo << 32
        int_uuid_7 |= tail
        # by construction, the variant and version bits are already cleared
        int_uuid_7 |= _RFC_4122_VERSION_7_FLAGS
        return UUID(int=int_uuid_7)
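For the partitioning use case, the lower bound for one day and the lower bound for the next day give a half-open range. The following is a compact sketch under the same layout assumptions (zero counter and zero tail); the name `uuid7_floor` is hypothetical and is not part of the `uuid` module.

```python
import datetime
import uuid


def uuid7_floor(moment: datetime.datetime) -> uuid.UUID:
    """Hypothetical helper: smallest v7 value for a given instant."""
    ts_ms = int(moment.timestamp() * 1_000) & 0xFFFF_FFFF_FFFF
    # Timestamp, version 7, RFC 4122 variant; counter and tail stay zero.
    return uuid.UUID(int=(ts_ms << 80) | (7 << 76) | (0b10 << 62))


day = datetime.datetime(2025, 1, 1, tzinfo=datetime.timezone.utc)
lower = uuid7_floor(day)                                # inclusive bound
upper = uuid7_floor(day + datetime.timedelta(days=1))   # exclusive bound
```

Any UUIDv7 whose timestamp falls on that day then satisfies `lower <= value < upper`, which is the form a range partition typically expects.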
I suggest opening a new issue for your feature request.
Related
`fields` and `time_*` properties must not be used on UUIDs that are time-agnostic. #120878