Support UUIDv6, UUIDv7, and UUIDv8 from RFC 9562 #89083

Closed
stevesimmons mannequin opened this issue Aug 15, 2021 · 49 comments
Labels
stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@stevesimmons
Mannequin

stevesimmons mannequin commented Aug 15, 2021

BPO 44920
Nosy @serhiy-storchaka, @loganasherjones, @stevesimmons

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-08-15.17:26:05.956>
labels = ['type-feature', 'library', '3.11']
title = 'Support UUIDv6, UUIDv7, and UUIDv8 from the new version of RFC4122'
updated_at = <Date 2021-08-17.00:36:53.563>
user = 'https://github.com/stevesimmons'

bugs.python.org fields:

activity = <Date 2021-08-17.00:36:53.563>
actor = 'loganasherjones'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2021-08-15.17:26:05.956>
creator = 'stevesimmons'
dependencies = []
files = []
hgrepos = []
issue_num = 44920
keywords = []
message_count = 2.0
messages = ['399624', '399647']
nosy_count = 3.0
nosy_names = ['serhiy.storchaka', 'loganasherjones', 'stevesimmons']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue44920'
versions = ['Python 3.11']

@stevesimmons
Mannequin Author

stevesimmons mannequin commented Aug 15, 2021

Three new types of UUIDs have been proposed in the latest draft of the next version of RFC4122. Full text of that draft is in [1] (published 21 April 2021; draft period ends 21 Oct 2021).

Support for these should be included in uuid.py for Python 3.11, with backport for 3.9 and 3.10. The timetable for Python 3.11 should fit with the end of the IETF draft period.

Implementation should be similar to the existing UUID classes in uuid.py, the prototypes in [2], or even parts of my own uuid6 version [3].

[1] https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format
[2] https://github.com/uuid6/prototypes/tree/main/python
[3] https://github.com/stevesimmons/pyuuid6/blob/main/uuid6.py

@stevesimmons stevesimmons mannequin added the 3.11 (only security fixes), stdlib (Python modules in the Lib dir), and type-feature (A feature request or enhancement) labels Aug 15, 2021
@serhiy-storchaka
Member

It is a new feature, and we usually do not backport new features to old Python versions, so it can only be included in Python 3.11 (backports can be provided by third-party libraries). Do you want to create a PR?

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@freundTech
Contributor

Is there anyone currently working on this?

If not, I'd like to have a look at implementing this.

@AlexWaygood AlexWaygood added the 3.12 (only security fixes) label and removed the 3.11 (only security fixes) label Sep 8, 2022
@ambv
Contributor

ambv commented Oct 15, 2023

Note: the spec for UUIDv6 - UUIDv8 is still a draft, and it is still being revised:
https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis

Therefore, it is too early to add this to the Python standard library.

@erlend-aasland erlend-aasland added the 3.13 (bugs and security fixes) label and removed the 3.12 (only security fixes) label Jan 5, 2024
@shane-ns1

UUIDv6, UUIDv7, and UUIDv8 are now in a standards-track RFC:

https://www.rfc-editor.org/rfc/rfc9562

@picnixz
Member

picnixz commented Jun 17, 2024

I'll make a PR for this (I'm interested in those versions).

@hugovk hugovk added the 3.14 (bugs and security fixes) label and removed the 3.13 (bugs and security fixes) label Jun 17, 2024
@hugovk hugovk changed the title from "Support UUIDv6, UUIDv7, and UUIDv8 from the new version of RFC4122" to "Support UUIDv6, UUIDv7, and UUIDv8 from RFC 9562" Jun 17, 2024
@picnixz

This comment was marked as resolved.

@gpshead
Member

gpshead commented Jun 22, 2024

FYI - there are PyPI packages from people in the community attempting to come up with ways to use UUID v6-8 today:

What we'd be seeking to do within the stdlib is settle on how these should fit as features into the standard library's existing uuid APIs (here and in your other, fields-related issue).

@picnixz
Member

picnixz commented Jun 22, 2024

Actually, I first tried an implementation based on those packages, but after reading the RFC again I wondered which course of action is best for the standard library. I therefore picked the (only) variant of v6 whose implementation is RFC-compliant (and then hit the issue with the fields). For v7 and v8, I first took the generic variant, and made an alternative for v7 that uses monotonicity as specified in the RFC alternatives. I did not decide anything on v8, since it should be discussed first.

Note that oittaa's v7 is more or less like #120650 (non-monotonic sub-sec v7) since it follows the basic RFC, but Simmons' v7 seems to follow the alternative (Method 3) combined with Method 1, §6.2 (Fixed Bit-Length Dedicated Counter), whereas #120830 is Method 3 combined with Method 2, §6.2 (Monotonic Random). I say "seems to" because it's not really clear whether the RFC allows mixing Method 1 and Method 3: Method 1 forces the counter to immediately follow the 48-bit timestamp part, but Method 3 says that the sub-second precision should go in that same place. Method 2 explicitly tells me to use the last 62 bits for whatever I need, so it's closer to RFC compliance.

Actually, I found more prototypes last week: https://github.com/uuid6/prototypes, and they tend to differ in their implementations of v7 and v8. For v6, the implementation is decided by the RFC, so we don't need a discussion, just the other issue on the fields. For v7/v8, do you think we need a Discourse thread (different from https://discuss.python.org/t/add-uuid7-in-uuid-module-in-standard-library/44390/7) and perhaps a PEP?

There's also https://github.com/uuid-rs/uuid which uses the same techniques that I presented in the first PR (namely, UUIDv7 has 80-bit security and UUIDv8 has custom chunks).
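To make the Method 3 idea discussed above concrete, here is a minimal sketch, assuming the 12 bits after the version field carry the sub-millisecond fraction as described in RFC 9562, §6.2, Method 3. The function name `method3_uuid7` and its parameters are hypothetical and do not correspond to the code in either PR.

```python
import uuid


def method3_uuid7(unix_ns: int, rand_b: int) -> uuid.UUID:
    """Hypothetical helper: pack a UUIDv7 with a 12-bit sub-ms fraction."""
    # Split the nanosecond timestamp into whole milliseconds and the rest.
    unix_ts_ms, sub_ms_ns = divmod(unix_ns, 1_000_000)
    # Scale the sub-millisecond part (0..999_999 ns) to 12 bits (0..4095).
    sub_ms = sub_ms_ns * 4096 // 1_000_000

    value = (unix_ts_ms & 0xFFFF_FFFF_FFFF) << 80  # 48-bit timestamp
    value |= 0x7 << 76                             # version 7
    value |= sub_ms << 64                          # 12-bit fraction (rand_a)
    value |= 0b10 << 62                            # RFC variant
    value |= rand_b & 0x3FFF_FFFF_FFFF_FFFF        # 62 bits (rand_b)
    return uuid.UUID(int=value)
```

A caller would typically pass `time.time_ns()` and `secrets.randbits(62)`. Note that this sketch alone does not guarantee monotonicity when two UUIDs fall in the same 1/4096 ms slot.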

vstinner pushed a commit that referenced this issue Nov 12, 2024
@vstinner
Member

Change 03924b5 added uuid.uuid8().
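For readers unfamiliar with the version 8 layout, the following is a rough sketch of how three custom chunks fit around the version and variant bits per RFC 9562, §5.8. The helper name `pack_uuid8` is hypothetical; it is an illustration written for this thread, not the code from the commit above.

```python
import uuid


def pack_uuid8(a: int, b: int, c: int) -> uuid.UUID:
    """Hypothetical helper: pack custom_a/custom_b/custom_c into a UUIDv8."""
    value = (a & 0xFFFF_FFFF_FFFF) << 80       # custom_a: 48 bits
    value |= 0x8 << 76                         # version 8
    value |= (b & 0x0FFF) << 64                # custom_b: 12 bits
    value |= 0b10 << 62                        # RFC variant
    value |= c & 0x3FFF_FFFF_FFFF_FFFF         # custom_c: 62 bits
    return uuid.UUID(int=value)
```

Inputs wider than their field are silently truncated here; a stricter variant could raise `ValueError` instead.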

picnixz added a commit to picnixz/cpython that referenced this issue Dec 8, 2024
picnixz added a commit to picnixz/cpython that referenced this issue Dec 8, 2024
encukou pushed a commit that referenced this issue Jan 3, 2025
WolframAlph pushed a commit to WolframAlph/cpython that referenced this issue Jan 4, 2025
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this issue Jan 8, 2025
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
@picnixz
Member

picnixz commented Jan 19, 2025

I've opened a poll on which implementation the community thinks is better:

https://discuss.python.org/t/rfc-4122-9562-uuid-version-7-and-8-implementation/56725/3

I would like those interested in UUIDv7 to reply on the thread / poll if possible.

TL;DR: I think it's better to use Method 3 than Method 1 because Method 3 aligns with PostgreSQL and I suspect UUIDs will likely be used more by microservices in general (so they would want the same rationale as PostgreSQL).

@piranna

piranna commented Jan 19, 2025

I would like my implementation to be used: the one offering all the options in the spec as opt-ins, instead of one that enables non-default presets.

@picnixz
Member

picnixz commented Jan 19, 2025

The problem is that the default opt-in is not really what users want. Having only ms precision and 74 bits of randomness does not help: from an entropy point of view, it would simply look like UUIDv6 (yet to be merged) with a random clock sequence and a random node address. So the issue is rather "what is the best default opt-in?", and if we don't yet have the answer to that, adding uuid7() wouldn't be very helpful for the standard library.

picnixz added a commit that referenced this issue Mar 2, 2025
Add support for generating UUIDv6 objects according to RFC 9562, §5.6 [1].

The functionality is provided by the `uuid.uuid6()` function which takes as inputs an optional 48-bit
hardware address and an optional 14-bit clock sequence. The UUIDv6 temporal fields are ordered
differently than those of UUIDv1, thereby providing improved database locality.

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.6

---------

Co-authored-by: Hugo van Kemenade <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>
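The field reordering mentioned in this commit message can be illustrated with a small sketch that converts a UUIDv1 into the corresponding UUIDv6 layout from RFC 9562, §5.6. The function name `v1_to_v6` is hypothetical and is not part of the `uuid` module; it only relies on `UUID.time`, `UUID.clock_seq`, and `UUID.node`, which already exist for version 1 UUIDs.

```python
import uuid


def v1_to_v6(u: uuid.UUID) -> uuid.UUID:
    """Hypothetical helper: re-pack a UUIDv1 timestamp in UUIDv6 order."""
    if u.version != 1:
        raise ValueError("expected a version 1 UUID")
    ts = u.time  # 60-bit count of 100 ns intervals since 1582-10-15

    # UUIDv6 stores the timestamp most-significant bits first, so that
    # byte-wise sorting of the UUID matches chronological order.
    time_high = (ts >> 28) & 0xFFFF_FFFF   # top 32 bits
    time_mid = (ts >> 12) & 0xFFFF         # next 16 bits
    time_low = ts & 0x0FFF                 # last 12 bits

    value = time_high << 96
    value |= time_mid << 80
    value |= 0x6 << 76                     # version 6
    value |= time_low << 64
    value |= 0b10 << 62                    # RFC variant
    value |= (u.clock_seq & 0x3FFF) << 48  # 14-bit clock sequence
    value |= u.node & 0xFFFF_FFFF_FFFF     # 48-bit node
    return uuid.UUID(int=value)
```

Applied to the UUIDv1 test vector from RFC 9562, Appendix A (`c232ab00-9414-11ec-b3c8-9f6bdeced846`), this yields the UUIDv6 test vector `1ec9414c-232a-6b00-b3c8-9f6bdeced846`.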
picnixz added a commit that referenced this issue Mar 4, 2025
Add support for generating UUIDv7 objects according to RFC 9562, §5.7 [1].

The functionality is provided by the `uuid.uuid7()` function. The implementation
is based on a 42-bit counter as described by Method 1, §6.2 [2] and guarantees
monotonicity within the same millisecond.

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.7
[2]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.2

---------

Co-authored-by: Hugo van Kemenade <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>
Co-authored-by: Éric <[email protected]>
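As a rough illustration of the counter-based approach this commit message describes, here is a minimal, non-thread-safe sketch of a 42-bit counter in the style of Method 1. The names `sketch_uuid7` and `_fresh_counter` are hypothetical, and the choice to reseed the counter with 41 random bits (leaving the top bit clear as headroom) is an assumption of this sketch rather than a statement about the merged code.

```python
import os
import time
import uuid

_last_ms = None  # timestamp (ms) used for the previous UUID
_counter = 0     # 42-bit counter used for the previous UUID


def _fresh_counter() -> int:
    # 41 random bits: the top counter bit stays clear as rollover headroom.
    return int.from_bytes(os.urandom(6), "big") & 0x1FF_FFFF_FFFF


def sketch_uuid7() -> uuid.UUID:
    """Hypothetical generator: monotonic UUIDv7 within one process."""
    global _last_ms, _counter
    now_ms = time.time_ns() // 1_000_000
    if _last_ms is None or now_ms > _last_ms:
        _last_ms, _counter = now_ms, _fresh_counter()
    else:
        # Same millisecond (or the clock moved backwards): keep the last
        # timestamp and advance the counter instead.
        _counter += 1
        if _counter > 0x3FF_FFFF_FFFF:  # counter overflow: bump timestamp
            _last_ms, _counter = _last_ms + 1, _fresh_counter()

    tail = int.from_bytes(os.urandom(4), "big")     # 32 random bits
    value = (_last_ms & 0xFFFF_FFFF_FFFF) << 80     # 48-bit unix_ts_ms
    value |= 0x7 << 76                              # version 7
    value |= ((_counter >> 30) & 0x0FFF) << 64      # 12 counter MSBs
    value |= 0b10 << 62                             # RFC variant
    value |= (_counter & 0x3FFF_FFFF) << 32         # 30 counter LSBs
    value |= tail
    return uuid.UUID(int=value)
```

Because the counter sits above the random tail, consecutive results compare as strictly increasing integers even when many are generated within the same millisecond.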
@picnixz
Member

picnixz commented Mar 4, 2025

The first implementation of UUIDv6/7/8 has landed. We will come back later if people want to have more than just the simple implementation for the standard library, but the next alpha will be useful to gather feedback! I thank everyone that was involved in the discussion as it was definitely helpful.

@picnixz picnixz closed this as completed Mar 4, 2025
@sscherfke

Oh, noes! 3h too late for an important question! 🙀 😉

One use-case for UUID v7 is using it as a PK in databases. Since it is time-based, it can also be used as a partition key (e.g., to use one partition for each day). In order to calculate the partition range, you need to calculate the "minimal" UUID for a given date (i.e., take YYYY-MM-DD 00:00:00 and use all zeros for the random bits => 068a7b30-0000-7000-8000-000000000000).

I'm totally fine with uuid.uuid7() not taking any arguments, but it would be cool if the building blocks for generating a UUID v7 based on custom unix_ts_ms, counter_hi, counter_lo, random could be exposed as well.

@sscherfke

What I want is something like this:

import datetime
from uuid import UUID

# Version (7) and variant (0b10) bits. Defined locally so the snippet runs
# on its own; assumed to match the private constant of the same name in
# uuid.py.
_RFC_4122_VERSION_7_FLAGS = (7 << 76) | (0x8000 << 48)


def min_uuid7(date: datetime.datetime | None = None) -> UUID:
    # This is just for convenience and could be left out
    if date is None:
        # midnight (UTC) of the current UTC day
        date = datetime.datetime.now(datetime.UTC).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
    # Provide a custom timestamp and a custom counter and tail
    timestamp_ms = int(date.timestamp() * 1_000)
    counter, tail = 0, 0

    # The remainder is the same as in uuid7():
    unix_ts_ms = timestamp_ms & 0xFFFF_FFFF_FFFF
    counter_msbs = counter >> 30
    # keep 12 counter's MSBs and clear version bits
    counter_hi = counter_msbs & 0x0FFF
    # keep 30 counter's LSBs and clear variant bits
    counter_lo = counter & 0x3FFF_FFFF
    # ensure that the tail is always a 32-bit integer (by construction,
    # it is already the case, but future interfaces may allow the user
    # to specify the random tail)
    tail &= 0xFFFF_FFFF

    int_uuid_7 = unix_ts_ms << 80
    int_uuid_7 |= counter_hi << 64
    int_uuid_7 |= counter_lo << 32
    int_uuid_7 |= tail
    # by construction, the variant and version bits are already cleared
    int_uuid_7 |= _RFC_4122_VERSION_7_FLAGS
    return UUID(int=int_uuid_7)
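For routing a row to its partition, the opposite direction is also useful. Below is a small sketch, assuming the standard RFC 9562 layout where the top 48 bits hold the Unix timestamp in milliseconds; the function name `uuid7_datetime` is hypothetical and not part of the `uuid` module.

```python
import datetime
import uuid

_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def uuid7_datetime(u: uuid.UUID) -> datetime.datetime:
    """Hypothetical helper: recover the UTC timestamp embedded in a UUIDv7."""
    if u.version != 7:
        raise ValueError("expected a version 7 UUID")
    unix_ts_ms = u.int >> 80  # top 48 bits of the 128-bit value
    # Use timedelta arithmetic to avoid float rounding of the milliseconds.
    return _EPOCH + datetime.timedelta(milliseconds=unix_ts_ms)
```

For the UUIDv7 test vector from RFC 9562, Appendix A (`017f22e2-79b0-7cc3-98c4-dc0c0c07398f`), this returns 2022-02-22 19:22:22 UTC, and calling `.date()` on the result gives the daily partition it belongs to.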

@vstinner
Member

vstinner commented Mar 4, 2025

I suggest opening a new issue for your feature request.

seehwan pushed a commit to seehwan/cpython that referenced this issue Apr 16, 2025
seehwan pushed a commit to seehwan/cpython that referenced this issue Apr 16, 2025