Skip to content

gh-93820: Pickle enum.Flag by name #93891

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 26, 2022

Conversation

serhiy-storchaka
Copy link
Member

@tiran
Copy link
Member

tiran commented Jun 16, 2022

A Flag instance with multiple members is serialized like this:

>>> (re.A | re.I | re.DOTALL | 0x1000).__reduce_ex__(1)
(<built-in function or_>, (re.ASCII, re.IGNORECASE|re.DOTALL|0x1000))

It looks like only the first member is reduced by name.

@serhiy-storchaka
Copy link
Member Author

The rest is reduced recursively.

>>> pickletools.dis(pickletools.optimize(pickle.dumps(re.A | re.I | re.DOTALL | 0x1000, protocol=3)))
    0: \x80 PROTO      3
    2: c    GLOBAL     '_operator or_'
   17: q    BINPUT     0
   19: c    GLOBAL     'builtins getattr'
   37: q    BINPUT     1
   39: c    GLOBAL     're RegexFlag'
   53: q    BINPUT     2
   55: X    BINUNICODE 'ASCII'
   65: \x86 TUPLE2
   66: R    REDUCE
   67: h    BINGET     0
   69: h    BINGET     1
   71: h    BINGET     2
   73: X    BINUNICODE 'IGNORECASE'
   88: \x86 TUPLE2
   89: R    REDUCE
   90: h    BINGET     0
   92: h    BINGET     1
   94: h    BINGET     2
   96: X    BINUNICODE 'DOTALL'
  107: \x86 TUPLE2
  108: R    REDUCE
  109: h    BINGET     2
  111: M    BININT2    4096
  114: \x85 TUPLE1
  115: R    REDUCE
  116: \x86 TUPLE2
  117: R    REDUCE
  118: \x86 TUPLE2
  119: R    REDUCE
  120: \x86 TUPLE2
  121: R    REDUCE
  122: .    STOP
highest protocol among opcodes = 2

@tiran
Copy link
Member

tiran commented Jun 16, 2022

copy.copy does not recurse. Is that ok?

>>> copy.copy(re.A | re.I | re.DOTALL | 0x1000)
re.ASCII 4114
re.ASCII|re.IGNORECASE|re.DOTALL|0x1000
>>> copy.deepcopy(re.A | re.I | re.DOTALL | 0x1000)
re.ASCII 4114
re.ASCII 0
re.IGNORECASE 4112
re.IGNORECASE 0
re.DOTALL 4096
re.DOTALL 0
re.ASCII|re.IGNORECASE|re.DOTALL|0x1000

output with print(m, rest) in __reduce_ex__.

@serhiy-storchaka
Copy link
Member Author

I think that it is okay. Enums are considered singletons (it allows to pickle them by name). It would be even better to add the __copy__ method which returns self.

This PR is not ideal:

  1. The residual member is reduced as re.RegexFlag(0x1000) instead of just 0x1000.
  2. Named flags are reduced as getattr(re.RegexFlag, 'ASCII') instead of global re.RegexFlag.ASCII or even re.ASCII.

It is a matter of the following optimizations.

@serhiy-storchaka serhiy-storchaka merged commit 5369858 into python:main Jun 26, 2022
@miss-islington
Copy link
Contributor

Thanks @serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@serhiy-storchaka serhiy-storchaka deleted the enum-flags-pickle branch June 26, 2022 07:54
@miss-islington
Copy link
Contributor

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 536985814a7116f14c9bc90aa1b3e3d36d5b2367 3.11

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Jun 26, 2022
(cherry picked from commit 5369858)

Co-authored-by: Serhiy Storchaka <[email protected]>
@bedevere-bot
Copy link

GH-94288 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jun 26, 2022
serhiy-storchaka added a commit that referenced this pull request Jun 26, 2022
(cherry picked from commit 5369858)

Co-authored-by: Serhiy Storchaka <[email protected]>
@serhiy-storchaka serhiy-storchaka removed their assignment Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants