-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-5061 - Add an API to extend the bson TypeRegistry #2345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bson/codec_options.py
Outdated
if isinstance(codec, TypeEncoder): | ||
self._validate_type_encoder(codec) | ||
is_valid_codec = True | ||
self._encoder_map[codec.python_type] = codec.transform_python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there thread safety concerns here? What happens when add_codec is called when the registry is being used for bson encoding/decoding? What about with the C extensions and/or free threading mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If add_codec
is called while the registry is in use, the new codec added would only be available for encoding/decoding after the add_codec
call completes. Yes, this is technically not thread-safe, but is extending a TypeRegistry
while already encoding/decoding something we actually expect users to do?
If it is, avoiding that race condition would require us to detect if a TypeRegistry
is already in use for encoding/decoding and block add_codec
until the processing completes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose we keep the Registry immutable. After all, CodecOptions is intended to be immmutable and the registry is part of CodecOptions. This new api should return a new registry instance and keep the old one unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason CodecOptions
is intended to be immutable besides the race condition outlined above? Returning a new registry instance would then require users to create a new CodecOptions
with that registry, along with all of the other settings they may have configured in the original. That seems significantly more cumbersome to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mutating is problematic. Consider one section of your app is using CodecOptions, and then another unrelated thread changes some of the settings (eg via add_codec or some other option), that could violate the assumptions of the other code and lead to TypeErrors.
I think I would even prefer an api like this:
orig_codecs = coll.codec_options.type_registry.codecs
new_registry = TypeRegistry([new_codec]+orig_codecs)
new_coll = coll.with_options(codec_options=coll.codec_options.with_options(type_registry=new_registry))
All we need to add is a type_registry.codecs
property. Sure this is cumbersome but it does the job, follows our existing conventions, and should be rarely used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're saying instead of adding an API to add codecs, just add an API to return all of the current codecs? Ignoring the double underscore on TypeRegistry.__type_codecs
, doesn't that already exist and make this change unneeded in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but __type_codecs
is private. We need to expose it in a public api (as well as fallback_encoder
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the change is making those two attributes publicly available. I like that approach.
@ShaneHarvey made a good point that this can be done with fewer potentially far-reaching changes by simply making the following pattern available publicly:
|
doc/changelog.rst
Outdated
-------------------------------------- | ||
PyMongo 4.14 brings a number of changes including: | ||
|
||
- Added :meth:`bson.codec_options.TypeRegistry.codecs` and :meth:`bson.codec_options.TypeRegistry.fallback_encoder` properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, we should use :attr:
https://www.sphinx-doc.org/en/master/usage/domains/python.html#role-py-attr
No description provided.