Fix issue with transformers library huggingface #11027
Open
Description
I was trying to modify an RLDS dataset built on top of TFDS, following this repo: https://github.com/kpertsch/rlds_dataset_builder
I needed to extract features from images with models from the HuggingFace transformers library, but hit an error during the import:
raise ValueError('{}.__spec__ is None'.format(name)) ValueError: datasets.__spec__ is None
More specifically, it is raised from this line:
transformers/utils/import_utils.py", line 120, in <module> _datasets_available = _is_package_available("datasets")
It checks whether `datasets` (the HF library) is available by looking at its `__spec__` attribute. Because tfds overwrites `datasets` with a mock that does not set this attribute, the check fails. This PR fixes the issue by creating the missing attribute on the mock.
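The failure mode can be reproduced outside of tfds and transformers. The sketch below is not the code from this PR: it assumes the availability check goes through `importlib.util.find_spec` (which is where the `__spec__ is None` message comes from), and it uses a hypothetical module name, `hypothetical_mocked_pkg`, as a stand-in for the mocked `datasets`.

```python
import importlib.machinery
import importlib.util
import sys
import types

# Hypothetical stand-in for the mocked package; not the real `datasets`.
MOCK_NAME = "hypothetical_mocked_pkg"


def probe(name):
    """Return (found, error_message) for importlib.util.find_spec(name)."""
    try:
        return importlib.util.find_spec(name) is not None, None
    except ValueError as exc:
        return False, str(exc)


# A bare module object has __spec__ set to None by default.
mock = types.ModuleType(MOCK_NAME)
sys.modules[MOCK_NAME] = mock
before = probe(MOCK_NAME)  # find_spec raises ValueError here

# Giving the mock a spec is enough for find_spec to succeed.
mock.__spec__ = importlib.machinery.ModuleSpec(MOCK_NAME, loader=None)
after = probe(MOCK_NAME)

# Clean up so the stand-in does not linger in the interpreter.
del sys.modules[MOCK_NAME]

print(before)
print(after)
```

With the spec missing, the probe reports the same `__spec__ is None` error as in the traceback above; once a `ModuleSpec` is attached, the lookup succeeds. How the actual mock in tfds should build its spec may differ from this illustration.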
Versions of the libraries involved:
tensorflow-datasets 4.9.3
transformers 4.50.0.dev0