Add support for different TFDS BuilderConfigs #1239

Merged (1 commit) on Apr 28, 2022

Conversation

@dedeswim dedeswim commented Apr 28, 2022

Currently a user can't select a TFDS BuilderConfig other than the default one: the parser assumes that the dataset name has the form {torch,tfds,...}/dataset, whereas for TFDS datasets it can also be tfds/dataset/builder-type, e.g. tfds/diabetic_retinopathy_detection/btgraham-300.

This PR fixes this by taking everything after the first element of the split at line 10 (the first element is the source of the dataset) and joining it back together with /.

If there is just one /, then name[1:] contains a single element: e.g., if we pass tfds/imagenet2012, then name[1:] == ["imagenet2012"] and "/".join(name[1:]) == "imagenet2012", so the current behavior is unchanged. If instead there are multiple / (e.g. "tfds/diabetic_retinopathy_detection/btgraham-300"), then name[1:] == ["diabetic_retinopathy_detection", "btgraham-300"] and "/".join(name[1:]) == "diabetic_retinopathy_detection/btgraham-300", which is compatible with the tfds format.
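For illustration, the split-and-rejoin logic described above can be sketched as a standalone function. This is a minimal sketch, not the code in the diff: the function name split_dataset_name and the empty-string prefix for names without a / are assumptions made for this example.

```python
from typing import Tuple


def split_dataset_name(name: str) -> Tuple[str, str]:
    """Split 'source/dataset[/config]' into (source, 'dataset[/config]').

    Hypothetical helper for illustration only; the real parser may differ.
    """
    parts = name.split("/")
    if len(parts) > 1:
        # The first element is the dataset source (e.g. "tfds"); everything
        # after it is joined back with "/" so a builder config is preserved.
        return parts[0], "/".join(parts[1:])
    # No "/" at all: assume there is no source prefix (an assumption here).
    return "", parts[0]


print(split_dataset_name("tfds/imagenet2012"))
print(split_dataset_name("tfds/diabetic_retinopathy_detection/btgraham-300"))
```

With a single / the second element is just the dataset name, and with several it keeps the dataset/config form intact.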

@dedeswim
Author

I expanded the PR description a bit to clarify how and why this change works.

@rwightman rwightman merged commit 9c321be into huggingface:bits_and_tpu Apr 28, 2022