This repository was archived by the owner on Apr 23, 2025. It is now read-only.

Migrating the COCO dataset to Epochs #606

Merged 6 commits on Jun 15, 2020

Conversation

BradLarson
Contributor

In response to part of issue #592, this migrates the COCO dataset to the Epochs API. Further cleanup might be required to better align the lazy object detection pipeline with Epochs and its new capabilities, but this provides the same functionality as before and removes deprecation warnings.

As this was the last use of TensorPair's _Collatable functionality, that has been removed. Batcher dependencies have also been removed from the Datasets module.

Additionally, copyright headers were missing from several files and have been added.

@BradLarson BradLarson requested review from xihui-wu and shabalind June 12, 2020 23:45
import Foundation
import TensorFlow

public struct COCODataset<Entropy: RandomNumberGenerator> {
Contributor

Previously, the dataset exposed the underlying array of [ObjectDetectionExample], which was useful for custom preprocessing before the examples were handed to the batcher/epochs. I wonder if this PR can preserve that functionality.

Contributor Author

For custom preprocessing before delivery to anything downstream, makeBatch() is the intended customization point. In fact, I think we'll eventually want to shift from using the LazyImage type to just providing a URL and having makeBatch() perform all necessary processing at time of use. The Imagenette dataset does this, for example: images are lazily loaded from their input URLs at the point of makeBatch().
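As a rough illustration of the "store a URL, load in makeBatch()" idea, here is a minimal sketch. It is not the Imagenette or COCO code: `URLSample`, `LoadedBatch`, and the injected `load` closure are hypothetical names made up for this example, and raw `Data` stands in for a decoded image tensor.

```swift
import Foundation

// Hypothetical stand-in for a dataset sample: only the URL is stored up front.
struct URLSample {
    let url: URL
    let label: Int
}

// Hypothetical stand-in for a collated batch; raw bytes stand in for image tensors.
struct LoadedBatch {
    let data: [Data]
    let labels: [Int]
}

// Loading happens here, at the point of use, rather than at dataset creation.
// `load` is injected so the sketch does not assume any particular image decoder.
func makeBatch<C: Collection>(
    samples: C,
    load: (URL) throws -> Data
) rethrows -> LoadedBatch where C.Element == URLSample {
    LoadedBatch(
        data: try samples.map { try load($0.url) },
        labels: samples.map { $0.label })
}
```

With this shape, nothing is read from disk until a batch is actually requested, and any per-example processing can live inside the `load` closure (or inside makeBatch() itself).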

What if I added a settable mapping function of ObjectDetectionExample -> ObjectDetectionExample that was called within makeBatch(), where any custom preprocessing could be specified for a given instance of the dataset? Or makeBatch() itself could be a user-provided function.

Contributor Author

I added a new transform parameter to the dataset creation that allows a custom ObjectDetectionExample -> ObjectDetectionExample mapping to be applied to each example. This happens within makeBatch(), and the default is an identity mapping. Again, this provides a starting point for working with the dataset, but I think we'll want to do a more thorough reorganization later to take full advantage of the Epochs design. That will be motivated by the examples you're working on.
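To make the shape of that concrete, here is a simplified sketch of a dataset with an identity-by-default transform applied inside makeBatch(). The `Example` and `ExampleDataset` types are hypothetical stand-ins, not the actual `ObjectDetectionExample` or `COCODataset` definitions, and the `Entropy` generic parameter and TensorFlow dependency are left out so the snippet stands alone.

```swift
// Hypothetical, simplified stand-in for ObjectDetectionExample.
struct Example: Equatable {
    var imageName: String
    var boxes: [[Float]]  // each box as [x, y, width, height]
}

// Hypothetical, simplified stand-in for the dataset type.
struct ExampleDataset {
    let examples: [Example]
    let transform: (Example) -> Example

    // The transform defaults to an identity mapping, so existing callers are unaffected.
    init(
        examples: [Example],
        transform: @escaping (Example) -> Example = { $0 }
    ) {
        self.examples = examples
        self.transform = transform
    }

    // The transform is applied per example at batch-creation time.
    func makeBatch<C: Collection>(samples: C) -> [Example] where C.Element == Example {
        samples.map(transform)
    }
}

// Usage: drop degenerate boxes (zero width or height) as custom preprocessing.
let raw = [Example(imageName: "a.jpg", boxes: [[0, 0, 10, 10], [5, 5, 0, 3]])]
let dataset = ExampleDataset(examples: raw) { example in
    var cleaned = example
    cleaned.boxes = example.boxes.filter { $0[2] > 0 && $0[3] > 0 }
    return cleaned
}
let batch = dataset.makeBatch(samples: dataset.examples)
```

Because the transform is stored per instance, two datasets built from the same examples can apply different preprocessing without touching the shared pipeline.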

For now, I'd like to make sure we've preserved functionality while cleaning up the last deprecation warnings before another stable branch cut.

@BradLarson BradLarson merged commit 0f11d8d into tensorflow:master Jun 15, 2020
@BradLarson BradLarson deleted the COCOEpochs branch July 7, 2020 15:16

3 participants