
Add MNIST training demo with layers API. #8


Merged
nsthorat merged 4 commits into master from mnist on Mar 14, 2018

Conversation

@nsthorat commented Mar 14, 2018

This is the same model as the mnist-core demo.


@nsthorat requested review from dsmilkov and caisq March 14, 2018 02:15
@dsmilkov (Contributor)

:lgtm_strong:


Reviewed 11 of 11 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


mnist/data.js, line 77 at r1 (raw file):

            // All channels hold an equal value since the image is grayscale, so
            // just read the red channel.
            datasetBytesView[j] = imageData.data[j * 4] / 255;

just confirming, this line makes the pixels fall in [0, 1], right?


mnist/index.js, line 54 at r1 (raw file):

});

const optimizer = new tf.optimizers.SGD({lr: LEARNING_RATE});

this will need to be updated once the optimizers change merges, right?


mnist/index.js, line 70 at r1 (raw file):

      x: batch.xs.reshape([BATCH_SIZE, 28, 28, 1]),
      y: batch.labels,
      batchSize: BATCH_SIZE,

why does it need to know the batchSize? Is model.fit smart enough to do its own batching, given the whole dataset? If yes, I see why we don't want to pass the whole dataset yet. Ideally we want to give it a stream and have model.fit dispose older batches as it consumes new ones. It will all come together with tf.data.


mnist/ui.js, line 18 at r1 (raw file):

 */

import * as tf from '@tensorflow/tfjs';

is this unused?


mnist-core/data.js, line 46 at r1 (raw file):

  }

  async load() {

changed your mind? :) I'm 100% OK either way.



@davidsoergel (Member)

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


mnist/data.js, line 3 at r1 (raw file):

/**
 * @license
 * Copyright 2018 Google Inc. All Rights Reserved.

Google LLC (here and throughout)


mnist/index.js, line 70 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

why does it need to know the batchSize? Is model.fit smart enough to do its own batching, given the whole dataset? If yes, I see why we don't want to pass the whole dataset yet. Ideally we want to give it a stream and have model.fit dispose older batches as it consumes new ones. It will all come together with tf.data.

Model.fit() will indeed do its own batching and training loop, but since you already have a different batching approach here, this is fine as-is, too. +1, tfjs-data will solve all this. I already have the core API version of this demo working that way (i.e., replaced data.js).



@davidsoergel (Member)

:lgtm_strong:


Reviewed 11 of 11 files at r1.
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.



@nsthorat (Author)

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


mnist/index.js, line 54 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

this will need to be updated once the optimizers change merges, right?

Yeah, once Shanqing's change makes it through our npm build and into tfjs-examples :) Added a comment.
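
(For reference, a minimal sketch of the factory style this comment anticipates, assuming the optimizer factory lands as tf.train.sgd; the 0.15 learning rate is just an illustrative value:)

    import * as tf from '@tensorflow/tfjs';

    const LEARNING_RATE = 0.15;  // illustrative value

    // Would replace `new tf.optimizers.SGD({lr: LEARNING_RATE})` once
    // the optimizers change is published to npm.
    const optimizer = tf.train.sgd(LEARNING_RATE);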


mnist/index.js, line 70 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

Model.fit() will indeed do its own batching and training loop, but since you already have a different batching approach here, this is fine as-is, too. +1, tfjs-data will solve all this. I already have the core API version of this demo working that way (i.e., replaced data.js).

batchSize defaults to 32 if you pass nothing, so it's effectively required. Generally speaking, the way you're supposed to use fit is either to pass the entire dataset as x & y Tensors (internally, fit slices randomly to make batches) or to use a "generator" function, which isn't implemented yet and would be the tf.data streams approach.

This means the entire dataset would have to fit into GPU memory, and I didn't want to encourage that for people who fork this, given the current API. The other way to implement this would be to have x and y be of shape [N * B, D], but I figured that wasn't worth the extra complexity, since I want to keep lines short here.

WDYT?
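
(To make the trade-off above concrete, a sketch of the two fit() patterns using this PR's config-object signature; allTrainXs and allTrainLabels are hypothetical tensors holding the full training set, while model, batch, and BATCH_SIZE are the demo's own names:)

    // Pattern A: hand fit() the entire dataset and let it slice batches
    // internally. All N examples must fit in GPU memory at once.
    await model.fit({
      x: allTrainXs,      // hypothetical, shape [N, 28, 28, 1]
      y: allTrainLabels,  // hypothetical, shape [N, 10]
      batchSize: BATCH_SIZE,
    });

    // Pattern B (this demo): hand fit() one pre-sliced batch per call,
    // so only BATCH_SIZE examples occupy GPU memory at a time.
    await model.fit({
      x: batch.xs.reshape([BATCH_SIZE, 28, 28, 1]),
      y: batch.labels,
      batchSize: BATCH_SIZE,
    });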


mnist/ui.js, line 18 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

is this unused?

Done.



@nsthorat
Copy link
Author

Review status: 9 of 11 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


mnist/data.js, line 3 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

Google LLC (here and throughout)

Done.


mnist/data.js, line 77 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

just confirming, this line makes the pixels fall in [0, 1], right?

Yeah, do you think I should make it more explicit?
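
(One way to spell it out, as a sketch; PIXEL_MAX is a hypothetical name:)

    // imageData.data holds RGBA bytes in [0, 255]; the image is grayscale,
    // so R == G == B and reading only the red channel (stride 4) suffices.
    const PIXEL_MAX = 255;
    datasetBytesView[j] = imageData.data[j * 4] / PIXEL_MAX;  // now in [0, 1]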


mnist-core/data.js, line 46 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

changed your mind? :) I'm 100% OK either way.

Yeah, I changed my mind :) "fetch" has another meaning, and we use "load" elsewhere (loadModel).



@caisq (Collaborator) commented Mar 14, 2018

Review status: 5 of 18 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.


mnist/index.js, line 76 at r2 (raw file):

batch.xs.dispose();

What's the best practice for choosing tidy() vs. dispose()?



@nsthorat (Author)

Review status: 5 of 18 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.


mnist/index.js, line 76 at r2 (raw file):

Previously, caisq (Shanqing Cai) wrote…

batch.xs.dispose();

What's the best practice for choosing tidy() vs. dispose()?

Basically do whatever makes sense. In this case, I can't use a tidy() around this whole block here because fit() is async and tidy() cannot be used with an asynchronous function.

In general, tidy() works nicely when there's a chain of operations and you want to clean up intermediates, and dispose() works better when there are only one or two Tensors to clean up (like predict / fit).

In tf.data, we can have a way to tell a stream to "dispose the last tensor you created" when you call getNext(), so this goes away for the training loop.
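
(A sketch of the contrast; pred and label below are stand-in tensors, and the dispose() half reuses the demo's model, batch, and BATCH_SIZE names:)

    import * as tf from '@tensorflow/tfjs';

    // tidy(): best for a synchronous chain of ops; the intermediates
    // (diff and its square) are disposed automatically on return.
    const pred = tf.tensor1d([0.9, 0.1]);  // stand-in values
    const label = tf.tensor1d([1, 0]);
    const loss = tf.tidy(() => {
      const diff = pred.sub(label);
      return diff.square().sum();
    });

    // dispose(): best when only a tensor or two needs cleanup around an
    // async call, which tidy() cannot wrap (its callback must be sync).
    await model.fit({x: batch.xs, y: batch.labels, batchSize: BATCH_SIZE});
    batch.xs.dispose();
    batch.labels.dispose();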



@davidsoergel (Member)

Review status: 5 of 18 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


mnist/index.js, line 70 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

batchSize defaults to 32 if you pass nothing, so it's effectively required. Generally speaking, the way you're supposed to use fit is either to pass the entire dataset as x & y Tensors (internally, fit slices randomly to make batches) or to use a "generator" function, which isn't implemented yet and would be the tf.data streams approach.

This means the entire dataset would have to fit into GPU memory, and I didn't want to encourage that for people who fork this, given the current API. The other way to implement this would be to have x and y be of shape [N * B, D], but I figured that wasn't worth the extra complexity, since I want to keep lines short here.

WDYT?

Yep, leave this as-is for now; I was just agreeing that the streaming solution is coming soon anyway :)



@nsthorat requested a review from tafsiri March 14, 2018 14:38
@caisq (Collaborator) commented Mar 14, 2018

:lgtm_strong:


Review status: 5 of 18 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.



@nsthorat merged commit 2491923 into master Mar 14, 2018
@nsthorat deleted the mnist branch March 14, 2018 14:48