Add MNIST training demo with layers API. #8
Conversation
Reviewed 11 of 11 files at r1.

mnist/data.js, line 77 at r1 (raw file):
Just confirming: this line makes the pixels be between [0, 1], right?

mnist/index.js, line 54 at r1 (raw file):
This will need to be updated once optimizers merge, right?

mnist/index.js, line 70 at r1 (raw file):
Why does it need to know the batchSize? Is model.fit smart enough to do its own batching, given the whole dataset? If so, I see why we don't want to pass the whole dataset yet. Ideally we want to give it a stream, and have model.fit dispose older batches as it consumes new ones. It will all come together with tf.data.

mnist/ui.js, line 18 at r1 (raw file):
Is this unused?

mnist-core/data.js, line 46 at r1 (raw file):
Changed your mind? :) I'm 100% OK either way.
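The [0, 1] normalization being confirmed above can be sketched in plain JavaScript. This is an illustration of the idea, not the demo's actual data.js code; the function name is invented:

```javascript
// Illustrative sketch: convert uint8 MNIST pixel values (0-255) into
// floats in the range [0, 1] by dividing by 255.
function normalizePixels(bytes) {
  const floats = new Float32Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    floats[i] = bytes[i] / 255;  // 0 -> 0.0, 255 -> 1.0
  }
  return floats;
}

// Example: three pixel values at the extremes and the middle.
const normalized = normalizePixels(new Uint8Array([0, 128, 255]));
console.log(Array.from(normalized));  // [0, ~0.502, 1]
```

Dividing by 255 (rather than, say, standardizing to zero mean) keeps the transform trivially invertible, which is convenient for visualizing inputs later.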
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.

mnist/data.js, line 3 at r1 (raw file):
Google LLC (here and throughout)

mnist/index.js, line 70 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Model.fit() will indeed do its own batching and training loop, but since you already have a different batching approach here, this is fine as is, too. +1, tfjs-data will solve all this. I already have the core API version of this demo working that way (i.e., it replaced data.js).
Reviewed 11 of 11 files at r1.
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.

mnist/index.js, line 54 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Yeah, once Shanqing's change makes it through our npm build and into tfjs-examples :) Added a comment.

mnist/index.js, line 70 at r1 (raw file): Previously, davidsoergel (David Soergel) wrote…
batchSize defaults to 32 if you pass nothing, so it's effectively required. Generally speaking, the way you're supposed to use fit() is to pass the entire dataset as x & y. That means the entire dataset would have to fit into GPU memory, and I didn't want to encourage that for people who fork this, given the current API. The other way to implement this would be to have x, y be of shape [N * B, D], but I figured that wasn't worth the extra complexity, since I want to keep lines short here. WDYT?

mnist/ui.js, line 18 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Done.
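The [N * B, D] alternative mentioned above (one flat array holding N batches of B examples with D features each, sliced per batch) could be sketched like this. The helper name is illustrative, not code from the demo:

```javascript
// Illustrative sketch: slice batch number `batchIndex` out of a flat
// [N * B, D] dataset stored row-major in a single typed array.
// `subarray` returns a view, so no pixel data is copied.
function sliceBatch(data, batchSize, featureDim, batchIndex) {
  const start = batchIndex * batchSize * featureDim;
  return data.subarray(start, start + batchSize * featureDim);
}

// 4 examples of 2 features each, batchSize 2 -> 2 batches.
const data = new Float32Array([1, 2, 3, 4, 5, 6, 7, 8]);
const batch1 = sliceBatch(data, 2, 2, 1);
console.log(Array.from(batch1));  // [5, 6, 7, 8]
```

The trade-off discussed in the thread: this keeps the whole dataset in one allocation (simple indexing, but everything must fit in memory at once), whereas per-batch fetching keeps memory bounded at the cost of extra bookkeeping.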
Review status: 9 of 11 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.

mnist/data.js, line 3 at r1 (raw file): Previously, davidsoergel (David Soergel) wrote…
Done.

mnist/data.js, line 77 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Yeah. Do you think I should make it more explicit?

mnist-core/data.js, line 46 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Yeah, I changed my mind :) "fetch" has another meaning, and we use "load" elsewhere (loadModel).
Review status: 5 of 18 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.

mnist/index.js, line 76 at r2 (raw file):
What's the best practice for choosing tidy() vs. dispose()?
Review status: 5 of 18 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.

mnist/index.js, line 76 at r2 (raw file): Previously, caisq (Shanqing Cai) wrote…
Basically, do whatever makes sense. In this case, I can't use a tidy() around this whole block, because fit() is async and tidy() cannot be used with an asynchronous function. In general, tidy() works nicely when there's a chain of operations and you want to clean up the intermediates, while dispose() works better when there are only one or two tensors to clean up. In tf.data, we can have a way to tell a stream to "dispose the last tensor you created" when you call getNext(), so this goes away for the training loop.
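The tidy()/dispose() trade-off described above can be illustrated with a toy resource tracker. This is NOT tfjs's actual implementation; makeTensor, the live set, and this tidy() are invented purely to show the pattern:

```javascript
// Toy model of the tidy()/dispose() pattern -- not real tfjs code.
// Every "tensor" registers itself in `live`; tidy() disposes everything
// allocated inside its callback except the returned value.
let live = new Set();

function makeTensor(value) {
  const t = {value, dispose: () => live.delete(t)};
  live.add(t);
  return t;
}

function tidy(fn) {
  const before = new Set(live);
  const result = fn();  // must be synchronous: an async fn would return
                        // before its allocations happen, so they'd leak
  for (const t of live) {
    if (!before.has(t) && t !== result) t.dispose();
  }
  return result;
}

// Chain of ops: the intermediate is cleaned up automatically.
const out = tidy(() => {
  const a = makeTensor(1);
  const b = makeTensor(a.value + 1);  // intermediate, auto-disposed
  return makeTensor(b.value * 2);
});
console.log(live.size);  // 1 -- only the returned tensor survives
out.dispose();           // one leftover tensor: manual dispose() is simplest
console.log(live.size);  // 0
```

The synchronous-callback comment mirrors the point made above about fit(): wrapping an async call in tidy() cannot work, because the tracking window closes before the async work allocates anything.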
Review status: 5 of 18 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.

mnist/index.js, line 70 at r1 (raw file): Previously, nsthorat (Nikhil Thorat) wrote…
Yep, leave this as is for now; I was just agreeing that the streaming solution is coming soon anyway :)
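The streaming behavior anticipated in this thread (a getNext() that disposes the previously returned batch as it hands out the next one) could look roughly like this. The names and shape of the API are hypothetical, not the eventual tf.data interface:

```javascript
// Hypothetical sketch of a batch stream whose getNext() frees the
// previously returned batch, so at most one batch is "live" at a time.
function makeBatchStream(batches) {
  let i = 0;
  let previous = null;
  return {
    getNext() {
      if (previous !== null) previous.disposed = true;  // free old batch
      if (i >= batches.length) return null;  // stream exhausted
      previous = {data: batches[i++], disposed: false};
      return previous;
    }
  };
}

const stream = makeBatchStream([[1, 2], [3, 4]]);
const first = stream.getNext();
const second = stream.getNext();
console.log(first.disposed, second.disposed);  // true false
```

A training loop consuming such a stream would never hold more than one batch of tensors, which is exactly the memory property the whole-dataset fit() call lacks.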
Review status: 5 of 18 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.
This is the same model as the mnist-core demo.