Add initializers #116


Merged: 58 commits merged into tensorflow:master on Oct 8, 2020

Conversation

@JimClarke5 (Contributor)

This PR adds initializers to tensorflow-framework. The classes inherit from BaseInitializer, which implements Initializer; Initializer is also defined as a functional interface so that an initializer can be written as a lambda. Unit test cases are also provided.

I have also included utils.ShapeUtils. Some methods in ShapeUtils,

public static boolean isCompatibleWith(Shape a, Shape b)
public static Shape reduce(Shape shape, int axis)
public static boolean isUnknownShape(Shape a)

might be candidates for org.tensorflow.ndarray.Shape itself. The other methods bridge Operands to org.tensorflow.ndarray.Shape, so they probably don't belong in the ndarray module.

This PR is not dependent on any other PR.
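
Since Initializer is a functional interface, an initializer can be supplied inline as a lambda. Below is a minimal sketch of that idea; the interface name and call(...) signature are assumptions for illustration, not the PR's actual definitions:

import org.tensorflow.Operand;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TFloat32;
import org.tensorflow.types.TInt32;
import org.tensorflow.types.family.TType;

// Hypothetical stand-in for the PR's Initializer interface.
@FunctionalInterface
interface InitializerSketch<T extends TType> {
  Operand<T> call(Ops tf, Operand<TInt32> dims);
}

class InitializerDemo {
  // A single abstract method means a lambda suffices:
  static final InitializerSketch<TFloat32> ONES =
      (tf, dims) -> tf.fill(dims, tf.constant(1.0f));
}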

…hod. This allows the NAME to be used elsewhere instead of hardcoding the string.

added methods isFloating(), isInteger(), isNumeric(), isBoolean() and isString()
…EntropyWithLogits()

Added tf.nn.sparseSoftmaxCrossEntropyWithLogits() and
tf.nn.raw.sparseSoftmaxCrossEntropyWithLogits()

Added tf.nn.sigmoidCrossEntropyWithLogits()
…yWithLogits.java to nn.raw,

added new versions of these to NnOps
…x JavaDoc. Change from snake case to camel case.
…Java files for some Ops. This also resulted in new generated sources that are also committed.
…gmoidCrossEntropyWithLogits.java, and SparseSoftmaxCrossEntropyWithLogits.java under package org.tensorflow.op.nn in
…asy inclusion of a default optimizer. Cleaned up JavaDoc
@Craigacp (Collaborator) left a comment

The seeds should be primitives, not boxed primitives; that will let you elide the conditional logic. Either that, or document that if the seed is null it defaults to zero (but I'd prefer them to be primitives). Other than that and a missing bit of Javadoc in VarianceScaling, this looks good and is ready to be merged.

for (; i < dimsShape.numDimensions() - 1; i++) num_rows *= dimsShape.size(i);
long num_cols = dimsShape.size(i);
Shape flat_shape = Shape.of(Math.max(num_rows, num_cols), Math.min(num_rows, num_cols));
long lseed = this.seed == null ? 0L : this.seed;
Collaborator, commenting on the snippet above:

This isn't necessary if we make seed a long, not a Long. I think we should prevent people from passing nulls to the constructor.
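
A minimal sketch of the suggestion, using a hypothetical simplified class; with a primitive long, the null-default line quoted above becomes unnecessary:

// Hypothetical, simplified outline (not the PR's actual VarianceScaling class):
class SeededInitializerSketch {
  private final long seed; // primitive: cannot be null, so no default-to-zero branch

  SeededInitializerSketch(long seed) {
    this.seed = seed; // callers must pass an explicit seed
  }
}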

Craigacp previously approved these changes Oct 2, 2020
@Craigacp (Collaborator) left a comment

LGTM, thanks Jim.

}
boolean isSquare = shape.size(0) == shape.size(1);
long diag_size = Math.min(shape.size(0), shape.size(1));
Shape diagShape = Shape.of(diag_size);
@karllessard (Collaborator) commented Oct 2, 2020

There are still a few variables like this in the PR using snake_case instead of camelCase.
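
For instance, the quoted snippet above would become:

boolean isSquare = shape.size(0) == shape.size(1);
long diagSize = Math.min(shape.size(0), shape.size(1));
Shape diagShape = Shape.of(diagSize);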

@karllessard (Collaborator) left a comment

I think we all agree that, in general, this PR is ready to be merged. Up to you, @JimClarke5, whether to address the small details we've mentioned now or later (I do like the idea of moving isCompatibleWith to the ndarray library).

@karllessard (Collaborator) commented Oct 2, 2020

Another thing: I have some concerns about the naming of some classes, like Constant, which can conflict with the Constant class in the core API, forcing users to spell out canonical names when both types are present in the same file.

I understand, though, that if we prefix/suffix this class, we should do it for all of our initializers as well (and what about our current optimizers?). So maybe we should just find another name for these?

Note that we are never protected from conflicts with the ops classes, since most of them are generated. For example, one day a Ones kernel might appear as well...

@JimClarke5 (Contributor, Author)

I have fixed the snake_case issues.

The names are the same ones used in Keras. IMO, most users will use tf.constant() and not the ops Constant class directly.

@Craigacp (Collaborator) commented Oct 2, 2020

> I have fixed the snake_case issues.
>
> The names are the same ones used in Keras. IMO, most users will use tf.constant() and not the ops Constant class directly.

Yeah, but if they assign tf.constant() to a variable, they'll need to bring the Constant type into scope unless they're on Java 10+ and use var. So it'll conflict.
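
A sketch of the clash being described; the framework package for the new initializer is an assumption for illustration:

import org.tensorflow.op.Ops;
import org.tensorflow.op.core.Constant;
import org.tensorflow.types.TFloat32;
// import org.tensorflow.framework.initializers.Constant; // assumed package; would clash with the import above

class ConflictDemo {
  void demo(Ops tf) {
    Constant<TFloat32> c = tf.constant(1.0f); // requires the op's Constant type in scope
    var v = tf.constant(1.0f);                // Java 10+: var sidesteps the import entirely
  }
}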

@JimClarke5 (Contributor, Author) commented Oct 2, 2020

It's beside the point, but Shape already conflicts with org.tensorflow.op.core.Shape and org.tensorflow.ndarray.Shape.

If you want, I can rename them all to xxxxxInitializer.

@JimClarke5 (Contributor, Author)

I have moved isCompatibleWith to the ndarray Shape class and added a test for it in ShapeTest.
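
A quick usage sketch of the moved method; the semantics shown assume the usual TensorFlow compatibility rule that an unknown dimension matches any size:

import org.tensorflow.ndarray.Shape;

class ShapeCompatDemo {
  public static void main(String[] args) {
    Shape partial = Shape.of(2, Shape.UNKNOWN_SIZE); // second dimension unknown
    Shape full = Shape.of(2, 3);
    System.out.println(partial.isCompatibleWith(full));        // expected: true
    System.out.println(full.isCompatibleWith(Shape.of(4, 3))); // expected: false
  }
}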

@karllessard (Collaborator)

> I have fixed the snake_case issues.
> The names are the same ones used in Keras. IMO, most users will use tf.constant() and not the ops Constant class directly.
>
> Yeah, but if they assign tf.constant() to a variable, they'll need to bring the Constant type into scope unless they're on Java 10+ and use var. So it'll conflict.

Exactly the case I had in mind as well. They can just keep a reference to Operand and it will be fine, but I personally like to preserve the original type of an op as I add it to the graph. But like you said, on Java 10+ and in Kotlin it will be less frequent. So let's keep the current naming strategy for now and review it later if needed.

@karllessard (Collaborator) commented Oct 3, 2020

So this PR is good to go, thanks @JimClarke5! Please resolve the conflict raised in Shape and I'll merge it after the release of 0.2.0 (which I plan to do this weekend).

@JimClarke5 (Contributor, Author)

The conflict in Shape had to do with JavaDoc indenting. As I am using the Google Java formatter in IntelliJ, I resolved the conflict with my version.

Did we want to rename Constant to ConstantInitializer, Zeros to ZerosInitializer, and Ones to OnesInitializer?
I noticed that in TF Python 2.3.1 there are now initializer methods defined in init_ops_v2.py directly under the tensorflow/python/ops directory (outside of Keras), called tf.constant_initializer, presumably to avoid the conflict with tf.constant, which is similar to our predicament. They also have tf.ones_initializer, tf.zeros_initializer, among others. They use the same class names as we currently have, but the method aliases are all in the form tf.xxxx_initializer.

@karllessard (Collaborator)

If we do it for the initializers, we should do it for the optimizers as well, for consistency. That is why I think this change can be out of scope for this PR and we can do it later, wdyt?

@JimClarke5 (Contributor, Author)

I vote we leave it as is for now. We still have losses and metrics to go, so by then we can take a holistic view of how to handle this issue.

@karllessard (Collaborator)

@JimClarke5, I'm sorry to tell you this, but it looks like your PR has new conflicts since I updated the master branch for the next development iteration. Can you please fix them so we can merge it?

@karllessard merged commit a85bcfb into tensorflow:master on Oct 8, 2020
@karllessard (Collaborator)

Ok, that's fine: it just had trouble rebasing, but a simple merge did the trick. Thanks for your great work!

@JimClarke5 (Contributor, Author) commented Oct 8, 2020

So I don’t have to fix anything, right?

@karllessard (Collaborator)

Nope, it's already merged! I had to make a few quick fixes in the Javadoc, though, to be able to deploy the new snapshot. We should enable these lint checks locally so a developer can catch the errors upfront; there is a story about it: #7

@JimClarke5 deleted the Initializers1 branch October 8, 2020 22:35