Handle inputs with unknown shapes in TensorFlow #857
@@ -191,6 +191,8 @@ private TensorFlowTransform(IHostEnvironment env, byte[] modelBytes, string[] in
Contracts.CheckValue(env, nameof(env));
_host = env.Register(nameof(RegistrationName));
_host.CheckValue(modelBytes, nameof(modelBytes));
_host.CheckNonEmpty(inputs, nameof(inputs));
inputs [](start = 47, length = 6)
outputs as well? #Resolved
resultDic[Transformer.Outputs[i]] = new SchemaShape.Column(Transformer.Outputs[i], SchemaShape.Column.VectorKind.Vector, Transformer.OutputTypes[i].ItemType, false);
{
resultDic[Transformer.Outputs[i]] = new SchemaShape.Column(Transformer.Outputs[i],
Transformer.OutputTypes[i].VectorSize > 0 ? SchemaShape.Column.VectorKind.Vector
.VectorSize > 0 [](start = 46, length = 15)
can we use IsKnownSizeVector? #Resolved
@@ -311,26 +319,56 @@ public Mapper(IHostEnvironment env, TensorFlowTransform parent, ISchema inputSch
_schema = inputSchema;
_inputColIndices = new int[_parent.Inputs.Length];
_isInputVector = new bool[_parent.Inputs.Length];
_fullySpecifiedShapes = new TFShape[_parent.Inputs.Length];
_fullySpecifiedShapes [](start = 16, length = 21)
I feel like this one should be part of the Transformer rather than the Mapper.
You do estimator.Fit(somedata) and get a transformer, and it resolves its variable lengths [?,?,3] as [3,3,3].
I'm not sure it would be right for the transformer to accept data with a shape other than [3,3,3] after that.
(The same probably applies to _isInputVector; not sure why I didn't put it in the Transformer.)
@Zruty0 to make sure I correctly understand the estimator/transformer business.
#Closed
The problem is that when you instantiate the estimator you only have the TF model and not the IDV, so we don't necessarily know all the dimensions in the shape. However, in order to convert a VBuffer to a Tensor we need to know the fully specified shape.
Instead of having this field, I could instantiate a new shape object on every getter call, using _inputColIndices and _schema to figure out the input size. Do you think this is a good solution?
In reply to: 216115594 [](ancestors = 216115594)
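As an illustration of the point above (a partial TF shape can only be completed once the actual input size is known), here is a minimal sketch of one possible strategy. It is not necessarily what this PR does. It assumes the batch dimension has already been resolved, that every remaining unknown dimension takes the same value, and that the input column's element count is available. `ShapeResolutionSketch` and `FillUnknownDims` are hypothetical names.

```csharp
using System;

public static class ShapeResolutionSketch
{
    // Fills every unknown (non-positive) dimension with a single value d such that
    // the product of all dimensions equals elementCount.
    // Assumption: all unknown dimensions share the same value.
    public static long[] FillUnknownDims(long[] partialShape, long elementCount)
    {
        if (partialShape == null) throw new ArgumentNullException(nameof(partialShape));

        long known = 1;
        int unknown = 0;
        foreach (long d in partialShape)
        {
            if (d > 0) known *= d;
            else unknown++;
        }

        if (elementCount <= 0 || elementCount % known != 0)
            throw new ArgumentException("Element count is not compatible with the known dimensions.");

        long target = elementCount / known;
        long fill = 1;
        if (unknown == 0)
        {
            if (target != 1)
                throw new ArgumentException("Element count does not match the fully known shape.");
        }
        else
        {
            // Integer search for d with d^unknown == target (avoids floating-point roots).
            fill = 0;
            for (long candidate = 1; ; candidate++)
            {
                long power = 1;
                for (int k = 0; k < unknown; k++) power *= candidate;
                if (power == target) { fill = candidate; break; }
                if (power > target) break;
            }
            if (fill == 0)
                throw new ArgumentException("No single value fits all unknown dimensions.");
        }

        var result = new long[partialShape.Length];
        for (int i = 0; i < partialShape.Length; i++)
            result[i] = partialShape[i] > 0 ? partialShape[i] : fill;
        return result;
    }
}
```

With the reviewer's example, a shape of [?,?,3] and an input of 27 elements would resolve to [3,3,3] under these assumptions.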
I think I just have an incorrect understanding of estimator/transformer, and we don't have to have the same schema for fitting and for transforming.
In reply to: 216382814 [](ancestors = 216382814,216115594)
@@ -233,7 +240,7 @@ private TensorFlowTransform(IHostEnvironment env, byte[] modelBytes, string[] in
{
var tfOutput = new TFOutput(Graph[Outputs[i]]);
var shape = Graph.GetTensorShape(tfOutput);
int[] dims = shape.ToIntArray().Skip(shape[0] == -1 ? BatchSize : 0).ToArray();
int[] dims = shape.NumDimensions > 0 ? shape.ToIntArray().Skip(shape[0] == -1 ? BatchSize : 0).ToArray() : new[] { 0 };
0 [](start = 131, length = 1)
Does this zero mean variable length? #Closed
getNum(ref buffer);
getClasses(ref buffer);
}
}
do we want to validate anything here ? #Closed
The images I am using here don't have any detections, so all of the outputs will be 0. Once I get the models uploaded, I will also upload some images that have detections; then I can add validation of the outputs.
In reply to: 216457130 [](ancestors = 216457130)
{
ModelFile = model_location,
OutputColumns = new[] { "Softmax", "dense/Relu" },
InputColumns = new[] { "Placeholder", "reshape_input" }
reshape_input [](start = 55, length = 13)
this is specified as an input, but I do not see it when passing in the data. Is this required ?
It is required, this column is created in line 273 by the CopyColumns transform.
This input is required for computing the "dense/Relu" output. Actually, I am not sure why it is needed, since if I understand correctly, reshape_input is computed from the input layer Placeholder by simply reshaping from 28x28 to 784. @zeahmed, do you know why "Placeholder" is not enough to compute "dense/Relu"?
This is a problem with the model: https://github.com/tensorflow/models/blob/master/official/mnist/mnist.py
If you only want to access the features (named dense/Relu), then reshape_input is required; if you want to access the Softmax, then Placeholder is required. If you look closely at the attached model graph, you will see two graphs working in parallel. Having said that, I think it's a good model for testing two inputs and two outputs.
If you think this is going to be an issue, I can update the model.
In reply to: 216485826 [](ancestors = 216485826)
Also, I think we should rename the nodes properly to avoid strange names like dense/Relu etc. :)
In reply to: 216502475 [](ancestors = 216502475,216485826)
for (int j = 0; j < TFInputShapes[i].NumDimensions; j++)
newShape[j] = TFInputShapes[i][j] == -1 ? BatchSize : TFInputShapes[i][j];
TFInputShapes[i] = new TFShape(newShape);
if (TFInputShapes[i].NumDimensions != -1)
if (TFInputShapes[i].NumDimensions != -1) [](start = 16, length = 41)
am curious - why did we need this check ?
Did some of the pre-trained models have TFInputShapes[i].NumDimensions == -1 (that we were not handling before)
#Resolved
If the shape is completely unknown, then its NumDimensions property is -1.
In reply to: 216460739 [](ancestors = 216460739)
makes sense. in another comment i was asking if TF has this documented somewhere -- or we are using this as a heuristic based on models we have played with so far ?
In reply to: 216475056 [](ancestors = 216475056,216460739)
I haven't seen it documented, I just saw it by debugging different models.
In reply to: 216476239 [](ancestors = 216476239,216475056,216460739)
if (TFInputShapes[i].NumDimensions != -1)
{
var newShape = new long[TFInputShapes[i].NumDimensions];
newShape[0] = TFInputShapes[i][0] == -1 ? BatchSize : TFInputShapes[i][0];
newShape[0] = TFInputShapes[i][0] == -1 ? BatchSize : TFInputShapes[i][0]; [](start = 20, length = 74)
so we will have special handling only for the 1st dimension, and not for the other dimensions -- is that the intent ?
(looks like we should have been doing this previously too, instead of doing special handling for all the columns) #Resolved
Yes. This is because when the first dimension is -1 it indicates that the first dimension is the batch size. For any other dimension, it just means the dimension can be anything. For example, if the dimension is [?,?,?,3], then the first ? is for the batch size, and the other two are for the width and height of the image. In this case we don't want to change this to 1, we want to keep it as -1, so that we still need to fill in this value when we see the actual example and know its size.
In reply to: 216461730 [](ancestors = 216461730)
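The rule described above (only a leading -1 is treated as the batch size, and every other -1 stays unknown until a real example is seen) can be sketched as follows. This is a standalone illustration on plain arrays rather than TFShape; `BatchDimSketch` and `ResolveBatchDimension` are hypothetical names, not code from the PR.

```csharp
using System;

public static class BatchDimSketch
{
    // Replaces only a leading -1 with the batch size.
    // Any other -1 entries are left untouched so they can be filled in later.
    public static long[] ResolveBatchDimension(long[] shape, long batchSize)
    {
        if (shape == null) throw new ArgumentNullException(nameof(shape));

        var result = (long[])shape.Clone();
        if (result.Length > 0 && result[0] == -1)
            result[0] = batchSize;
        return result;
    }
}
```

For the [?,?,?,3] example with a batch size of 1, this would give [1,?,?,3], leaving the width and height to be resolved from the actual image.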
@@ -233,7 +241,7 @@ private TensorFlowTransform(IHostEnvironment env, byte[] modelBytes, string[] in
{
var tfOutput = new TFOutput(Graph[Outputs[i]]);
var shape = Graph.GetTensorShape(tfOutput);
int[] dims = shape.ToIntArray().Skip(shape[0] == -1 ? BatchSize : 0).ToArray();
int[] dims = shape.NumDimensions > 0 ? shape.ToIntArray().Skip(shape[0] == -1 ? BatchSize : 0).ToArray() : new[] { 0 };
shape.NumDimensions > 0 [](start = 29, length = 23)
I am presuming we are using this here as a check for models producing variable length outputs...Am i right ?
Does TF document such behaviour somewhere ? #Resolved
If the shape is unknown, it has shape.NumDimensions == -1, and shape.ToIntArray() == null, which would cause a null reference exception when we try to access shape[0].
In reply to: 216462762 [](ancestors = 216462762)
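A minimal sketch of the guard discussed above, written against a plain int[] so it runs on its own. It assumes an unknown shape is represented by a null array (mirroring ToIntArray() returning null) and that the batch size is 1, so a leading -1 is skipped. `OutputDimsSketch` and `GetOutputDims` are hypothetical names.

```csharp
using System.Linq;

public static class OutputDimsSketch
{
    // Returns the per-example output dimensions.
    // A null or empty shape falls back to a single 0 entry, as in the diff above,
    // instead of dereferencing shape[0] and throwing.
    public static int[] GetOutputDims(int[] shapeOrNull)
    {
        if (shapeOrNull == null || shapeOrNull.Length == 0)
            return new[] { 0 };

        // Skip the leading batch dimension only when it is unknown (-1).
        int skip = shapeOrNull[0] == -1 ? 1 : 0;
        return shapeOrNull.Skip(skip).ToArray();
    }
}
```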
@@ -37,13 +37,14 @@ namespace Microsoft.ML.Transforms.TensorFlow
/// </summary>
/// <typeparam name="T[]">.NET type of tensor to create</typeparam>
/// <param name="data">value of tensor</param>
/// <param name="count">The number of elements in the tensor</param>
did we have to re-generate the TensorGeneric.cs file after making these changes ? #Resolved
Yes. Apparently TensorGeneric.tt regenerates TensorGeneric.cs automatically whenever it is saved.
In reply to: 216466572 [](ancestors = 216466572)
for (int i = 0; i < _parent.Outputs.Length; i++)
{
if (activeOutput(i))
{
var type = TFTensor.TypeFromTensorType(_parent.TFOutputTypes[i]);
_host.Assert(type == _parent.OutputTypes[i].ItemType.RawType);
var srcTensorGetters = GetTensorValueGetters(input);
valueGetters.Add(Utils.MarshalInvoke(MakeGetter<int>, type, input, i, srcTensorGetters, activeOutputColNames, outputCache));
valueGetters[i] = Utils.MarshalInvoke(MakeGetter<int>, type, input, i, srcTensorGetters, activeOutputColNames, outputCache);
valueGetters[i] [](start = 24, length = 15)
I'm curious about this bug in the getter. Could you elaborate a bit on it? Is it some artifact of the activeOutput() call above? #Resolved
This bug surfaced when I added the new unit test. It happens in the following situation:
- Create a TensorFlowTransform that computes two outputs, say A and B.
- Create another transform/learner that only uses B as input.
- When we try to cursor over the data, the activeOutput predicate will say that output 0 is not active and output 1 is active, thus returning an array of length 1.
- When we try to get the getter of column B, we do so using its index, which is 1 (the index of column A is 0 and the index of column B is 1). So we try to access the getters array which is of length 1, at index 1 which is out of bounds...
The fix was to always create an array with length equal to the number of output columns in the transform, but populate just the indices where the active columns are.
In reply to: 216469314 [](ancestors = 216469314)
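The fix described above can be sketched in isolation: allocate one slot per output column and populate only the active ones, so that indexing by column position stays in bounds. The getters here are simple stand-ins that return the column name; `GetterArraySketch` and `CreateGetters` are hypothetical names, not the transform's actual code.

```csharp
using System;

public static class GetterArraySketch
{
    // Builds an array with one slot per output column.
    // Inactive columns keep a null getter; active columns are stored at their own index.
    public static Func<string>[] CreateGetters(string[] outputNames, Func<int, bool> activeOutput)
    {
        var getters = new Func<string>[outputNames.Length];
        for (int i = 0; i < outputNames.Length; i++)
        {
            if (activeOutput(i))
            {
                string name = outputNames[i]; // capture a copy for the closure
                getters[i] = () => name;
            }
        }
        return getters;
    }
}
```

With outputs A and B and only B active, the array has length 2, slot 0 is null, and slot 1 holds B's getter, so looking up column B at index 1 no longer goes out of bounds.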
…o we don't need a separate nuget for it to work.
{
return SetupTensor(dt, dims, data, start: 0, count: data.Length, size: size);
return SetupTensor(dt, dims, data, start: 0, count: count, size: size);
start: 0, count: count, size: size [](start = 47, length = 34)
nit: do you need to specify the param names here? Aren't you already invoking the function with all of its params?
This PR adds support for unknown shapes in the inputs and outputs of the TensorFlow transform.
Closes #848.