System.InvalidOperationException when loading tensorflow model #3689

carlosefrias · 2019-05-09T09:29:12Z

System information

OS version/distro:
Windows 10
.NET Version (eg., dotnet --info):
Visual Studio 2017
.NET Core 2.1
Microsoft.ML v1.0.0 NuGet package

Issue

What did you do?
Based on the example at https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ImageClassification_TensorFlow I tried to load a TensorFlow image segmentation model into Microsoft.ML.
The model itself was created with Keras and then converted to TensorFlow using an adapted version of https://github.com/amir-abdi/keras_to_tensorflow.
What happened?

A System.InvalidOperationException is thrown when calling the LoadTensorFlowModel function.

What did you expect?
I expected the model to be loaded.

Source code / logs

at Microsoft.ML.Transforms.TensorFlow.TensorFlowUtils.LoadTFSession(IExceptionContext ectx, Byte[] modelBytes, String modelFile)
at Microsoft.ML.TensorflowCatalog.LoadTensorFlowModel(ModelOperationsCatalog catalog, String modelLocation)
at ImageClassification.Score.ModelScorer.TFModelScorer.LoadModel(String dataLocation, String imagesFolder, String modelLocation) in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\ModelScorer\TFModelScorer.cs:line 67
at ImageClassification.Score.ModelScorer.TFModelScorer.Score() in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\ModelScorer\TFModelScorer.cs:line 50
at ImageClassification.Program.Main() in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\Program.cs:line 27

Message
TensorFlow exception triggered while loading model from '../../../assets/inputs/final.pb'


abgoswam · 2019-05-13T18:17:57Z

Hi @carlosefrias, could you point us to the frozen model so we can reproduce this on our end? Could you also provide the sample data and code you are using?

abgoswam · 2019-05-22T15:47:41Z

@carlosefrias, I am closing this since there was no response. Please reopen (with sample data, code, and model) if you are still facing this issue.

baruchiro · 2019-07-08T15:57:21Z

Hi, I'm using the sample here with this code to create the .pb file:

import tensorflow as tf

f_size = 15 # Number of features passed from ML.Net
num_output = 2 # Number of outputs
tf.set_random_seed(1)
X = tf.placeholder('float', [None, f_size], name="X")
Y = tf.placeholder('float', [None, num_output], name="Y")
lr = tf.placeholder(tf.float32, name = "learning_rate")


# Set model weights
W = tf.Variable(tf.random_normal([f_size,num_output], stddev=0.1), name = 'W')
b = tf.Variable(tf.zeros([num_output]), name = 'b')

l1 = 0
l2 = 0
RegScores = tf.add(tf.matmul(X, W), b, name='RegScores')
loss = tf.reduce_mean(tf.square(Y-tf.squeeze(RegScores))) / 2  + l2 * tf.nn.l2_loss(W) + l1 * tf.reduce_sum(tf.abs(W))
loss = tf.identity(loss, name="Loss")
optimizer = tf.train.MomentumOptimizer(lr, momentum=0.9, name='MomentumOptimizer').minimize(loss)

init = tf.global_variables_initializer()
# Launch the graph.
with tf.Session() as sess:
    sess.run(init)
    tf.saved_model.simple_save(sess, r'NYCTaxi/model', inputs={'X': X, 'Y': Y}, outputs={'RegScores': RegScores} )

And I get the error:

System.InvalidOperationException : TensorFlow exception triggered while loading model from 'Resources/saved_model.pb'

I think the real issue here is that more information should be provided when TensorFlow fails, rather than the failure itself.
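One possible explanation, which nothing in this thread confirms: tf.saved_model.simple_save writes a SavedModel directory (saved_model.pb plus a variables/ folder), and saved_model.pb contains a SavedModel message rather than a bare GraphDef. If the loader treats a file path as a frozen graph (an assumption about ML.NET's behaviour, not something verified here), pointing it at saved_model.pb itself could fail during graph import, and passing the containing directory may be worth trying. The sketch below is a rough, stdlib-only heuristic for telling the two formats apart. classify_model_path is a hypothetical helper, and it relies on the assumption that the first protobuf tag byte is 0x08 for a SavedModel (schema version, field 1, varint) and 0x0A for a GraphDef (node, field 1, length-delimited).

```python
import os


def classify_model_path(path):
    """Guess what kind of TensorFlow model a path points to.

    Heuristic only: it looks at the directory layout or the first
    protobuf tag byte and does not parse or validate the model.
    """
    if os.path.isdir(path):
        # A SavedModel export directory contains saved_model.pb.
        has_pb = os.path.isfile(os.path.join(path, "saved_model.pb"))
        return "saved_model_dir" if has_pb else "unknown"

    with open(path, "rb") as f:
        first = f.read(1)

    if first == b"\x08":
        # Field 1 as varint: assumed to be SavedModel.saved_model_schema_version.
        return "saved_model_file"
    if first == b"\x0a":
        # Field 1 as length-delimited: assumed to be GraphDef.node.
        return "frozen_graph"
    return "unknown"
```

For the export above, classify_model_path(r'NYCTaxi/model') would be expected to report a SavedModel directory, while the saved_model.pb file inside it would be flagged as a SavedModel file rather than a frozen graph.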

TannerGilbert · 2019-07-09T11:00:21Z

I have the same issue. My code works when I'm using a pretrained mobilenet but fails when I try to run it with my own model.
{"TensorFlow exception triggered while loading model from 'xyz\\bin\\Debug\\netcoreapp2.1\\../../../assets\\inputs\\model\\model.pb'"}

To train the custom model I'm using:

from keras.applications.mobilenet import MobileNet
from keras.preprocessing import image
from keras.models import Model, load_model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import tensorflow as tf
import numpy as np
import pandas as pd
from PIL import Image
import argparse


def data_gen(df, num_classes, batch_size=32, input_shape=(224, 224, 3)):
    """ Load in image data"""
    while True:
        idx = np.random.choice(a=np.arange(len(df['ImgPath'])), size=batch_size)
        batch_paths = df['ImgPath'][idx]
        images = []
        for img_path in batch_paths:
            image = Image.open(str(img_path))
            image = image.resize(input_shape[0:2], Image.ANTIALIAS)
            if input_shape[2] == 1:
                # 'L' is single-channel grayscale; 'LA' adds an alpha channel and would break the reshape below.
                image = image.convert('L')
            image = np.asarray(image)
            images.append(image)
        images = np.array(images)
        images = images.reshape(len(images), input_shape[0], input_shape[1], input_shape[2])
        labels = np.array(df['VG'][idx])
        
        labels = to_categorical(labels, num_classes=num_classes)
        yield (images, labels)


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
        Freezes the state of a session into a pruned computation graph.

        Creates a new computation graph where variable nodes are replaced by
        constants taking their current value in the session. The new graph will be
        pruned so subgraphs that are not necessary to compute the requested
        outputs are removed.
        @param session The TensorFlow session to be frozen.
        @param keep_var_names A list of variable names that should not be frozen,
                            or None to freeze all the variables in the graph.
        @param output_names Names of the relevant graph outputs.
        @param clear_devices Remove the device directives from the graph for better portability.
        @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ''
        frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph


def create_model(num_classes, compile=True):
    base_model = MobileNet(weights='imagenet', include_top=False)

    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    x = Dense(1024, activation='relu')(x)

    predictions = Dense(num_classes, activation='softmax')(x)

    model = Model(base_model.input, predictions)

    if compile:
        model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    print(model.summary())

    return model


def get_model(filepath, num_classes):
    try:
        model = load_model(filepath)
        if len(model.predict(np.zeros((1, 224, 224, 3)))[0]) != num_classes:
            print('Replacing output layer')
            output = Dense(num_classes, activation='softmax', name='dense_2')(model.layers[-2].output)
            model = Model(model.input, output)
        model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
        print(model.summary())
        return model
    except Exception as e:
        print(e)
        print('Wrong model path. Creating new model.')
        model = create_model(num_classes)
        return model


def train_model(model, filepath, epochs, batch_size, num_classes, saving_directory, data_quality):
    #Prepare data
    df = pd.read_csv(filepath)
    df.dropna(inplace=True)
    df = df[(df['Q']>data_quality)]
    df.reset_index(drop=True, inplace=True)
    df['VG'] = df['VG'] - 1
    df = df[:50]

    # Training
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        checkpoint = ModelCheckpoint(saving_directory + 'model-{epoch:04d}.h5', monitor='loss', verbose=1, save_best_only=True)
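        # Whole number of batches per epoch, rounded up (np is already imported above).
        steps_per_epoch = int(np.ceil(len(df) / batch_size))
        model.fit_generator(data_gen(df, num_classes, batch_size=batch_size, input_shape=(224, 224, 3)), epochs=epochs, steps_per_epoch=steps_per_epoch, callbacks=[checkpoint])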

        print([out.op.name for out in model.outputs])
        
        frozen_graph = freeze_session(tf.keras.backend.get_session(), output_names=[out.op.name for out in model.outputs])

        tf.train.write_graph(frozen_graph, "./", "model.pb", as_text=False)
  


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Train Vehicle Classification Network')
    parser.add_argument('-f', '--filename', type=str, required=True, help='Path to data csv file')
    parser.add_argument('-m', '--model_path', type=str, default='', help='Path to model file (h5)')
    parser.add_argument('-e', '--epochs', type=int, default=10, help='Number of epochs')
    parser.add_argument('-b', '--batch_size', type=int, default=32, help='Batch Size')
    parser.add_argument('-sd', '--saving_directory', type=str, default='models/', help='Model saving directory')
    parser.add_argument('-nc', '--num_classes', type=int, default=7, help='Number of classes')
    parser.add_argument('-q', '--data_quality', type=int, default=10, help='Min Q value')
    args = parser.parse_args()
    if args.model_path:
        model = get_model(args.model_path, args.num_classes)
    else:
        model = create_model(args.num_classes)
    train_model(model, args.filename, args.epochs, args.batch_size, args.num_classes, args.saving_directory, args.data_quality)
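When a frozen graph loads in one runtime but not another, one thing worth ruling out is that the graph contains op types the second runtime does not register. The thread does not establish that as the cause here, so treat this as a diagnostic idea only. As a minimal sketch, the stdlib-only code below walks the protobuf wire format of a binary .pb file and counts op types, assuming GraphDef.node is field 1 and NodeDef.op is field 2. The helper names (read_varint, iter_fields, count_op_types) are hypothetical; when TensorFlow is installed, parsing the file with tf.GraphDef().ParseFromString is the more robust route.

```python
from collections import Counter


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, value) for each field of one message."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field, wire = tag >> 3, tag & 0x7
        if wire == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire == 2:  # length-delimited (strings, nested messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire)
        yield field, wire, value


def count_op_types(graph_def_bytes):
    """Count op types in a serialized GraphDef (assumed field layout)."""
    counts = Counter()
    for field, wire, node in iter_fields(graph_def_bytes):
        if field == 1 and wire == 2:  # assumed: GraphDef.node
            for nfield, nwire, value in iter_fields(node):
                if nfield == 2 and nwire == 2:  # assumed: NodeDef.op
                    counts[value.decode("utf-8")] += 1
    return counts
```

Reading model.pb in binary mode and passing the bytes to count_op_types gives a list of op types that can be compared against what the target TensorFlow runtime supports.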

glebuk added bug Something isn't working ❤ Community labels May 9, 2019

glebuk assigned abgoswam May 13, 2019

abgoswam closed this as completed May 22, 2019

eerhardt mentioned this issue Jul 8, 2019

Change default # of iterations in Averaged Perceptron to 10 #2305

Closed

ghost locked as resolved and limited conversation to collaborators Mar 21, 2022
