System.InvalidOperationException when loading tensorflow model #3689

Closed
carlosefrias opened this issue May 9, 2019 · 4 comments
Labels
bug Something isn't working

Comments

@carlosefrias

System information

  • OS version/distro:
    Windows 10
  • .NET Version (eg., dotnet --info):
    Visual Studio 2017
    .NET Core 2.1
    Microsoft.ML v1.0.0 NuGet package

Issue

A System.InvalidOperationException is thrown when calling the LoadTensorFlowModel function.

  • What did you expect?
    Expected the model to be loaded.

Source code / logs

at Microsoft.ML.Transforms.TensorFlow.TensorFlowUtils.LoadTFSession(IExceptionContext ectx, Byte[] modelBytes, String modelFile)
at Microsoft.ML.TensorflowCatalog.LoadTensorFlowModel(ModelOperationsCatalog catalog, String modelLocation)
at ImageClassification.Score.ModelScorer.TFModelScorer.LoadModel(String dataLocation, String imagesFolder, String modelLocation) in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\ModelScorer\TFModelScorer.cs:line 67
at ImageClassification.Score.ModelScorer.TFModelScorer.Score() in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\ModelScorer\TFModelScorer.cs:line 50
at ImageClassification.Program.Main() in C:\Users\me1cme\repos\ml.net-learning\samples\csharp\getting-started\DeepLearning_ImageClassification_TensorFlow\ImageClassification\Program.cs:line 27

Message
TensorFlow exception triggered while loading model from '../../../assets/inputs/final.pb'

Please paste or attach the code or logs or traces that would be helpful to diagnose the issue you are reporting.
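
The message above reports a relative model path, and a relative path is resolved against the working directory of the running process. As a minimal sketch (not part of ML.NET), the following Python helper shows one way to check what such a path resolves to and whether a file is actually there before attempting to load it. The function name `describe_model_path` and its `working_dir` parameter are hypothetical and exist only for illustration.

```python
import os


def describe_model_path(model_location, working_dir=None):
    """Resolve a possibly relative model path and report basic facts about it.

    Hypothetical helper for illustration; it only inspects the file system
    and does not try to parse or load the model.
    """
    # Relative paths are interpreted against the working directory.
    base = working_dir if working_dir is not None else os.getcwd()
    resolved = os.path.normpath(os.path.join(base, model_location))
    exists = os.path.isfile(resolved)
    size = os.path.getsize(resolved) if exists else 0
    return {"resolved": resolved, "exists": exists, "size_bytes": size}


if __name__ == "__main__":
    # The relative path quoted in the exception message above.
    print(describe_model_path("../../../assets/inputs/final.pb"))
```

If `exists` is false or `size_bytes` is zero, the problem is with the path or the file rather than with the model contents. Whether that applies to the report above cannot be determined from this thread.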

@glebuk added the bug (Something isn't working) and ❤ Community labels on May 9, 2019
@abgoswam
Member

Hi @carlosefrias, could you point us to the frozen model so we can reproduce this on our end? It would also help if you could provide the sample data and code that you are using.

@abgoswam
Member

@carlosefrias, I am closing this since there was no response. Please re-open (with sample data/code/model) if you are still facing this issue.

@baruchiro

Hi, I'm using the sample here with this code to create the .pb file:

import tensorflow as tf

f_size = 15 # Number of features passed from ML.Net
num_output = 2 # Number of outputs
tf.set_random_seed(1)
X = tf.placeholder('float', [None, f_size], name="X")
Y = tf.placeholder('float', [None, num_output], name="Y")
lr = tf.placeholder(tf.float32, name = "learning_rate")


# Set model weights
W = tf.Variable(tf.random_normal([f_size,num_output], stddev=0.1), name = 'W')
b = tf.Variable(tf.zeros([num_output]), name = 'b')

l1 = 0
l2 = 0
RegScores = tf.add(tf.matmul(X, W), b, name='RegScores')
loss = tf.reduce_mean(tf.square(Y-tf.squeeze(RegScores))) / 2  + l2 * tf.nn.l2_loss(W) + l1 * tf.reduce_sum(tf.abs(W))
loss = tf.identity(loss, name="Loss")
optimizer = tf.train.MomentumOptimizer(lr, momentum=0.9, name='MomentumOptimizer').minimize(loss)

init = tf.global_variables_initializer()
# Launch the graph.
with tf.Session() as sess:
    sess.run(init)
    tf.saved_model.simple_save(sess, r'NYCTaxi/model', inputs={'X': X, 'Y': Y}, outputs={'RegScores': RegScores} )

And I get the error:

System.InvalidOperationException : TensorFlow exception triggered while loading model from 'Resources/saved_model.pb'

I think this issue is really about ML.NET providing more information when TensorFlow fails to load a model, not about the underlying problem itself.
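
Since the exception carries no detail, one piece of information that can be recovered by hand is what kind of `.pb` file is being loaded. The script above saves with `tf.saved_model.simple_save`, which writes a SavedModel directory whose `saved_model.pb` holds a SavedModel message rather than a bare GraphDef. The sketch below is a rough heuristic, assuming the field numbering in TensorFlow's `graph.proto` (GraphDef starts with repeated `node`, field 1, length-delimited) and `saved_model.proto` (SavedModel starts with `saved_model_schema_version`, field 1, varint). It looks only at the first byte and does not parse the file. The helper names are hypothetical and belong to neither TensorFlow nor ML.NET, and this thread does not confirm that the file kind is the cause of the error.

```python
def guess_pb_kind(first_bytes):
    """Guess the top-level protobuf message from the first tag byte.

    Heuristic only: it inspects a single byte and can be wrong.
    """
    if not first_bytes:
        return "empty"
    tag = first_bytes[0]
    # A protobuf tag byte packs the field number and the wire type.
    field_number, wire_type = tag >> 3, tag & 0x07
    if field_number == 1 and wire_type == 0:
        # Field 1 as a varint: consistent with SavedModel's schema version.
        return "saved_model"
    if field_number == 1 and wire_type == 2:
        # Field 1 as length-delimited: consistent with GraphDef's node list.
        return "graph_def"
    return "unknown"


def guess_pb_kind_from_file(path):
    """Read the first byte of a file and apply the same heuristic."""
    with open(path, "rb") as f:
        return guess_pb_kind(f.read(1))
```

A result of `saved_model` suggests the file came from a SavedModel export, while `graph_def` suggests a frozen or plain graph. Anything else, including a text-format graph, is reported as `unknown`.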

@TannerGilbert

TannerGilbert commented Jul 9, 2019

I have the same issue. My code works when I'm using a pretrained MobileNet but fails when I try to run it with my own model.
{"TensorFlow exception triggered while loading model from 'xyz\\bin\\Debug\\netcoreapp2.1\\../../../assets\\inputs\\model\\model.pb'"}
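
The path in this message mixes backslashes, forward slashes, and `..` segments, which makes it hard to see which file is meant. As a small sketch, Python's standard `ntpath` module can collapse such a Windows-style path on any platform. The string below is the path from the message with the doubled backslashes unescaped, and `xyz` is the placeholder used in the post. The helper name `normalize_windows_path` is hypothetical.

```python
import ntpath


def normalize_windows_path(raw_path):
    """Collapse '..' segments and unify separators using Windows path rules."""
    return ntpath.normpath(raw_path)


reported = r"xyz\bin\Debug\netcoreapp2.1\../../../assets\inputs\model\model.pb"
print(normalize_windows_path(reported))
# prints: xyz\assets\inputs\model\model.pb
```

The normalized form shows that the model is expected in the `assets\inputs\model` folder directly under `xyz`, three levels above the build output directory.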

To train the custom model I'm using:

from keras.applications.mobilenet import MobileNet
from keras.preprocessing import image
from keras.models import Model, load_model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import tensorflow as tf
import numpy as np
import pandas as pd
from PIL import Image
import argparse


def data_gen(df, num_classes, batch_size=32, input_shape=(224, 224, 3)):
    """ Load in image data"""
    while True:
        idx = np.random.choice(a=np.arange(len(df['ImgPath'])), size=batch_size)
        batch_paths = df['ImgPath'][idx]
        images = []
        for img_path in batch_paths:
            image = Image.open(str(img_path))
            image = image.resize(input_shape[0:2], Image.ANTIALIAS)
            if input_shape[2] == 1:
                image = image.convert('L')  # single-channel grayscale ('LA' would add an alpha channel)
            image = np.asarray(image)
            images.append(image)
        images = np.array(images)
        images = images.reshape(len(images), input_shape[0], input_shape[1], input_shape[2])
        labels = np.array(df['VG'][idx])
        
        labels = to_categorical(labels, num_classes=num_classes)
        yield (images, labels)


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
        Freezes the state of a session into a pruned computation graph.

        Creates a new computation graph where variable nodes are replaced by
        constants taking their current value in the session. The new graph will be
        pruned so subgraphs that are not necessary to compute the requested
        outputs are removed.
        @param session The TensorFlow session to be frozen.
        @param keep_var_names A list of variable names that should not be frozen,
                            or None to freeze all the variables in the graph.
        @param output_names Names of the relevant graph outputs.
        @param clear_devices Remove the device directives from the graph for better portability.
        @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ''
        frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph


def create_model(num_classes, compile=True):
    base_model = MobileNet(weights='imagenet', include_top=False)

    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    x = Dense(1024, activation='relu')(x)

    predictions = Dense(num_classes, activation='softmax')(x)

    model = Model(base_model.input, predictions)

    if compile:
        model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    print(model.summary())

    return model


def get_model(filepath, num_classes):
    try:
        model = load_model(filepath)
        if len(model.predict(np.zeros((1, 224, 224, 3)))[0]) != num_classes:
            print('Replacing output layer')
            output = Dense(num_classes, activation='softmax', name='dense_2')(model.layers[-2].output)
            model = Model(model.input, output)
        model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
        print(model.summary())
        return model
    except Exception as e:
        print(e)
        print('Wrong model path. Creating new model.')
        model = create_model(num_classes)
        return model


def train_model(model, filepath, epochs, batch_size, num_classes, saving_directory, data_quality):
    # Prepare data
    df = pd.read_csv(filepath)
    df.dropna(inplace=True)
    df = df[(df['Q']>data_quality)]
    df.reset_index(drop=True, inplace=True)
    df['VG'] = df['VG'] - 1
    df = df[:50]

    # Training
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        checkpoint = ModelCheckpoint(saving_directory + 'model-{epoch:04d}.h5', monitor='loss', verbose=1, save_best_only=True)
        # steps_per_epoch must be a whole number of batches (at least one)
        steps_per_epoch = max(1, len(df) // batch_size)
        model.fit_generator(data_gen(df, num_classes, batch_size=batch_size, input_shape=(224, 224, 3)), epochs=epochs, steps_per_epoch=steps_per_epoch, callbacks=[checkpoint])

        # Names of the output ops to keep when freezing the graph
        output_names = [out.op.name for out in model.outputs]
        print(output_names)

        # Use the session of the same Keras backend the model was built with
        frozen_graph = freeze_session(K.get_session(), output_names=output_names)

        # Write the frozen graph as a binary protobuf
        tf.train.write_graph(frozen_graph, "./", "model.pb", as_text=False)
  


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Train Vehicle Classification Network')
    parser.add_argument('-f', '--filename', type=str, required=True, help='Path to data csv file')
    parser.add_argument('-m', '--model_path', type=str, default='', help='Path to model file (h5)')
    parser.add_argument('-e', '--epochs', type=int, default=10, help='Number of epochs')
    parser.add_argument('-b', '--batch_size', type=int, default=32, help='Batch Size')
    parser.add_argument('-sd', '--saving_directory', type=str, default='models/', help='Model saving directory')
    parser.add_argument('-nc', '--num_classes', type=int, default=7, help='Number of classes')
    parser.add_argument('-q', '--data_quality', type=int, default=10, help='Min Q value')
    args = parser.parse_args()
    if args.model_path:
        model = get_model(args.model_path, args.num_classes)
    else:
        model = create_model(args.num_classes)
    train_model(model, args.filename, args.epochs, args.batch_size, args.num_classes, args.saving_directory, args.data_quality)

@ghost locked as resolved and limited conversation to collaborators on Mar 21, 2022