[Image Classification API] Schema mismatch when loading images using file paths #4319

Closed
luisquintanilla opened this issue Oct 9, 2019 · 7 comments

@luisquintanilla
Contributor

luisquintanilla commented Oct 9, 2019

System information

  • OS version/distro: Windows 10
  • .NET Version (eg., dotnet --info): .NET Core 2.2

Issue

  • What did you do?

Updated from Microsoft.ML 1.4.0-preview to 1.4.0-preview2. Code that worked on 1.4.0-preview now fails when loading images.

  • What happened?

An ArgumentOutOfRangeException was thrown due to a schema mismatch.

System.ArgumentOutOfRangeException: 'Schema mismatch for input column 'ImagePath': expected Vector<Byte>, got String
Parameter name: inputSchema'

  • What did you expect?

The model to train.

Source code / logs

Repo with source code: https://github.com/luisquintanilla/DeppLearning_ImageClassification_API

This repo uses 1.4.0-preview and works in training a model. If the same code is used with 1.4.0-preview2, the error mentioned in this issue occurs.

@luisquintanilla
Contributor Author

luisquintanilla commented Oct 9, 2019

In order for the model to train using 1.4.0-preview2, see this version of the code.

https://github.com/luisquintanilla/DeppLearning_ImageClassification_API/tree/version/1.4.0-preview2

It appears that only in-memory images are now supported.

Another issue that comes up when using in-memory images involves the metricsCallback. Because the feature column containing the in-memory image is now byte[], the callback no longer outputs the ImageName. This is a sample of the output.

Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   1, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   2, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   3, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   4, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   5, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   6, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   7, Image Name:
Phase: Bottleneck Computation, Dataset used:      Train, Image Index:   8, Image Name:

@luisquintanilla luisquintanilla changed the title [Image Classification API] Schema mismatch when loading files using file paths [Image Classification API] Schema mismatch when loading images using file paths Oct 9, 2019
@codemzs
Member

codemzs commented Oct 9, 2019

With in-memory images there is no concept of "file path" or "image name", so you can just modify the callback to not display Image Name. Please refer to our samples on how to pass in-memory images as input during training.

@codemzs codemzs closed this as completed Oct 9, 2019
@codemzs codemzs self-assigned this Oct 9, 2019
@CESARDELATORRE
Contributor

@codemzs - But I think the main issue here happens when NOT using in-memory images but training with image paths only, right Luis?

@codemzs
Member

codemzs commented Oct 9, 2019

Please refer to my sample first; it shows how to use image paths when training.

@luisquintanilla
Contributor Author

After looking into it, you always have to get the files into byte[] or ImageType format for training. This is more flexible since the input format is the same regardless of whether you're loading from a string file path or from an in-memory object. However, when working with file paths you have to go through that extra step of getting the bytes for the image. I suggested listing this as a breaking change in #4315.
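
A minimal sketch of that extra step, assuming a plain list of path/label rows. The `ImagePathRow`, `InMemoryImageRow`, and `ImageRowLoader` names are hypothetical and used only for illustration; they are not ML.NET types, and this is not necessarily how the linked repo does it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical row type holding a file path; not part of ML.NET.
public class ImagePathRow
{
    public string ImagePath { get; set; }
    public string Label { get; set; }
}

// Hypothetical row type holding the raw image bytes; not part of ML.NET.
public class InMemoryImageRow
{
    public byte[] Image { get; set; }
    public string Label { get; set; }
}

public static class ImageRowLoader
{
    // Reads each file into a byte[] so the feature column carries bytes
    // instead of a string path. Evaluation is lazy: files are read as the
    // sequence is enumerated.
    public static IEnumerable<InMemoryImageRow> LoadBytes(IEnumerable<ImagePathRow> rows) =>
        rows.Select(r => new InMemoryImageRow
        {
            Image = File.ReadAllBytes(r.ImagePath),
            Label = r.Label
        });
}
```

The resulting rows could then be handed to `mlContext.Data.LoadFromEnumerable(...)`. ML.NET also ships a `LoadRawImageBytes` transform in Microsoft.ML.ImageAnalytics that appears intended to do the same conversion inside a pipeline; check the documentation for the package version you are on before relying on it.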

@codemzs
Member

codemzs commented Oct 9, 2019

First, this is not a breaking change, as the API is in preview. Second, this is no different from any other trainer, where you need to feed features as a float array and, to do that, apply zero or more steps (transformations) to the input.

@luisquintanilla
Contributor Author

Okay. Thanks.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022