Added a test showing example of text classification using TensorFlow in ML.Net #2302
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
090706417EC29D91EEEABC5C25576374A86426CF25F27556C0EED4FD815D814C4F09FA7389ED8F614E4B34BF6438B9AE0ADA402BEA7CC9441446AB783A6F187D |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
5359609DDF69D66474F720D6A1ED669942FEB6842096CFC3EAF44B84FA3F2F659829778446BD3C7C83871F7293CA481AC4732DF6DC7921ADA100B459E37198BD |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
49DB72CDD8D10B78BB1CD17A058DF508E04B38BD287FF53EB9173A48D3994E11741B1EE6C9108303739819845F2F9D777EE3E767D737C24DB3A28B67FF68C951 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,8 +11,10 @@ | |
using Microsoft.ML.ImageAnalytics; | ||
using Microsoft.ML.RunTests; | ||
using Microsoft.ML.Transforms; | ||
using Microsoft.ML.Transforms.Conversions; | ||
using Microsoft.ML.Transforms.Normalizers; | ||
using Microsoft.ML.Transforms.TensorFlow; | ||
using Microsoft.ML.Transforms.Text; | ||
using Xunit; | ||
|
||
namespace Microsoft.ML.Scenarios | ||
|
@@ -846,5 +848,59 @@ public void TensorFlowTransformCifarInvalidShape() | |
} | ||
Assert.True(thrown); | ||
} | ||
|
||
/// <summary> | ||
/// Class to hold features and predictions. | ||
/// </summary> | ||
public class TensorFlowSentiment | ||
{ | ||
public string Sentiment_Text; | ||
[VectorType(600)] | ||
public int[] Features; | ||
[VectorType(2)] | ||
public float[] Prediction; | ||
} | ||
|
||
[ConditionalFact(typeof(Environment), nameof(Environment.Is64BitProcess))] | ||
public void TensorFlowSentimentClassificationTest() | ||
The test is going to fail because the Microsoft.ML.TensorFlow.TestModels NuGet package is not updated yet. #Resolved |
||
{ | ||
var mlContext = new MLContext(seed: 1, conc: 1); | ||
var data = new[] { new TensorFlowSentiment() { Sentiment_Text = "this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert is an amazing actor and now the same being director father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also to the two little boy's that played the of norman and paul they were just brilliant children are often left out of the list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don't you think the whole story was so lovely because it was true and was someone's life after all that was shared with us all" } }; | ||
var dataView = mlContext.Data.ReadFromEnumerable(data); | ||
|
||
var lookupMap = mlContext.Data.ReadFromTextFile(@"sentiment_model/imdb_word_index.csv", | ||
columns: new[] | ||
{ | ||
new TextLoader.Column("Words", DataKind.TX, 0), | ||
new TextLoader.Column("Ids", DataKind.I4, 1), | ||
}, | ||
separatorChar: ',' | ||
); | ||
|
||
// A variable-length vector cannot be resized to a fixed-length vector in ML.NET. | ||
// The trick here is to create two pipelines. | ||
// The first pipeline, 'dataPipe', tokenizes the string into words and maps each word to an integer, which is its index in the dictionary. | ||
// This integer vector is then retrieved from the pipeline and resized to a fixed length. | ||
// The second pipeline, 'tfEnginePipe', takes the resized integer vector, passes it to TensorFlow, and gets the classification scores. | ||
var estimator = mlContext.Transforms.Text.TokenizeWords("TokenizedWords", "Sentiment_Text") | ||
.Append(mlContext.Transforms.Conversion.ValueMap(lookupMap, "Words", "Ids", new[] { ("Features", "TokenizedWords") })); | ||
var dataPipe = estimator.Fit(dataView) | ||
is there a particular reason why we have |
||
.CreatePredictionEngine<TensorFlowSentiment, TensorFlowSentiment>(mlContext); | ||
|
||
// For an explanation of how the `sentiment_model` was created, | ||
// see https://github.com/dotnet/machinelearning-testdata/blob/master/Microsoft.ML.TensorFlow.TestModels/sentiment_model/README.md | ||
string modelLocation = @"sentiment_model"; | ||
So this TF model takes a vector of floats as input. Am I right? Perhaps we should add a comment on how the model was created, etc. #Resolved |
||
var tfEnginePipe = mlContext.Transforms.ScoreTensorFlowModel(modelLocation, new[] { "Prediction/Softmax" }, new[] { "Features" }) | ||
.Append(mlContext.Transforms.CopyColumns(("Prediction", "Prediction/Softmax"))) | ||
.Fit(dataView) | ||
.CreatePredictionEngine<TensorFlowSentiment, TensorFlowSentiment>(mlContext); | ||
|
||
var processedData = dataPipe.Predict(data[0]); | ||
Array.Resize(ref processedData.Features, 600); | ||
var prediction = tfEnginePipe.Predict(processedData); | ||
|
||
Assert.Equal(2, prediction.Prediction.Length); | ||
Can we verify that the predictions were somewhat correct? #Resolved |
||
Assert.InRange(prediction.Prediction[1], 0.650032759 - 0.01, 0.650032759 + 0.01); | ||
Please use Assert.Equal. If there are only two prediction values, can we check them all? #Resolved
What do you mean by Assert.Equal? Here we are checking that the value falls within a particular threshold.
OK, I see what you meant by Assert.Equal. I actually want to check that my value is in a range, e.g. 0.64 <= prediction <= 0.66, which I cannot do with Assert.Equal, can I?
You have tolerance in Assert.Equal. #Resolved
It uses a number of decimal places, which is not applicable here.
We could trim 0.650032759 to 0.65, if we're comparing as ± 0.01. |
||
} | ||
} | ||
} |
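The Assert.InRange versus Assert.Equal thread above turns on two different notions of tolerance: an absolute window around the expected value, and equality after rounding to a number of decimal places. The sketch below illustrates the difference in plain C# without any xUnit dependency. The helper names `WithinAbsolute` and `EqualAtPrecision` are hypothetical, and the rounding helper only approximates the decimal-places behavior described in the thread; it is not the library's implementation.

```csharp
using System;

public static class ToleranceSketch
{
    // Absolute tolerance: passes when |actual - expected| <= tolerance.
    // This is the check that the InRange(expected - 0.01, expected + 0.01) call expresses.
    public static bool WithinAbsolute(double expected, double actual, double tolerance)
        => Math.Abs(actual - expected) <= tolerance;

    // Decimal-place comparison (hypothetical helper): round both values, then compare.
    public static bool EqualAtPrecision(double expected, double actual, int decimals)
        => Math.Round(expected, decimals) == Math.Round(actual, decimals);

    public static void Main()
    {
        const double expected = 0.650032759;
        const double actual = 0.6449; // illustrative value, not a result from the PR

        // Within 0.01 of the expected value, so the absolute check passes.
        Console.WriteLine(WithinAbsolute(expected, actual, 0.01));   // True

        // Rounds to 0.64 while the expected value rounds to 0.65, so this check fails.
        Console.WriteLine(EqualAtPrecision(expected, actual, 2));    // False
    }
}
```

A value can therefore sit inside the ± 0.01 window and still fail a two-decimal-place comparison, which is likely why the author considered the precision-based check not applicable here.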
Can you add "a value x in the input would be mapped to the value stored in dictionary[x]"? #Resolved
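The mapping this comment asks to document, together with the fixed-length resize the test performs, can be sketched without ML.NET. This is a minimal illustration under stated assumptions: the three-word dictionary, its ids, the `MapAndPad` helper, and the choice to map unknown words to 0 are all hypothetical and are not taken from the PR, from `imdb_word_index.csv`, or from the actual behavior of `ValueMap`.

```csharp
using System;
using System.Collections.Generic;

public static class LookupSketch
{
    // Maps each whitespace-separated word x to dictionary[x], then resizes
    // the result to a fixed length. Array.Resize truncates longer inputs and
    // zero-pads shorter ones.
    public static int[] MapAndPad(string text, IReadOnlyDictionary<string, int> dictionary, int length)
    {
        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var ids = new int[words.Length];
        for (int i = 0; i < words.Length; i++)
        {
            // Assumption for this sketch: words missing from the dictionary become 0.
            ids[i] = dictionary.TryGetValue(words[i], out int id) ? id : 0;
        }

        Array.Resize(ref ids, length);
        return ids;
    }

    public static void Main()
    {
        // Made-up ids for illustration only.
        var dictionary = new Dictionary<string, int> { ["this"] = 14, ["film"] = 22, ["was"] = 16 };

        int[] features = MapAndPad("this film was brilliant", dictionary, 6);
        Console.WriteLine(string.Join(",", features)); // 14,22,16,0,0,0
    }
}
```

In the test itself the fixed length is 600, matching the `[VectorType(600)]` annotation on `Features`.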