Add new benchmarks to test\Microsoft.ML.Benchmarks #722

Merged
Zruty0 merged 6 commits into dotnet:master on Aug 24, 2018

briancylui
Contributor

@briancylui briancylui commented Aug 23, 2018

Submitting @yaeldekel's new benchmarks via this PR.

  • Added a new benchmark for KMeans and Logistic Regression (LR) under test\Microsoft.ML.Benchmarks
  • Added a new sentiment test inside the existing SDCA benchmark

cc: @yaeldekel @eerhardt

@@ -8,6 +8,7 @@
using Microsoft.ML.Runtime.Data;
using Microsoft.ML.Runtime.FastTree;
using Microsoft.ML.Runtime.Internal.Calibration;
using Microsoft.ML.Runtime.Learners;
Member

The changes in this file can be reverted. No need to change this file.

Contributor Author

The changes in this file are mostly refactoring that improves the style. There are actually multiple redundant usings here that I am going to remove in my next commit.

Member

In general, style/cleanup changes like these should not happen as part of another PR. They distract from the real change in this PR and clutter the history: when someone looks at all the changes to this file, they see this change, and when someone looks at this commit in the history, this needless change is included as well.

Contributor Author

I will revert the changes - thanks!

public void Setup()
{
s_dataPath = Program.GetDataPath("adult.train");
StochasticDualCoordinateAscentClassifierBench.s_metrics = Models.ClassificationMetrics.Empty;
Member

This seems like unfortunate coupling that I don't think we want. Can this be removed?

Contributor

+1


}
}

public class IrisData
Member

I don't believe IrisData and IrisPrediction are used. Can they be removed?

{
// Pipeline
var loader = new TextLoader(env,
new TextLoader.Arguments()
Member

(nit) this should be indented since it is a continuation of the line above.

Contributor Author

Thanks! I also indented the subsequent lines, since they all sit inside one pair of parentheses, which may not be apparent from the current code style.

@@ -15,6 +15,7 @@ namespace Microsoft.ML.Models
/// </summary>
public sealed class ClassificationMetrics
{
public static ClassificationMetrics Empty = new ClassificationMetrics();
Member

I don't think we should be adding this public API in this PR.

Contributor

@Zruty0 Zruty0 left a comment

:shipit:

Member

@eerhardt eerhardt left a comment

Thanks @briancylui

@eerhardt
Member

test OSX10.13 Debug
test OSX10.13 Release

@briancylui
Contributor Author

@eerhardt: dotnet build -c Release succeeds, but dotnet run -c Release throws an exception. Deleting the two lines that reference s_metrics leaves the GetValue(…) method in ClassificationMetricsColumn not well-defined.

@briancylui
Contributor Author

Copying a relevant comment from #724:

Regarding the perf results posted on #724: KMeansAndLogisticRegression (KMeans+LR) reports the same AccuracyMacro as SDCA (0.98). However, the GetValue method of Program.cs:ClassificationMetricsColumn only references StochasticDualCoordinateAscentClassifierBench.s_metrics (link), which is unrelated to KMeans+LR, so KMeans+LR may have displayed SDCA's AccuracyMacro as its own metric in the perf results. When I ran dotnet run -c Release and chose only KMeans+LR, AccuracyMacro was displayed as 0. This may be related to the newly added public static field Empty on ClassificationMetrics.
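
The suspected failure mode can be illustrated with a small self-contained sketch. Every type here (SdcaBench, KMeansLrBench, SharedMetricColumn) is a hypothetical stand-in, not the actual code in test\Microsoft.ML.Benchmarks, and the KMeans+LR value is made up. The sketch only mimics a column whose GetValue reads one benchmark's static field no matter which benchmark is being reported.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-ins for two benchmark classes (not the real ML.NET types).
static class SdcaBench
{
    // Written by the SDCA benchmark after it runs; null until then.
    public static double? AccuracyMacro;
}

static class KMeansLrBench
{
    public static double? AccuracyMacro;
}

// Hypothetical column that always reads the SDCA field.
static class SharedMetricColumn
{
    public static string GetValue(string benchmarkName)
    {
        // benchmarkName is ignored: this is the coupling under discussion.
        return SdcaBench.AccuracyMacro.HasValue
            ? SdcaBench.AccuracyMacro.Value.ToString("0.00", CultureInfo.InvariantCulture)
            : "0";
    }
}

static class Demo
{
    static void Main()
    {
        // Only KMeans+LR ran: its own metric is set but never read.
        KMeansLrBench.AccuracyMacro = 0.91; // made-up value
        Console.WriteLine(SharedMetricColumn.GetValue("KMeansAndLogisticRegression")); // prints "0"

        // SDCA also ran: the KMeans+LR row now shows SDCA's number.
        SdcaBench.AccuracyMacro = 0.98;
        Console.WriteLine(SharedMetricColumn.GetValue("KMeansAndLogisticRegression")); // prints "0.98"
    }
}
```

Under these assumptions, both observations in the comment (a 0 when KMeans+LR runs alone, and a matching 0.98 when SDCA runs too) would follow from the column ignoring which benchmark it is reporting on.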

@eerhardt
Member

@briancylui - do you think there are changes from #724 that should be brought over here?

@adamsitnik - I see the existing benchmark test has a bit of coupling between the Program.cs:ClassificationMetricsColumn and the StochasticDualCoordinateAscentClassifierBench test. Is it possible to have metrics columns that are specific to a set of tests? For example, the current test is a classification test, and those metrics only apply to that test. The new tests being added should have different metrics.
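
One way to picture metrics that are specific to a set of tests is a lookup keyed by benchmark name, so a column can never show another benchmark's number. This is a minimal sketch under stated assumptions: MetricRegistry and its methods are hypothetical, it does not use the BenchmarkDotNet column API, and it does not describe any actual fix.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical registry: each benchmark reports its metric under its own name.
static class MetricRegistry
{
    private static readonly Dictionary<string, double> s_values =
        new Dictionary<string, double>();

    public static void Report(string benchmarkName, double value)
    {
        s_values[benchmarkName] = value;
    }

    // Returns "N/A" instead of borrowing a value from a different benchmark.
    public static string GetValue(string benchmarkName)
    {
        double value;
        return s_values.TryGetValue(benchmarkName, out value)
            ? value.ToString("0.00", CultureInfo.InvariantCulture)
            : "N/A";
    }
}

static class RegistryDemo
{
    static void Main()
    {
        // Only the SDCA benchmark has reported a metric so far.
        MetricRegistry.Report("StochasticDualCoordinateAscentClassifierBench", 0.98);

        Console.WriteLine(MetricRegistry.GetValue("StochasticDualCoordinateAscentClassifierBench")); // prints "0.98"
        Console.WriteLine(MetricRegistry.GetValue("KMeansAndLogisticRegression")); // prints "N/A"
    }
}
```

Keying by name keeps the reporting code free of references to any particular benchmark class, at the cost of losing compile-time checking of the key.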

@Zruty0 Zruty0 merged commit 4fd8a9c into dotnet:master Aug 24, 2018
@adamsitnik
Member

@briancylui @eerhardt I have solved this issue in #735

@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022