Support for custom metrics reported in the Benchmarks #735

Merged: 13 commits merged on Aug 30, 2018

adamsitnik
Member

This PR enables two things:

  1. executing every benchmark in an isolated process
  2. reporting custom metrics per benchmark

Why should we run every benchmark in a separate process?

  1. Most ML.NET benchmarks allocate a lot of memory, which affects GC generation sizes and therefore the final results. The GC is self-tuning, so if we run all the benchmarks in the same process it cannot find a configuration that works well for every benchmark.
  2. Most ML.NET benchmarks can have side effects. Example: running a train benchmark after a predict benchmark in the same process can affect the results. With a new process per benchmark, we always start from the same state and get repeatable results.

Results when running all the benchmarks in the same process:

| Type | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| KMeansAndLogisticRegressionBench | TrainKMeansAndLR | 2,134.265 ms | 164.3370 ms | 189.2507 ms | 16000.0000 | 9000.0000 | 3000.0000 | 49949.23 KB |
| StochasticDualCoordinateAscentClassifierBench | TrainSentiment | 2,130.503 ms | 24.8173 ms | 23.2141 ms | 122000.0000 | 35000.0000 | 5000.0000 | 759772.8 KB |
| StochasticDualCoordinateAscentClassifierBench | TrainIris | 834.229 ms | 254.5284 ms | 293.1152 ms | 6000.0000 | 1000.0000 | - | 12173.28 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIris | 2.472 ms | 0.1202 ms | 0.1384 ms | 35.1563 | 15.6250 | 3.9063 | 123.24 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf1 | 2.712 ms | 0.3276 ms | 0.3773 ms | 35.1563 | 15.6250 | 3.9063 | 123.2 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf2 | 2.370 ms | 0.1334 ms | 0.1482 ms | 35.1563 | 15.6250 | 3.9063 | 123.31 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf5 | 2.492 ms | 0.1678 ms | 0.1865 ms | 35.1563 | 15.6250 | 3.9063 | 123.61 KB |

When running every benchmark in a dedicated process:

| Type | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| KMeansAndLogisticRegressionBench | TrainKMeansAndLR | 1,968.326 ms | 84.3827 ms | 97.1753 ms | 16000.0000 | 9000.0000 | 3000.0000 | 50027.36 KB |
| StochasticDualCoordinateAscentClassifierBench | TrainIris | 604.496 ms | 238.4849 ms | 274.6396 ms | 59000.0000 | 1000.0000 | - | 76697.5 KB |
| StochasticDualCoordinateAscentClassifierBench | TrainSentiment | 1,829.670 ms | 10.9792 ms | 10.2699 ms | 123000.0000 | 35000.0000 | 6000.0000 | 759758.03 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIris | 1.895 ms | 0.0132 ms | 0.0111 ms | 35.1563 | 15.6250 | 3.9063 | 121.87 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf1 | 1.941 ms | 0.0145 ms | 0.0121 ms | 35.1563 | 15.6250 | 3.9063 | 119.94 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf2 | 1.960 ms | 0.0676 ms | 0.0751 ms | 35.1563 | 15.6250 | 3.9063 | 121.94 KB |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf5 | 1.870 ms | 0.0043 ms | 0.0036 ms | 37.1094 | 17.5781 | 3.9063 | 120.35 KB |

To run every benchmark in a standalone, dedicated process, BenchmarkDotNet needs to be able to create, build and run a new executable.

So far this was not possible out of the box due to an MSBuild limitation: when Microsoft.ML.Benchmarks references a native assembly and the auto-generated BenchmarkDotNet project references Microsoft.ML.Benchmarks, the native dependencies are not copied to the output folder of the auto-generated project. This is why I had to implement ProjectGenerator, which copies them for us.

@eerhardt we had a conversation about making it possible for BenchmarkDotNet to compile ML.NET stuff a long time ago and the blocker was the native dependency.

The other thing is custom metrics. BenchmarkDotNet does not support them out of the box, so I had to implement it. How it works:

  1. If a given type wants to report custom metrics, it has to derive from WithExtraMetrics and implement the IEnumerable<Metric> GetMetrics() method.
  2. After running the benchmarks, WithExtraMetrics prints the custom metrics to the console in the child process.
  3. ExtraMetricColumn parses that output in the parent process.
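
The print-then-parse handshake described in these steps could look roughly like the sketch below. Only the names `Metric` and `GetMetrics()` come from this PR description; the `MetricProtocol` helper, the `// Metric ` line prefix and the shape of `Metric` are assumptions made for illustration, not the code in this PR.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Assumed shape of a metric: a name plus a numeric value.
public sealed class Metric
{
    public Metric(string name, double value) { Name = name; Value = value; }
    public string Name { get; }
    public double Value { get; }
}

// Hypothetical helper that holds both halves of the console protocol.
public static class MetricProtocol
{
    // Assumed marker that lets the parent tell metric lines from other output.
    private const string Prefix = "// Metric ";

    // Child process side: format one metric as a single console line.
    public static string Format(Metric metric) =>
        Prefix + metric.Name + ": " + metric.Value.ToString(CultureInfo.InvariantCulture);

    // Parent process side: pick the metric lines out of the captured output.
    public static IEnumerable<Metric> Parse(IEnumerable<string> outputLines)
    {
        foreach (var line in outputLines)
        {
            if (!line.StartsWith(Prefix, StringComparison.Ordinal)) continue;

            var payload = line.Substring(Prefix.Length);
            var separator = payload.LastIndexOf(": ", StringComparison.Ordinal);
            if (separator < 0) continue;

            var name = payload.Substring(0, separator);
            var text = payload.Substring(separator + 2);
            if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out var value))
                yield return new Metric(name, value);
        }
    }
}
```

In this sketch the child would call `Console.WriteLine(MetricProtocol.Format(metric))` for each metric, and the parent would feed the captured standard output lines to `MetricProtocol.Parse`. Formatting and parsing both use the invariant culture so the two processes agree on the decimal separator.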

Sample results:

| Type | Method | Extra Metric |
|---|---|---|
| KMeansAndLogisticRegressionBench | TrainKMeansAndLR | - |
| StochasticDualCoordinateAscentClassifierBench | TrainIris | - |
| StochasticDualCoordinateAscentClassifierBench | TrainSentiment | - |
| StochasticDualCoordinateAscentClassifierBench | PredictIris | AccuracyMacro: 0.98 |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf1 | AccuracyMacro: 0.98 |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf2 | AccuracyMacro: 0.98 |
| StochasticDualCoordinateAscentClassifierBench | PredictIrisBatchOf5 | AccuracyMacro: 0.98 |

Other changes: so far the benchmarks were using currentAssemblyLocation.Directory.Parent.Parent.Parent.Parent.FullName to get the path to the folder with input files. I believe it's better to reference them as links in the csproj and "copy to output directory if newer". This solution is cleaner and more future-proof.
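
For readers unfamiliar with linked items, a csproj entry of this kind generally looks like the fragment below. The paths and file name are hypothetical placeholders, not the ones used in this PR; `Link` and `CopyToOutputDirectory` with `PreserveNewest` are the standard MSBuild metadata behind "copy if newer".

```xml
<ItemGroup>
  <!-- Hypothetical paths: Include points at the shared input file,
       Link sets where it appears in the project and in the output folder. -->
  <None Include="..\data\iris.txt" Link="Input\iris.txt">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

With an entry like this the file is copied next to the benchmark binaries on build, so the code can resolve it relative to the output directory instead of walking up parent folders.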

/cc @eerhardt @danmosemsft @briancylui @KrzysztofCwalina

@shauheen
Contributor

Thanks @adamsitnik, can you please associate this with the relevant issue?

public int PriorityInCategory => 1;
public UnitType UnitType => UnitType.Dimensionless;
// enforce Neutral Language as "en-us" because the input data files use dot as decimal separator (and it fails for cultures with ",")
Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
Member

This line of code is a bit surprising in a method that is supposed to return a data path. Maybe it would be better to do this in the Main method, or a GlobalSetup method?

Member Author

@eerhardt I agree that I am breaking CQS (command-query separation) here. My only excuse is that I have named the method GetInvariantCultureDataPath so people can expect that.

I was thinking about moving it to a [GlobalSetup] method, but I am afraid that people won't follow this pattern in new benchmarks. By having it here, I guarantee that whoever uses the files will be reading them with CultureInfo.InvariantCulture.
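
To illustrate the underlying problem, here is a small sketch of how the same text parses differently depending on the decimal separator of the culture in use. `CultureDemo` and its members are hypothetical names for this example, and the comma-decimal culture is built by hand so the sketch does not depend on the cultures installed on the machine.

```csharp
using System.Globalization;

public static class CultureDemo
{
    // Builds a culture that uses "," as the decimal separator,
    // similar to many European cultures.
    public static CultureInfo CommaDecimalCulture()
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = ",";
        culture.NumberFormat.NumberGroupSeparator = ".";
        return culture;
    }

    // Parses with an explicit invariant culture, so "." is always the decimal separator.
    public static double ParseInvariant(string text) =>
        double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

    // Parses with the supplied culture's number format.
    public static bool TryParseWith(string text, CultureInfo culture, out double value) =>
        double.TryParse(text, NumberStyles.Float, culture, out value);
}
```

Here `ParseInvariant("5.1")` succeeds, while `TryParseWith("5.1", CommaDecimalCulture(), out _)` returns false because "." is not that culture's decimal separator. Setting the thread culture once, as this PR does, makes culture-sensitive parsing of the input files behave like the first call.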

I also wonder how ML.NET samples deal with the culture info problem. Does anybody know?

Member

eerhardt left a comment

:shipit:

[GlobalCleanup]
public void ReportMetrics()
{
foreach (var metric in GetMetrics())
Contributor

nit: Would it improve perf to set var metrics = GetMetrics(); right before the foreach loop and then write the condition as var metric in metrics? Not sure...

Member Author

@briancylui no, it would not.

Whenever you are not sure about something you can benchmark it with BenchmarkDotNet ;)
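
The reason it makes no difference is that C# evaluates the collection expression of a `foreach` exactly once, before the first iteration. A minimal sketch that counts the calls (the `ForeachDemo` class and its `GetMetrics()` stand-in are hypothetical, not the method from this PR):

```csharp
using System.Collections.Generic;

public static class ForeachDemo
{
    public static int Calls;

    // Stand-in for GetMetrics(); counts how many times it is invoked.
    public static IEnumerable<string> GetMetrics()
    {
        Calls++;
        return new[] { "a", "b", "c" };
    }

    // Iterates over three items and returns how often GetMetrics() was called.
    public static int CountCallsDuringLoop()
    {
        Calls = 0;
        var seen = 0;
        foreach (var metric in GetMetrics())
        {
            seen += metric.Length; // loop body runs once per item
        }
        return Calls;
    }
}
```

`CountCallsDuringLoop()` returns 1 even though the loop body runs three times, so hoisting the call into a local variable changes nothing.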

var foldeWithAutogeneratedExe = Path.GetDirectoryName(artifactsPaths.ExecutablePath);
var folderWithNativeDependencies = Path.GetDirectoryName(typeof(ProjectGenerator).Assembly.Location);

foreach(var nativeDependency in Directory
Contributor

nit: missing space between foreach and the succeeding (

for (int bi = 0; bi < batch.Length; bi++)
{
batch[bi] = _example;
}
Contributor

Does this for loop change elements of _batches[i] or only elements of the local variable batch? Not an expert so not sure whether batch is a ref-type.

Member Author

this is done on purpose, it's a Setup method
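
On the ref-type question: arrays in C# are reference types, so a local variable that holds one of the inner arrays refers to the same storage as the outer array's element. The sketch below assumes a jagged array of batches and a `Fill` helper; the surrounding setup code is not shown in this thread, so those names and shapes are hypothetical.

```csharp
public static class BatchDemo
{
    // Fills every batch with the same example, writing only through the local variable.
    public static string[][] Fill(int batchCount, int batchSize, string example)
    {
        var batches = new string[batchCount][];
        for (int i = 0; i < batches.Length; i++)
        {
            var batch = new string[batchSize];
            batches[i] = batch; // batch and batches[i] now refer to the same array

            for (int bi = 0; bi < batch.Length; bi++)
            {
                batch[bi] = example; // also visible through batches[i][bi]
            }
        }
        return batches;
    }
}
```

After `Fill(2, 3, "x")`, every `batches[i][bi]` holds "x", which shows that writes through the local `batch` land in the array stored in the field.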

Contributor

briancylui left a comment

:shipit:

Feel free to merge after responding to PR comments - I don't have write access so can't hit merge unfortunately. Not an expert in BenchmarkDotNet, but this PR looks good to me! Thanks @adamsitnik

@briancylui
Contributor

More reviewers are needed for this PR to be merged - my review doesn't count towards mergeability since I don't have write access.

@adamsitnik
Member Author

@briancylui @eerhardt thank you for your reviews! I don't have write access myself, so who could merge it?

@shauheen there is no issue, but there was an email thread. Do you want me to create an issue for that?

@eerhardt
Member

test OSX10.13 Debug

@briancylui
Contributor

test OSX10.13 Debug please
test public-CI please

# Conflicts:
#	build/Dependencies.props
#	test/Microsoft.ML.Benchmarks/KMeansAndLogisticRegressionBench.cs
safern merged commit dfe9f3a into dotnet:master on Aug 30, 2018
ghost locked this conversation as resolved and limited it to collaborators on Mar 29, 2022