Suggestion for test framework improvements. #410

Ivanidzo4ka · 2018-06-26T00:12:05Z

No reason to have same baselines for Debug and Release if they are same.

Currently we have /test/BaselineOutput folder which contains folders Common, SingleDebug and SingleRelease and they filled with pretty much same files. We need to modify our test framework so it will look first in Common folder and if file is not found then it should look in respectful folders.
Also word Single is legacy artifact and can be omitted.

This would reduce size of repo, and simplify peoples life. Also it give more room to have platform specific tests since many of our tests disabled because we verify everything with windows output, which sometimes differ because of native code implementation.

codemzs · 2018-06-26T00:14:18Z

CC: @TomFinley

TomFinley · 2018-06-26T01:27:18Z

Hi @Ivanidzo4ka , thanks for the proposal.

One thing about the existing system that is quite convenient, that I do not see how to reproduce in your scheme very well, is that if the baseline files could exist in common, or in some target specific directory, then I do not see how one could use a tool like Beyond Compare to actually do a directory diff, and detect the differences in baselines. This is quite useful -- I have and I know you have had situations where due to a seemingly innocuous change, baselines have changed significantly. (E.g., slightly different wording in a log message leads to hundreds of files changing.) That work is simplified enormously if the test output folder mirrors the baseline folder, which as far as I can tell they would not in your proposal.

If you modified your proposal so that tests had to explicitly identify themselves as being either "common" or "platform specific" baselines (probably the latter), then this would retain the "diffability" of the TestOutput and BaselineOutput directories.

Ivanidzo4ka · 2018-06-27T20:32:08Z

So currently we have this structure of test\BaselineOutput and and bin{configuration}{project}{framework}\TestOutput folders. We pull files from same respective locations in both folders and compare them.

My proposal is to touch only test\BaselineOutput folder and in framework if we looking for file, first search in Common folder, then in Configuration specific folder, then in OS specific and then in combination of Configuration and OS (we can get rid of OS specific for now). Everything we do with test output will continue to go to TestOutput folder, and you can run it for different Configurations and they would be inside respective Debug/Release folder. So you can compare them. It just during PR creation owner of PR can decide where he want's to put this files, if baselines are similar, he can put them into Common, if they different he can put them in respective folders.

We also can have test similar AAACompareBaselines which would notify person what files are same, and should be moved to Common folder.

Reason why I propose this solution is fact what currently and most of the time we have similar outputs (probably around 90% in our main solution and 100% in this repository so far) and in most cases all user need to do is to copy files into common folder.

sfilipi · 2018-07-31T21:18:35Z

Commenting here, because I run in such a case again, trying to enable OLS.
Only one prediction different, out of 4896 generated differing at the 5th decimal point just on osx debug build.

In additional to the per OS baselines, i don;t think it makese sense to compare numbers on the 10th decimal point, which we are doing sometimes.

we could:
1- Have a minimum threshold: don't compare beyond the 6th decimal (is 6th good enough?)
2- Maybe count the failures, and ignore, let's say 1% or 0.5% of the failures per test (what is a good %? @justinormont?)

In addition, if the above cases are not enough, we can put in place OS specific baseline.

Have the current /Baseline folder follow the following structure:

I think a folder structure like such:
/BaselineOutput

/Release
    /Common
           /FastTree
                 /cv_wine.out
                 /traintest_wine.out
    .........
    /OXS
           /FastTree
                 /cv_adult.out
    /Linux
    /Win
/Debug
    /Common
    /OXS
    /Linux
    /Win

A lot of baseline tests have duplicated baselines between debug and release. Allow BaseTestBaseline to check the Common baseline directory for a baseline file first. Fix dotnet#410

) * Enable a QuantileRegression test Enable CommandTrainScoreEvaluateQuantileRegression, add the dataset and the necessary baselines. The only baseline changes were of the form that were caused by https://github.com/dotnet/corefx/issues/31847. * Allow BaseTestBaseline to check Common first. A lot of baseline tests have duplicated baselines between debug and release. Allow BaseTestBaseline to check the Common baseline directory for a baseline file first. Fix #410 * PR feedback Add more info to the README. Add column headers to the housing.txt data file.

Ivanidzo4ka mentioned this issue Jun 26, 2018

LightGBM #392

Merged

Ivanidzo4ka mentioned this issue Jun 26, 2018

unify tests into one folder, modify framework to support different configurations and OS #411

Closed

Ivanidzo4ka mentioned this issue Jul 31, 2018

Make tests baseline comparison more lenient #619

Closed

sfilipi added the test related to tests label Jul 31, 2018

PKingZombieSpy mentioned this issue Aug 28, 2018

Do not duplicate test output where there is no need to. #758

Closed

eerhardt mentioned this issue Oct 9, 2018

Enable a QuantileRegression Test & Fix Duplicated Baseline Files #1193

Merged

shauheen added this to the 1018 milestone Oct 9, 2018

eerhardt closed this as completed in #1193 Oct 10, 2018

ghost locked as resolved and limited conversation to collaborators Mar 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggestion for test framework improvements. #410

Suggestion for test framework improvements. #410

Ivanidzo4ka commented Jun 26, 2018

codemzs commented Jun 26, 2018

TomFinley commented Jun 26, 2018 •

edited

Loading

Ivanidzo4ka commented Jun 27, 2018 •

edited

Loading

sfilipi commented Jul 31, 2018

Suggestion for test framework improvements. #410

Suggestion for test framework improvements. #410

Comments

Ivanidzo4ka commented Jun 26, 2018

codemzs commented Jun 26, 2018

TomFinley commented Jun 26, 2018 • edited Loading

Ivanidzo4ka commented Jun 27, 2018 • edited Loading

sfilipi commented Jul 31, 2018

TomFinley commented Jun 26, 2018 •

edited

Loading

Ivanidzo4ka commented Jun 27, 2018 •

edited

Loading