Make tests baseline comparison more lenient #619


Closed
sfilipi opened this issue Jul 31, 2018 · 4 comments
Labels
test related to tests

Comments

@sfilipi
Member

sfilipi commented Jul 31, 2018

There are a few tests that are disabled because the baseline comparisons fail on the 5th decimal for some of the numbers generated, on some OS.

As an example, the RegressorOlsTest() in PredictorTests fails only in the Mac debug run, because just one of the 4896 generated predictions doesn't match:

baseline:
2625 5 5.09176636 0.091766357421875 0.0084210643544793129

Mac debug run predictions:
2625 5 5.091751 0.0917510986328125 0.0084182641003280878

I think we should make the tests more lenient to failures like this, modifying the comparison with the baseline to:
1- Have a sensitivity threshold: compare only up to the 4th, or 6th, decimal digit.
2- Count the failures, and declare the test as failed if more than, say, 3% of the predictions/lines differ?

@justinormont @TomFinley @Zruty0 @zeahmed are those acceptable ranges?
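A minimal sketch of the two-part leniency described above (a decimal-digit tolerance plus a mismatch-ratio cutoff). This is illustrative Python, not ML.NET's actual baseline-comparison code; the function names and the line format assumed here (whitespace-separated numeric columns, as in the example lines above) are assumptions.

```python
def lines_match(expected, actual, decimals=4):
    """Compare two whitespace-separated lines of numbers,
    tolerating differences beyond the given decimal digit."""
    e_vals = expected.split()
    a_vals = actual.split()
    if len(e_vals) != len(a_vals):
        return False
    tol = 10.0 ** -decimals
    for e, a in zip(e_vals, a_vals):
        try:
            if abs(float(e) - float(a)) > tol:
                return False
        except ValueError:
            # Non-numeric tokens must match exactly.
            if e != a:
                return False
    return True

def baseline_diff_ratio(baseline_lines, output_lines, decimals=4):
    """Fraction of lines that differ beyond the tolerance;
    a test would fail only if this exceeds e.g. 0.03 (3%)."""
    mismatches = sum(
        not lines_match(b, o, decimals)
        for b, o in zip(baseline_lines, output_lines)
    )
    return mismatches / max(len(baseline_lines), 1)

# The differing line from the issue:
baseline = ["2625 5 5.09176636 0.091766357421875 0.0084210643544793129"]
mac_run  = ["2625 5 5.091751 0.0917510986328125 0.0084182641003280878"]

# At 4 decimals the values are within tolerance, so nothing mismatches;
# at 6 decimals the same line counts as a mismatch.
assert baseline_diff_ratio(baseline, mac_run, decimals=4) == 0.0
assert baseline_diff_ratio(baseline, mac_run, decimals=6) == 1.0
```

Under this scheme the Mac debug example would pass with a 4-decimal threshold, since the values differ only around the 5th decimal, and 1 mismatching line out of 4896 is well under a 3% cutoff either way.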

@zeahmed
Contributor

zeahmed commented Jul 31, 2018

Can we afford to have platform specific thresholds?

@sfilipi
Member Author

sfilipi commented Jul 31, 2018

We can, but the downside is that sometimes, as in the example above, you end up checking in an entire platform-specific baseline file because a single line differs.
We could have a system of baselines and overrides for each baseline: if no override exists, default to the baseline.
@Ivanidzo4ka what do you think?
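A hypothetical sketch of the override scheme described above: look for a platform-specific baseline file first and fall back to the shared one. The file-naming convention (`<test>.<os>.txt`) is an assumption for illustration, not ML.NET's actual layout.

```python
import os
import platform

def resolve_baseline(baseline_dir, test_name):
    """Prefer a platform-specific baseline (e.g. RegressorOlsTest.darwin.txt)
    over the shared default (RegressorOlsTest.txt) when one exists."""
    os_tag = platform.system().lower()  # e.g. 'windows', 'linux', 'darwin'
    override = os.path.join(baseline_dir, f"{test_name}.{os_tag}.txt")
    default = os.path.join(baseline_dir, f"{test_name}.txt")
    return override if os.path.exists(override) else default
```

This keeps the common case cheap: only the lines that genuinely diverge per platform justify committing an override file, and every other test keeps a single shared baseline.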

@sfilipi sfilipi added the test related to tests label Jul 31, 2018
@Ivanidzo4ka
Contributor

I think we have too many issues, and nobody reads them :)
#410

@sfilipi
Member Author

sfilipi commented Jul 31, 2018

Touché :)
Duplicate of #410

Let's continue the conversation on the original issue.

@sfilipi sfilipi closed this as completed Aug 1, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022

3 participants