Retry flakey tests #934
I think retry is a pretty interesting idea. Lots of projects have a flaky test or two (AVA included). Is there ever really a need for `test.retry(t => {})` …?
I think the number of attempts needs to be configurable. For the generator, I have all the E2E tests retry twice, but the tests for the Sequelize setup are especially flakey, so I've set those to retry thrice. edit: Could …
Yes. I'm not necessarily opposed to it. Just trying to suggest as many alternatives as possible for discussion.
OK. What if we just allowed:

```js
test.flakey(t => {}) // retry twice
test.veryFlakey(t => {}) // retry thrice
```

We could extend the assertions as well:

```js
test(t => {
  t.flakey.is(foo, bar); // test will be retried up to one more time if a `flakey` assertion fails
  t.veryFlakey.is(foo, bar); // test will be retried up to two more times if a `veryFlakey` assertion fails
  t.is(foo, bar); // if a non-flakey assertion fails - no retries
});

test.flakey(t => {
  // any assertion failure causes a retry
});

test.veryFlakey(t => {
  // any assertion failure causes up to two retries
});
```

For the flakey assertions, I'm thinking of some of AVA's CLI tests on Windows, where we sometimes don't capture the full output. We could provide some more detailed summaries about flakey assertions as well: …
Seems reasonable, but I'm sure you'll get someone who asks for the ability to retry more sometime down the road.
I like that a lot. I don't think Mocha has the option of showing if a flakey test was retried.
Isn't a test that regularly fails three times in a row basically useless? If the repeated failure is because you are running on an overcrowded CI VM, or because of network issues, aren't you just as likely to fail four times in a row at that point?

Perhaps we could make the number of retries configurable based on context:

```json
"ava": {
  "retries": {
    "local": [0, 0], // no retries on a dev's local machine
    "ci": [1, 2] // flakey = 1 retry, veryFlakey = 2 retries on CI
  }
}
```

Overly complicated version:

```json
"ava": {
  "retries": {
    "local": [0, 0], // no retries on a dev's local machine
    "win32": [0, 1], // veryFlakey retries on Windows machines - even on dev machines
    "ci": [1, 2], // flakey = 1 retry, veryFlakey = 2 retries on CI
    "appveyor": [2, 3] // extra retries for AppVeyor
  }
}
```
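A hypothetical sketch of how these per-context tuples might be resolved. The key names follow the proposal above; the `platform`/`isCI`/`isAppVeyor` inputs are assumed for illustration and are not a real AVA API:

```js
// Hypothetical resolution of the proposed per-context retry tuples.
// Most specific context wins; each tuple is [flakeyRetries, veryFlakeyRetries].
function retriesFor(retries, {platform, isCI, isAppVeyor}) {
  if (isAppVeyor && retries.appveyor) {
    return retries.appveyor;
  }

  if (isCI && retries.ci) {
    return retries.ci;
  }

  if (platform === 'win32' && retries.win32) {
    return retries.win32;
  }

  return retries.local || [0, 0];
}

// e.g. retriesFor(config.retries, {platform: process.platform, isCI: true, isAppVeyor: false})
// => [1, 2]
```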
I completely agree; just saying, I imagine people would want it anyway.
Seems reasonable. Maybe even just something like this:

```json
"ava": {
  "flakey": 2,
  "veryFlakey": 3
}
```
For AVA, our "flakey" tests are basically only flakey on AppVeyor. They run reliably both locally and on Travis. My concern would be allowing flakiness to creep into our Travis tests without us knowing.
Would there be an easy way to set some …?
We could do it: …
Yeah, those seem a lot better.
I think that would be the order of precedence (CLI flag, environment, config).
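To make that precedence concrete, a minimal sketch. The flag, environment variable, and config names are placeholders, not an agreed-upon API:

```js
// CLI flag beats environment variable beats package.json config.
// All names below (--retries, AVA_RETRIES, config.retries) are hypothetical.
function resolveRetries({cliFlag, env, config}) {
  if (cliFlag !== undefined) {
    return Number(cliFlag); // e.g. ava --retries=2
  }

  if (env.AVA_RETRIES !== undefined) {
    return Number(env.AVA_RETRIES); // e.g. AVA_RETRIES=2 ava
  }

  if (config.retries !== undefined) {
    return config.retries; // "ava": {"retries": 2} in package.json
  }

  return 0; // default: no retries
}
```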
Are environment variables even worth it? Or should we just do config and CLI? My thought was that environment variables would allow you to reuse your …
If I tell my CI to run …

I think if anyone ever comes up with the need for env var configuration then it can be discussed, but until then, nah. There's no other use of env vars like this in this project (from what I can tell).
A fantastic point. But IMHO there really isn't a good reason not to make it configurable... As others have pointed out, there are certainly people out there who have setups which would truly merit the option to configure this. CLI flags / …

To get the default number of retries:

```js
t.flakey.is(foo, bar); // retry assertion twice

test.flakey(t => {
  // ...
}); // retry test twice
```

But then it could also be a function that accepts an argument, like:

```js
t.flakey(7).is(foo, bar); // retry assertion 7 times
```

As far as setting a custom retry count for a test goes, the equivalent syntax might be:

```js
test.flakey(7)(t => {
  // ...
}); // retry test 7 times
```

But I think the …:

```js
test.flakey(7, 'testTitle', t => {
  // ...
});
```

That however is a departure from the …
I don't think you would ever want to retry assertions, just tests. In my mind the …

As for configuring a specific number of retries with per-test granularity, I just don't see it. If we give you a global configuration for what …
A few more thoughts: …
Now that's definitely outside of the scope of AVA; something you'd want an external tool to manage. Such a thing could rerun failed tests when some user-defined conditions are met. That could actually be really useful for testing sessions on servers where, over time, the state gets more and more muddled and scenarios …, or something where the connection/resource access might be intermittent.

These were some questions I asked myself when reading this issue. I'm not sure what use they might be to the AVA team, but I've always found it's better to consider a broad range of possibilities before committing to anything. TBH, I can't think of many cases where I, personally, would be using …
Yes. If you are writing tests that pollute state and are dependent on each other, then we would need to rerun the whole test file anyway. We aim to encourage atomic tests.
Agreed.
@jamestalmage
And you're right, it is pretty ridiculous; I would suspect that I'm probably not going to use …

AVA is pretty damn opinionated, and that's what has made it so successful! I certainly don't begrudge the team this; in fact, I greatly admire such adamancy!! So if you'd rather keep the API minimalistic, I totally respect that. Just throwing out ideas here man. 💭

As I described in my second comment, such a thing would probably be better suited as its own tool. Now I'm imagining …

Ah good; as it should be! Atomic tests all the way man; better to have the test blow up 💥 than to write ill-conceived tests.
I am still proposing:

```js
test(t => {
  // If this assertion fails, the test fails with no retries. It's basically a fail-fast assertion.
  t.is(foo, bar);

  // If this fails, the entire **test** will be retried (not just the assertion).
  t.flakey.is(bar, baz);
});
```

If you do …
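Roughly, the runner semantics this implies. This is a hypothetical sketch; `runOnce` and the result shape are made up for illustration:

```js
// Hypothetical runner loop: a failing regular assertion fails the test
// immediately; if every failure came from a `flakey` assertion, the whole
// test is re-run, up to `maxRetries` extra attempts.
async function runWithRetries(implementation, maxRetries = 1) {
  for (let attempt = 0; ; attempt++) {
    const result = await runOnce(implementation); // hypothetical single run

    const onlyFlakeyFailures =
      result.failures.length > 0 &&
      result.failures.every(failure => failure.flakey);

    if (result.passed || !onlyFlakeyFailures || attempt === maxRetries) {
      return result; // pass, hard failure, or retry budget exhausted
    }
    // All failures were flakey and we still have retry budget: go again.
  }
}
```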
This sounds good. @jamestalmage would you mind summarizing the design decisions? Which questions still need to be answered? Also, 🚲 🏠 wise, is it `flakey` or `flaky`?
I gravitated towards …
Aliasing two correct spellings seems reasonable to me, but if there's a more commonly used spelling, that's the one that should be presented in the docs.
I too prefer …
We could fail with a helpful error message if …
I don't see the point. If it's going to be flaky, it's going to be flaky. Configuring the number of retries is fine, but distinguishing levels of flakiness seems overkill. Let's start with just …
Are you suggesting removing …? See #934 (comment).
I think we should go with …

I also don't see the point of …

I don't buy the argument for flaky assertions. IMHO, if you have flaky assertions mixed with other working assertions, you should just move them into a separate test. I would also consider having flaky tests an anti-pattern. It should only be temporary until you have time to make the test non-flaky.

Regarding a global …
We tend to not add things for imaginary use cases. Can't we just wait until some actual use cases come in? I think 2 retries is a very good default, and I can't see why anyone would actually need to change that. I much prefer the "keep it simple until someone complains" methodology.
Both.
You could write a flaky assertion as a way to sanity-check whether your test should be able to pass. E.g. a method that sometimes returns …
Yes, that is what I was thinking. It makes it easier to be explicit about what is allowed to be flaky. In that case though, you probably should write two or more tests: a normal test that tests only the non-flaky part, and a flaky test for the flaky part.
@Awk34 mentioned he has a …:

```js
test.retry(3)(t => {
  // ...
});
```

We could eliminate …

I think allowing configuration is valuable if you've got specific platforms where a test is flaky (looking at you, AppVeyor!!). I would probably set up AVA's own tests something like this:

```json
{
  "ava": {
    "flaky": 0
  },
  "scripts": {
    "test": "ava",
    "test-appveyor": "ava --flaky=1"
  }
}
```

This would enforce that we only allow tests to be flaky on AppVeyor. Still, there are easy workarounds for this as well:

```js
import test from 'ava';

// only allow flaky on AppVeyor
const flaky = process.env.APPVEYOR ? test.flaky : test;

flaky(t => {
  // ...
});
```
My point was that you needed to test the result of a flaky function. There wouldn't be any non-flaky part to test without first verifying that the function wasn't flaky to begin with. Taken to an extreme, you could argue you don't need …
More than 2 retries feels just too flaky. If you really need that, you should rethink your tests. I see …
IMHO a test is either non-flaky on all systems, or flaky. If a test is flaky on AppVeyor, it's flaky in nature and should be rethought, or be marked as flaky locally too.
@sindresorhus I think it's actually 2 tries for my E2E tests, and 3 tries with the Sequelize tests (not retries). Anyhow, the Sequelize tests for my project are too flakey. I don't really use SQL too often, so I've been lazy about looking into whether I can fix those XD. Also, as a side note, the reason this story came up is that I'm looking into using AVA for my Yeoman generator, another project you have a hand in. If you're interested: https://github.com/angular-fullstack/generator-angular-fullstack/tree/perf/gen-tests :D
I think we should: …
I'm not in favor of …
As for me, I'm opposed to …
@vdemedes I'm in favor of …
@novemberborn Yeah, but your reason fits my preference for …
That might not always be possible.
Of course.
It is unclear to me whether there is a consensus on how to implement this feature. If there is, I would like to take a stab at implementing it next week.
I just realized something. We could skip adding …:

```js
test('retry me', retry(async () => {
  // retry the test, if it throws
}));
```

I like this solution, because that way we avoid adding more stuff to the API and we let users handle their unique use cases. What do you think?
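For concreteness, one way such a user-land `retry` helper might look. This is a sketch that assumes failures surface as thrown errors or rejected promises (see the caveat about `t.fail()` in the next comment):

```js
// Hypothetical user-land helper, not an AVA API: re-run the implementation
// until it resolves, re-throwing the last error once attempts are exhausted.
const retry = (implementation, attempts = 2) => async t => {
  for (let i = 1; i <= attempts; i++) {
    try {
      return await implementation(t);
    } catch (error) {
      if (i === attempts) {
        throw error; // out of attempts: let the test fail for real
      }
    }
  }
};
```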
I like the idea @vadimdemedes. But I'm not sure that would always work out nicely. I'm thinking of:

```js
let failed = false;

test('retry me', retry(async t => {
  if (!failed) {
    failed = true;
    t.fail();
  }
}));
```

This should retry the test and it should pass the second time, but IIRC calling … That said, maybe the …
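One possible workaround, assuming a user-land `retry` wrapper like the one sketched above: use assertions that throw (e.g. Node's built-in `assert`) inside the retried body, so failures reach the wrapper as ordinary exceptions instead of going through `t`. `fetchFlakyValue()` is a made-up stand-in:

```js
import test from 'ava';
import assert from 'assert';

test('retry me', retry(async () => {
  // assert.strictEqual throws on failure, so the hypothetical retry()
  // wrapper can catch it and re-run; fetchFlakyValue() is made up here.
  assert.strictEqual(await fetchFlakyValue(), 42);
}));
```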
Decorating …
I am just a random person on the internet, but if I can throw in my two cents: I think a configurable number of retries would be great. The opinions against it have merit, but I also think there's not a clear definition of what "flaky" means.

A flaky test is (in theory) one that passes almost all of the time but occasionally doesn't, for reasons which are not obvious. If a failure turns out to be consistent or obvious for whatever reason, then the test isn't flaky; it's just a failing (or poorly written) test. Actual flaky tests are often symptoms of deep or intermittent issues. The "flake" is telling you something useful you should be inspecting. And, sometimes, looking into these tests means you may want, or need, to rerun them many, many times to try to recreate the circumstances which caused them to fail.

I have a real-world example that happened at work recently; I am using AVA to test an Electron app with Spectron (webdriverio). I found that a particular test would fail every now and then for reasons we couldn't explain. It turned out the test was catching a serious (for us) but rare issue our users had reported to us only once or twice before, and a bug in the application code was the culprit. But we needed to run that test hundreds of times and build a better histogram before we could make that connection, determine the magnitude of the problem, and isolate the cause. A convenient way to keep retrying would have been great.

I might go one step further and say it would actually be handy to retry ANY test a configurable number of times. What better way to ensure an actual flaky test has been fixed than to be able to easily run it lots of times, instead of hoping you catch it in CI again? Or test your tests a little more and maybe catch issues sooner. This is probably not practical, but I would use it sometimes.

I would also suggest that the term used is "retry" or something similar, and not "flaky". There is some baggage among test professionals that comes with the word "flaky", the least of which is that it's not immediately clear what it means. Just a thought.

We're really liking AVA at work, so thank you all for your continued hard work and dedication.
Hi @bbiagas, thanks for your comment. I think the scenario you ran into is probably best solved by reproducing the flaky test in a small script that you can run in a loop. What I'm advocating for here is a way to deal with tests that are flaky due to circumstances out of their control, and ideally not bugs. It's helpful if they're retried once or twice, because you can't always fix those circumstances, but not so often that you end up ignoring bugs. I suppose it's the difference between "ugh, CI failed again because so-and-so timed out, let's just restart it" and "ugh, this test fails a lot, I should look into that". It's a tough balance to strike though.
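For instance, a throwaway Node script along these lines; the test file name and run count are placeholders:

```js
// Rerun a single test file many times to estimate its failure rate.
const {execSync} = require('child_process');

const runs = 200;
let failures = 0;

for (let i = 0; i < runs; i++) {
  try {
    execSync('npx ava test/flaky-one.js', {stdio: 'ignore'});
  } catch (error) {
    failures++; // execSync throws when the test run exits non-zero
  }
}

console.log(`failed ${failures} of ${runs} runs`);
```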
Was there any progress on this issue? I'm trying to run Selenium tests with AVA, and the lack of retry support is painful (browser integration tests are the most unstable tests...). From the thread, it looks like everybody agrees that retry support would be good, but there are a lot of ideas about how it should be implemented so it covers every use case under the sun... In the spirit of "don't let perfect be the enemy of good", it would be useful to just have a bare-bones retry implementation similar to Mocha's. Or provide the opportunity for end users to implement their own custom retry logic, like TestNG provides a RetryAnalyzer interface to implement?
@eirslett right now, Intern is the best test framework for WebDriver tests. They have put a ton of effort into making them more reliable with Leadfoot and DigDug (both ship with Intern). I have sponsored some features over there, through SitePen, for a big project that needed extensive WebDriver tests. And I, too, wanted retry because we had flaky tests. Turns out wallpapering over them isn't such a good idea. 😃 If retries were to be added to AVA, I definitely think …
In short, I don't think …
Very interesting suggestions @sholladay! At this stage we're looking for somebody willing to champion this feature. It's not a priority for the core team.
#1692 contains a proposal for tests to try some assertions and, if necessary, discard the results. This is pretty close to the retry behavior talked about here. Intriguingly, the retry logic can be built on top of this proposal in a third-party module. I'm closing this issue in favor of that approach. If you want to help out, please drop by #1692. If you're writing such a third-party module, let us know and we can give feedback and promote it. Thanks all for the discussion in this thread. There are plenty of ideas for somebody to tackle.
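To illustrate, a sketch of what such a third-party helper could look like on top of the proposed `t.try()`, written against the API as proposed in #1692, so the details may well change. `fetchFlakyValue()` is a made-up stand-in:

```js
import test from 'ava';

// Hypothetical helper built on the proposed t.try(): attempt the assertions,
// discard failed attempts, and commit the first pass (or the final failure).
const retriable = (implementation, attempts = 3) => async t => {
  for (let i = 1; i <= attempts; i++) {
    const attempt = await t.try(implementation);

    if (attempt.passed || i === attempts) {
      attempt.commit(); // keep the passing (or final failing) result
      return;
    }

    attempt.discard(); // throw away a failed attempt and retry
  }
};

test('eventually passes', retriable(async tt => {
  tt.is(await fetchFlakyValue(), 42); // fetchFlakyValue() is made up here
}));
```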
I've searched around for a bit and couldn't find anything about retrying flakey tests with AVA.

I've been working on trying AVA for the generator-angular-fullstack Yeoman generator to see if it can speed up our tests. Right now in CircleCI our tests take about 15 minutes. This mainly includes running a few different configurations of the generator, and then some test commands after that on the scaffolded code, namely linting, client tests, server tests, and E2E tests. The E2E tests can be quite flakey. Right now we use Mocha, which has the ability to call `this.retries(2)` in a test to retry it x times if it fails (see the example below). Is there any plan to add something like this to AVA? If not, it would definitely be helpful to note some way of achieving this in the readme.
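For reference, this is how Mocha's retry facility reads in context (regular functions rather than arrows so that `this` is bound to the test; the titles are illustrative):

```js
describe('generator E2E', function () {
  it('scaffolds and builds the app', function () {
    this.retries(2); // on failure, rerun this test up to 2 more times

    // ... run the generator and assert on the output ...
  });
});
```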
Environment

- `ava --version`: 0.15.2
- `node -v`: v5.11.1
- `npm -v`: 3.8.6