Fail when new snapshots are created in CI #1585
It seems like a reasonable requirement to prevent false positives. I think an error message for this would need to describe the reasoning and fix, something along the lines of:
How about:
(This would be shown as the message for the failing snapshot assertion.)
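Both proposed messages were truncated here; a hypothetical example of the kind of wording being discussed (illustrative only, not what AVA actually shipped):

```
Could not load the snapshot for this test. New snapshots are not created
in CI environments, since that could mask unintended changes. Run the
tests locally, inspect the new snapshot, and commit it.
```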
👍
Thanks all for the excellent contributions to date, much appreciated!
@AJamesPhillips would you be willing to help out with this?
Hi @novemberborn, I had a quick look at the code. I would prefer to add a flag which, if set, fails the tests whenever a snapshot cannot be located, as opposed to failing only in a CI environment when a snapshot cannot be located.
If you run tests in a Docker container, how do you update the snapshots? The idea is that during development you'd notice the new snapshot files (since you need to commit them). I think if we implement the CI flag then you could set `CI=1` in your container.
Yes, apologies that wasn't clear, @novemberborn. Locally we normally run and update our tests outside of Docker. On our Jenkins CI server we build a Docker image and run the AVA tests inside the container. In this case I was running the CI build and tests locally to check it was working correctly. It wasn't, because the build step did not copy the snapshots to the right file name and location for AVA to see them, so AVA then generated new snapshots, which would have masked a subsequent real error.
That would work, though setting CI=1 when running a container locally is strange. Not saying I object, just saying I think it wouldn't be obvious to others. In our case it would be reasonable, as it would sit in the script file we call on the CI server to set up, build and run the tests in the container. So setting CI=1 inside such a script would work for us.
Arguably you'd want to forward environment variables from Jenkins to your container, so I think it's reasonable that you'd need to do the same when testing your container setup outside of Jenkins. I admit it's not immediately obvious, but that also goes for having a configuration option.
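For what it's worth, forwarding the variable could be as simple as the following sketch (the image name is hypothetical; `docker run -e CI` with no value passes the host's `CI` value through):

```sh
# Forward the host's CI variable (set by Jenkins) into the container,
# so AVA inside the container knows it is running in CI.
docker run -e CI my-test-image npx ava
```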
I would love this as a flag as well, and IMO it should be enabled by default in all environments -- not just CI. I've never liked the behavior of auto-creating new snapshots without an explicit opt-in. Pragmatically, it complicates scenarios like running a test suite as a commit/push/prepublish hook. I've often successfully pushed commits (relying on my git hook to fail if the tests do anything other than succeed) only to realize that the test suite passed but dirtied the repo, so my (incorrect) commit shouldn't have gone through.
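A minimal sketch of a workaround hook that would catch this today, assuming a POSIX-shell pre-push hook (the dirty-tree check is a stand-in for the requested feature, not an AVA option):

```sh
#!/bin/sh
# Run the test suite; abort the push on failure.
npx ava || exit 1
# Abort if the test run modified the working tree, e.g. by
# silently writing new snapshot files.
if [ -n "$(git status --porcelain)" ]; then
  echo "Tests passed but dirtied the repo (new snapshots?)" >&2
  exit 1
fi
```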
Oh, that's a good point; I wonder if we could detect that, too? I hear what you're saying @billyjanitsch, though I'm hesitant about having a flag to disable this behavior. Instead, what if we'd write snapshot updates to a temporary location and exit with an instructive failure and a non-zero exit code?
+1 -- this is all I really want (although the rest of your idea sounds great too 😄). I am a bit limited on OSS time at the moment, but when I free up I can take a look!
I've opened #2099 to track this.
I needed a workaround for this feature. The good news: there is one, the `AVA_FORCE_CI` environment variable.
The rest of this note documents my use case.

The use case

I'm building a code generator, which generates some boilerplate that is then managed by my users. The generated boilerplate seeks to encourage good practices, and generates test cases. These test cases rely on snapshots. I'd like to ensure that the code-generation workflow is tested, especially cross-platform. The generator implements a simple (if time-consuming) multi-step test case.
The problem

That last step fails on the CI server. To be clear, step 5 runs the test suite of the generated code.

The ideal solution

I'd like an environment variable to control this behavior, not a flag. That way the generated boilerplate itself does not need to change. AVA_FORCE_CI fits the bill perfectly!
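For concreteness, the workaround amounts to the following (the value `not_ci` is as quoted later in this thread; note the maintainer's caveat below that this variable is private):

```sh
# Make AVA believe it is not running in CI, so new snapshots can be created.
AVA_FORCE_CI=not_ci npx ava
```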
Could you elaborate on why you need `AVA_FORCE_CI`?
That's a private variable. Don't use it.
I like to write my functions like so:
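The original snippet did not survive extraction; a minimal sketch of the shape being described, with hypothetical names and a hypothetical external SDK:

```js
// Hypothetical example: a thin wrapper over an external API that
// returns a large, structured response.
const {fetchDeployments} = require('some-external-sdk'); // hypothetical dependency

async function listStaleDeployments(region) {
  const deployments = await fetchDeployments(region);
  return deployments.filter(deployment => deployment.status === 'stale');
}

module.exports = {listStaleDeployments};
```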
A very convenient way to test these is with a snapshot:
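Again a hypothetical sketch, continuing the example above:

```js
const test = require('ava');
const {listStaleDeployments} = require('./list-stale-deployments'); // hypothetical module

test('stale deployments in eu-west-1', async t => {
  // The response is hundreds of lines and not under our control;
  // snapshot it wholesale instead of hand-writing matchers.
  t.snapshot(await listStaleDeployments('eu-west-1'));
});
```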
While this looks like the functions are "pure" (meaning no side effects), in reality the functions are interfacing with external APIs with 100s of lines of output (and state, but that's not important here). It's very tedious to build exact matchers for such output. Moreover, I don't control the external APIs and cannot make presumptions about their change-management process. Snapshots guard me against any changes - well managed or not. When a test case fails, I only need to reason about whether the change impacts my code or not. If not, I just update the snapshots and move on. The whole setup is a productivity booster - not to mention a worry minimizer. Much of my use of AVA is built around this pattern.
I realize. This note was meant to provide a good use case, and a nudge for why it deserves better treatment. I can't claim my code generator is a meta-program, however it is still meta in the sense that it generates code. Given the investment I've made in building my testing strategy around snapshots, this matters a great deal to me. What is curious is that the release notes mention this being introduced in v2.0.0. I was however running on GitHub Actions with v3.0.0-beta2. Perhaps it was the detection of GitHub Actions as a CI environment that changed between those versions. Please do consider more formal treatment of AVA_FORCE_CI, or provide some equivalent functionality.
For snapshot testing to work, you need to compare against a known-good snapshot. That should involve a human looking at the generated snapshot report, ideally during code review. If you generate a snapshot during CI then that does not happen, and you won't detect unintended changes. It's not clear to me yet why you need to create new snapshots in CI. Where is your known-good snapshot coming from?
Yes, I reckon so.
Perhaps my speaking in abstract terms is causing the point to get muddled. My program is very similar to create-react-app or Yeoman-like programs: it's a program that generates a program. To set the stage, there are three "programs" that we'll refer to:
- the meta-program: the generator itself
- the template-program: the boilerplate the generator customizes
- the generated-program: the customized output
That is to say: the meta-program takes user input and customizes the template-program to yield the generated-program.

The template-program
The generated-program
The meta-program
Step 2 prevents the snapshots from being "generated". In the CI environment step 4 fails, since it lacks previous snapshots. This is the problem I am trying to solve. It sounds complicated, but is simpler than it looks. Importantly, once it works, it is extremely empowering for making changes all around. Hopefully this explains it sufficiently. Please let me know if it's still not clear. I completely understand the underlying intent of preventing snapshots from being built in the CI environment. While not a vanilla use of AVA, I hope it is a legitimate one.
So the generated program ends up generating snapshots? How do you ensure that they're "good"?
I am very interested in exposing low-level building blocks so you can build more powerful testing solutions. Just need to figure out what the right blocks are 😄
Exactly! The meta-program test generates it on the CI system, verifies that it still compiles and executes, and then discards it.
The template-program is responsible for ensuring that the snapshots are sufficient. The customizations do not involve complex logic - just generating the right artifacts, with the right names, wired up together. The compilation verifies that the renaming was correct. Most of the generated code is no-op scaffolding.
👍
Okay, I'm still confused about what code generates the snapshots. It sounds like maybe a "write snapshot without asserting" concept would do the trick, but I also wonder if you can't emit something you can use with `t.deepEqual()` instead.
This is the simplest I can boil down the code to. Hopefully it'll help:

template-program.js
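The snippet itself was lost; a minimal stand-in for what it plausibly contained, with hypothetical names:

```js
// template-program.js -- the template ships with a snapshot-based test.
const test = require('ava');
const {render} = require('./lib'); // hypothetical module under test

test('render output', t => {
  // The snapshot is validated by a human while the template is developed.
  t.snapshot(render({name: '__NAME__'}));
});
```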
meta-program.js
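Likewise a hypothetical stand-in, showing how AVA_FORCE_CI enters the picture:

```js
// meta-program.js -- generate into a temp dir, run the generated tests
// there, then discard the result.
const {execFileSync} = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'generated-'));
// Hypothetical: copy the template and rename placeholders.
fs.cpSync('template', dir, {recursive: true});

// Run the generated suite; AVA_FORCE_CI lets it create fresh snapshots
// even on a CI server (a private variable -- see the discussion below).
execFileSync('npx', ['ava'], {
  cwd: dir,
  env: {...process.env, AVA_FORCE_CI: 'not_ci'},
  stdio: 'inherit'
});

fs.rmSync(dir, {recursive: true, force: true});
```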
generated-program.js
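And a stand-in for the generated side; the test mirrors the template's, with names renamed and no snapshot files copied across:

```js
// generated-program.js -- mostly no-op scaffolding; its test starts out
// with no committed snapshots.
const test = require('ava');
const {render} = require('./lib'); // renamed during generation

test('render output', t => {
  t.snapshot(render({name: 'user-chosen-name'}));
});
```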
You are essentially asking for a defense of my testing strategy and the use of snapshots! I'm game, but will have to be careful - it can slip out of scope very quickly. While the simple boilerplate above only hints at it, the "running in production" scenario involves large, deeply nested responses from third-party APIs. Moreover, I don't control these structures. So what I really need to know is whether that output has changed at all.
Recreating an expected object or cherry-picking things to assert is a nightmare. As a general rule, since discovering snapshots, I try (very hard!) to avoid hand-writing deep-equality assertions. This testing strategy for third-party interfacing uses a few other tricks I am glossing over, but those are not pertinent here. It probably deserves a blog post, but I need to ship first... I have noticed in other threads that you don't see snapshots as the crown jewel of testing.
Also, the template-program was trivialized for this presentation, but in reality it's at the level of complexity of a Rails/Django model+controller. It has enough substance to warrant this approach.
I'm trying to understand your use case, to see how AVA can help not just you, but others with use cases like yours.
How do you determine this if the snapshot is created by the generated program in CI?
You seem to like AVA's snapshots. Who do you think spent months on that implementation? 😉
When building the template-program, I am manually validating that the snapshot is correct. The generator of the scaffolding (the meta-program) does not generate logic - it just renames files and identifiers. The logic of the generated program is untouched, so it can be presumed correct. Which leads to the important corollary: it's presumed. If the tests run, the snapshots are guaranteed correct. The question then is: do I need to run the tests?
The last one is a real corner case and surprised me, but I ran into it, of all things, during generation. Basically, that empty run of the test cycle has proven its value to me - until it fails, I can live worry-free. Initially, some corner cases slid past this strategy, but a little run time and it catches most if not all of them. The real value of the test cases is to provide the user of the generated code with a convention for how test cases need to be written. But that is completely outside the scope of this discussion. Reading through, I missed addressing this comment of yours:
If I am understanding what you mean correctly, actually not. I do not want to fiddle with the generated code to execute it or its tests when they run in the context of the meta-program's testing. The environment variable is the ideal solution. The constraint you have in place is exactly right for all programs that do something, which is most programs. The only case I think where it does not apply is when a program generates another program. Typically, programs that are generated by programs (at least circa 2020) don't do very much. Even in the example presented, the scaffolding does not generate the third-party API calls. The human receiving the generated-program adapts it to do so. However, the meta-program still needs an assurance that it generated a program that will compile and run. That only requires an execution of the tests - at least it's the simplest way. For all this discussion, it's the ideal feature request - don't change a thing! Especially do not change the coupling between AVA_FORCE_CI="not_ci" and its effect of allowing snapshots to be created in CI. At best, document it for clarity.
touché! Thank you for that effort. It's so elegant and tamper-proof for normal usage that I can't imagine developing code without it anymore. I would still love the debate though - this is not the first time I've seen you advocate for alternatives to snapshots.
Right. So when you run the test suite for the template-program, you check snapshots. And then you run the test suite for the generated-program and you're basically making sure there are no exceptions. I think the best approach for your use case would be to wrap the `t.snapshot()` assertion:

```js
// `testingTemplateProgram` would be derived however you distinguish the
// two runs, e.g. from an environment variable you control yourself.
const snapshotOrPass = (t, value) => {
  if (testingTemplateProgram) {
    t.snapshot(value)
  } else {
    t.pass()
  }
}
```

And then use `snapshotOrPass()` in the tests. It's a little awkward (at least until you can install custom assertions on `t`), but it avoids relying on a private variable. Heck, you could rewrite the test files in the generation process to replace the snapshot assertions altogether.
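A sketch of how the wrapper might be used, assuming `snapshotOrPass` from the snippet above is in scope and `render` is the hypothetical function under test:

```js
const test = require('ava');

test('render output', t => {
  // Asserts the snapshot when testing the template-program;
  // merely passes when run from the generated-program.
  snapshotOrPass(t, render({name: '__NAME__'}));
});
```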
Currently, using AVA_FORCE_CI works just fine without any more code, and is really elegant. The solution being proposed has none of these attributes, not to mention requiring a boatload of extra code in every generated test. Not sure I understand your strong objection to treating AVA_FORCE_CI as part of the public interface. It's clear you use it for more than treating snapshots differently on the CI system. FWIW, I did find a similar problem with husky, and the solution there too is to provide an environment variable. As we have discussed previously, that issue is also rooted in a generator program being tested on CI servers!
It exists so that AVA's own test suite can run reliably in Node.js' CITGM environment. It may change so that its internal use is more clear. Nothing's stopping you from using it, of course. I think we've exhausted this discussion.
Thank you for indulging me. I'll keep a lookout for release notes in the future, but if you do decide to change its behavior, I'd appreciate a heads-up so I can plan for the change.
When running tests in CI environments, if new snapshots are created it implies the tests are behaving differently than during development. This can hide bugs. The snapshot assertions should fail, rather than silently creating the snapshot and passing.
@avajs/core and others, what do you think?