Proposal: User Defined Tests for the Operator Scorecard #1049
Conversation
```go
// Modifications specifies a spec field to change in the CR with the expected results
type Modification struct {
	// a map of the spec fields to modify
	Spec map[string]interface{} `mapstructure:"spec"`
}
```
There are probably tests that involve changes outside of the spec
(e.g. changing annotations). Should we account for that type of thing?
Also, defining a modification this way has some drawbacks, but I think it works for most use cases.
A few examples of when it doesn't work well are for modifying slices and removing keys. When a modified value is a slice, the entire modified slice has to be included in the modification. Also, it isn't possible to remove a key, only to set it to `nil`. Removing a key might be necessary when there's a semantic difference between not present and `nil`.
For non-simple use cases, we may want an alternative/additional way to specify modifications. Perhaps the YAML equivalent of JSON patch. An implementation for YAML exists (https://github.com/krishicks/yaml-patch) if that's something we want to look into.
Somewhat tangentially, another benefit of JSON patch is that there's a `test` operation that may be useful for implementing the checks.
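For illustration, here is a minimal sketch of how a `test` op could back such a check, assuming the evanphx/json-patch Go library (an assumption on my part, not something the proposal specifies):

```go
package main

import (
	"fmt"

	jsonpatch "github.com/evanphx/json-patch"
)

func main() {
	// The observed resource, already marshalled to JSON by whatever client fetched it.
	doc := []byte(`{"spec":{"containers":[{"image":"other"},{"image":"my-image"}]}}`)

	// A "test" op makes Apply() fail when the value does not match,
	// which is exactly the pass/fail signal an expectation check needs.
	patch, err := jsonpatch.DecodePatch([]byte(
		`[{"op":"test","path":"/spec/containers/1/image","value":"my-image"}]`))
	if err != nil {
		panic(err)
	}
	if _, err := patch.Apply(doc); err != nil {
		fmt.Println("expectation failed:", err)
		return
	}
	fmt.Println("expectation passed")
}
```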
Using a JSON patch for the modification sounds like a good idea. It's a standard and is quite simple to understand.
Do you think it would make sense to add a JSON patch var to the modification struct and support both using `Spec` and JSON patches, or do you think we should only support one method for simplicity?
Also, do you think it would be a good idea to add JSON patch `test` operation support to the `expected_resource` and `expected_status` types? That could make it easier for some test cases, especially ones involving arrays, where the current proposal would need to do something like this to get an item in the second element of an array:
```yaml
spec:
  containers:
  -
  - image: my-image
```

vs

```yaml
json_patch: |
  {"op":"test","path":"/spec/containers/1/image","value":"my-image"}
```
+1 for supporting JSON patch fields to specify both modifications and the expected resources/status.
The more I think about the current map-style modifications and expectations, the more I think we should support only JSON patch for both modifications and expected resources/status.
If we leave the map-style declarations, we would need to define what method we use to merge and compare lists, maps, strings, numbers, etc, and I think users would end up spending a decent chunk of time troubleshooting why certain things are merging or being compared in certain ways.
If we support only JSON patch, I think the semantics are much more obvious and easier to reason about, both from our perspective as maintainers and from the perspective of someone writing a test. And if we're using YAML, I'd suggest allowing the JSON patch to be defined in YAML.
Here's what's in my head. Feel free to throw darts.

```yaml
userDefinedTests:
- name: Scaling MyApp
  description: Tests that MyApp properly scales from 1 to 3
  timeout: 1m
  setup:
    # In addition to `crPath`, maybe we could also allow `cr`
    # and have the full CR embedded in the test? We could
    # always add that later.
    crPath: ./path/to/my/cr.yaml
    # Would we want an expected map here
    # to wait for the initial CR to be set up?
    expected:
      cr:
        tests:
        - op: test
          path: /status/conditions[0]/type
          value: Ready
        - op: test
          path: /status/conditions[0]/status
          value: true
      resources:
      - group: apps
        version: v1
        kind: Deployment
        name: my-example-deployment
        tests:
        - op: test
          path: /spec/replicas
          value: 1
  crModifications:
  - op: replace
    path: spec/count
    value: 3
  expected:
    cr:
      tests:
      - op: test
        path: /status/conditions[0]/type
        value: Ready
      - op: test
        path: /status/conditions[0]/status
        value: true
    resources:
    - group: apps
      version: v1
      kind: Deployment
      name: my-example-deployment
      tests:
      - op: test
        path: /spec/replicas
        value: 3
```
I don't think the map-style declaration would cause many issues when it comes to how we compare things. There are only 3 non-map/non-array values that you can have in go-yaml/json: number (which can be int/int64/float64; we can just convert any of these types to float64 before comparison), string, and bool. When it comes to maps and arrays, we just walk down them, not really any different than a JSONPatch would. The main benefit of JSONPatch is that we can more easily define array indices and we can also delete values.
The main benefit of keeping a map-style declaration is that we can have scorecard functions like the array length checker (another potential example would be a regex matcher). We can't implement that with JSONPatch. We also get to keep the same structure as the actual Kubernetes resources, which may be a bit clearer for users.
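For what it's worth, a rough sketch of that comparison (a hypothetical helper, assuming values have already been decoded into `map[string]interface{}`, `[]interface{}`, and scalars):

```go
package scorecard

// matches reports whether every field in expected is present in actual with
// an equal value. Numbers are normalized to float64 before comparison, since
// YAML/JSON decoding can yield int, int64, or float64 for the same document.
func matches(expected, actual interface{}) bool {
	switch exp := expected.(type) {
	case map[string]interface{}:
		act, ok := actual.(map[string]interface{})
		if !ok {
			return false
		}
		for k, v := range exp {
			if !matches(v, act[k]) {
				return false
			}
		}
		return true
	case []interface{}:
		act, ok := actual.([]interface{})
		if !ok || len(act) < len(exp) {
			return false
		}
		for i, v := range exp {
			if !matches(v, act[i]) {
				return false
			}
		}
		return true
	case int, int32, int64, float32, float64:
		e, eok := toFloat(exp)
		a, aok := toFloat(actual)
		return eok && aok && e == a
	default:
		// strings, bools, nil
		return expected == actual
	}
}

func toFloat(v interface{}) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case int32:
		return float64(n), true
	case int64:
		return float64(n), true
	case float32:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}
```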
After some discussion, we have decided that the user defined scorecard tests will be run as plugins instead of having a built-in testing system based on the YAML definitions. The new User-Defined Tests for Scorecard will instead allow users to run their own scripts or binaries that take a standard input (or environment variables; this is an implementation detail that I will investigate more later) and produce a standard JSON output that can be parsed by the scorecard. This will allow users to run tests written with the Go test framework, Ansible Molecule, or any other framework that they want, and it will simplify the actual implementation in the scorecard while allowing more flexibility for users.

The YAML based testing described in this proposal will still be implemented, but it will be available as a plugin that the scorecard can use. It will be a simple testing method and a good starting point for operator writers before they start to design more complex end-to-end tests. Scorecard integration support will also be added to the existing Go test framework. I will be updating the proposal later today (possibly tomorrow) to reflect this new design decision.
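For a rough picture of what such a plugin executable could look like, here is a sketch (everything here — names, env vars, JSON fields — is illustrative, not part of the proposal yet):

```go
package main

import (
	"encoding/json"
	"os"
)

// testResult mirrors the kind of per-test output discussed below; the exact
// JSON schema is still an open question in this proposal.
type testResult struct {
	Name         string   `json:"name"`
	State        string   `json:"state"` // e.g. pass, partial_pass, fail, error
	EarnedPoints int      `json:"earned_points"`
	MaxPoints    int      `json:"max_points"`
	Errors       []string `json:"errors,omitempty"`
}

func main() {
	// Hypothetical: configuration arrives via environment variables set by
	// the scorecard, e.g. which kubeconfig to run against.
	_ = os.Getenv("KUBECONFIG")

	// ... run whatever checks this plugin implements ...
	results := []testResult{
		{Name: "my_custom_test", State: "pass", EarnedPoints: 1, MaxPoints: 1},
	}

	// The scorecard reads this JSON from stdout.
	_ = json.NewEncoder(os.Stdout).Encode(results)
}
```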
I like the idea of the plugin system, but I'm curious why we can't just have the plugin system alone and let the user omit the script. Just an idea; sorry if this question was already answered, I wasn't in the meeting. Thanks!
Couple of questions, but overall sgtm 👍
> In order to increase the flexibility of the user defined tests and allow users to implement more complex E2E style tests for scorecard, the user-defined tests will be implemented via a plugin system. Users will specify the path of the script they wish to run as well as environment variables to set for the command. The command would then print out the result as JSON to stdout. Here is an example:
> specify the path of the script they wish to run as well

Would this be done via a CLI flag? Or what would that look like, if I want to run this via the plugin system?
This would be specified through the scorecard's config file. We would just be adding another section to the config file (as specified below) that we can read with viper.
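For example, reading that section could look roughly like this sketch (the field names are illustrative, mirroring the config example quoted below):

```go
package scorecard

import "github.com/spf13/viper"

// userDefinedTest mirrors one entry of the user_defined_tests section shown
// below; the field names are illustrative, not final.
type userDefinedTest struct {
	Path string            `mapstructure:"path"`
	Env  map[string]string `mapstructure:"env"`
}

// loadUserDefinedTests reads just the new config section with viper, leaving
// the rest of the scorecard config untouched.
func loadUserDefinedTests(configFile string) ([]userDefinedTest, error) {
	viper.SetConfigFile(configFile)
	if err := viper.ReadInConfig(); err != nil {
		return nil, err
	}
	var tests []userDefinedTest
	if err := viper.UnmarshalKey("user_defined_tests", &tests); err != nil {
		return nil, err
	}
	return tests, nil
}
```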
> ### Basic YAML Defined Test
>
> A new basic testing system would be added where a user can simply define various aspects of a test. For example, this definition runs a similar test to the memcached-operator scale test from the SDK's e2e test:
The yaml file below would need to be in a predefined directory or passed to the tests?
The config path would be set via an environment variable (see the user_defined_tests config example below)
```yaml
user_defined_tests:
- path: "scorecard/simple-scorecard.sh"
```
What would be an example of the script?
Assuming we have a command called `simple-scorecard`, we could do:

```sh
#!/bin/bash
simple-scorecard --config $CONFIG_FILE
```

For the second example I have here, which would be a modified version of our test framework's tests to support the JSON output, we could do something like this:

```sh
#!/bin/bash
operator-sdk test local $TEST_DIR --enable-scorecard $ENABLE_SCORECARD --namespaced-manifest $NAMESPACED_MANIFEST --go-test-flags $GO_TEST_FLAGS
```

The scripts are intended to be just simple wrappers, but they could do more complex things based on the environment variables if we want them to.
Thanks!
> Below is an example of what the JSON output of a test would look like:
I would like this to cover a couple more things:
- Hard failures. Did the test fail in a way that nothing else should be considered?
- I assume that one plugin could have more than 1 test. I would like to see a summary struct and a list of test results for the individual runs. I think this will be more flexible.

I would also like this to be the output of the scorecard test. I prefer computer-readable structured data to the current output. Maybe have a `-h` option that prints out a human-readable format, but having the default be computer readable makes more sense to me.
For 1: What would be a good way to handle that? Add another field to the JSON output? Or maybe check the return code (if the command returns 0, use the JSON output; if it returns non-zero, assume a hard fail)?
For 2: In this example JSON I'm showing 2 tests from a single plugin: "actions_reflected_in_status" and "my_custom_tests".
I think someone was asking if we could have JSON output for the scorecard itself. I thought I made an issue for that, but it looks like I didn't. I'll create a new issue to track that. It'll probably be pretty much the same output as the example plugin output I have here (the plugin output example is just a JSONified array of TestResult objects)
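To sketch the return-code idea concretely (placeholder types; the real JSON shape is still being decided):

```go
package scorecard

import (
	"encoding/json"
	"os/exec"
)

// pluginTestResult is a minimal stand-in for the per-test JSON shown above.
type pluginTestResult struct {
	Name  string `json:"name"`
	State string `json:"state"`
}

// runPlugin executes one user-defined test script. A non-zero exit code, or
// output that isn't the JSON we expect, is treated as a hard failure.
func runPlugin(path string) ([]pluginTestResult, error) {
	out, err := exec.Command(path).Output()
	if err != nil {
		// Non-zero exit: assume hard fail and disregard any partial output.
		return nil, err
	}
	var results []pluginTestResult
	if err := json.Unmarshal(out, &results); err != nil {
		return nil, err
	}
	return results, nil
}
```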
I would consider a different structure. I think we want something that gives someone reading this output an easy path to see if everything is good, but that also lets them dive deep into the results if something goes wrong.
So to recap for the updated proposal: to indicate a hard failure (e.g. an error associated with `t.Fatal()`), the test result would have `state: error` and some error message appended to `errors`. And if it's just regular errors, like with `t.Error()`, the output of a test result would be `state: failed` or `state: partial_pass` and error messages appended to `errors`.
@shawn-hurley Do you think that's clear enough to distinguish between a fatal/hard failure vs regular errors?
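In code, the mapping could look roughly like this (a sketch; `fatal` stands in for however the plugin reports a `t.Fatal()`-style failure, and the state names follow the proposal text):

```go
package scorecard

// stateFor maps a test result onto the proposed states.
func stateFor(earnedPoints, maxPoints int, fatal bool) string {
	switch {
	case fatal:
		return "error" // hard failure: disregard the score entirely
	case earnedPoints == maxPoints:
		return "pass"
	case earnedPoints == 0:
		return "fail"
	default:
		return "partial_pass" // 0 < earned < max
	}
}
```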
Yeah, that makes sense for the fatal/hard failure.
I was also thinking that you would want to give the denominator, like the number of tests run or possible points. I would also suggest that you note that `total_score` is a percentage; that is not super clear.
@brianwcook can you have your team take a look at this format to make sure that it works for your team?
@shawn-hurley Updated proposal with your suggestions.
I think we need to also account for the possibility that the plugin executable exits with a non-zero return code and/or with output that is not what we expect. And describe how operator-sdk handles that situation.
> In order to increase the flexibility of the user defined tests and allow users to implement more complex E2E style tests for scorecard, the user-defined tests will be implemented via a plugin system. Users will specify the path of the script they wish to run as well as environment variables to set for the command. The command would then print out the result as JSON to stdout. Here is an example:
Is this suggesting that you pass a YAML file into the command and it runs things that way?
I would like to see maybe a directory that contains the runnables (binaries/scripts) that it just runs. This will 1. make creating the image version of this easier, and 2. allow us to create system-wide tests and project-specific tests like the yaml file, but using convention rather than another yaml file.
Thoughts?
The way I was thinking this would work is just adding a new section to the scorecard's config file that we can read with viper. That section, an example of which I put below, would specify where the scripts are and what environment variables should be set for configuration purposes. Can you give an example of what you would like to change?
+1 on @shawn-hurley's comment.
I would propose that we have the scorecard command just run every executable in `<projectRoot>/test/scorecard/bin` (or something similar).
And since we support shell scripts, I don't see the need for a separate declaration of environment variables, since those could be set up within the executable itself.
On a related note, are there environment variables that would be helpful to set for ALL scorecard tests?
Lastly, do we have opinions on how users should handle the plugin executables? Check them into git? Use a Makefile that can download or build them? How will we distribute our plugins (e.g. `simple-scorecard`)?
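A rough sketch of that convention-based approach (the directory name is just the one suggested above, not decided):

```go
package scorecard

import (
	"io/ioutil"
	"os/exec"
	"path/filepath"
)

// runScorecardPlugins runs every executable file at the top level of dir
// (e.g. <projectRoot>/test/scorecard/bin) and collects their stdout.
func runScorecardPlugins(dir string) (map[string][]byte, error) {
	entries, err := ioutil.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	outputs := map[string][]byte{}
	for _, entry := range entries {
		// Skip subdirectories and files without an executable bit.
		if entry.IsDir() || entry.Mode()&0111 == 0 {
			continue
		}
		out, err := exec.Command(filepath.Join(dir, entry.Name())).Output()
		if err != nil {
			return nil, err
		}
		outputs[entry.Name()] = out
	}
	return outputs, nil
}
```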
Instead of adding a new section to the config file, I could just add one new flag/config option pointing to the directory where the scorecard scripts are and treat each top-level file ending in `.sh` as a test. And I guess you're right about not needing the environment variables. One env var that may make sense to set for all scripts would be `$KUBECONFIG`, to make sure all scripts run against the correct cluster.
When it comes to how users should handle the executables, it depends on the user. Probably most ideal would be the script downloading the executable, similar to how we download `dep` during CI. A user could also include it in their repo if they feel that would be better (we do this for the `marker` executable, since it is quite tiny but would take a long time to compile during CI). As long as the script can run on a generic Linux system using bash, it should be fine.
Why can't you just run all the top-level files that are executable?
I would also suggest that we let folks build these plugins and then decide how we want to manage them. I think that will help us determine whether the plugins folks are writing are generic or super specific, whether they are tied to a particular project or apply to all operators. We can start to see patterns that might help us make better decisions here. I don't think we should wait super long, but we should get feedback before making this decision IMO.
@shawn-hurley Yeah, I'll update this to run top level files that are executable instead of just scripts.
Overall SGTM
Only nit being that we should support using JSON patch in our basic test plugin to modify the spec and check resource/status fields.
Overall sounds pretty good to me. A few more questions, comments, and suggestions.
> to parse and integrate with the other tests. The above JSON design is based on the `TestResult` type with the addition of a `state` type, which can be `pass` (earned == max), `partial_pass` (0 < earned < max), `fail` (earned == 0), or `error` (fatal error; disregard output score). We also print the number of tests in each state as well as the total score to make it easier for users to quickly see what states the tests are in.
Are there any good examples of other projects with a similar plugin system, specifically ones that expect the plugin to return data in a specific format? I ask because I wonder if we should version the interface so that we can make changes to it in the future without breaking old plugins.
Maybe a totally off-the-wall idea, but would it make sense to re-use the Kubernetes API for this? Similar to how the admission webhook request/response API works? (btw, not suggesting we change to an HTTP interface or anything, just marshal things as k8s-compliant API objects)
Maybe something like:

```go
type ScorecardTest struct {
	metav1.TypeMeta `json:",inline"`

	// Spec describes the attributes for the test
	Spec *ScorecardTestSpec `json:"spec"`

	// Results describes the results of running the test.
	// +optional
	Results *ScorecardTestResults `json:"results,omitempty"`
}
```

And maybe even crazier, perhaps we could write an operator that could execute these tests in-cluster. Food for thought 🙂
Hmm. I like the idea because it allows for versioning, and this will most likely go through a couple of changes over time as we use this more and figure out what else we may need from the output. And reusing some of the other features of the API could be quite useful here.
Running the scorecard tests in-cluster shouldn't be too difficult either, assuming we have a lot of permissions and can actually create all the resources we need. So maybe that could be a runtime mode in the future (kind of like how we have `test local` and `test cluster`).
UPDATE: This PR will be split into 2 PRs, as it has essentially become 2 separate proposals (a plugin system for user defined tests and an actual plugin for the new system).
This section is now a separate PR
> This JSON output would make it simple for others to create scorecard plugins while keeping it simple for the scorecard to parse and integrate with the other tests. Each plugin would be considered a separate suite, and the full result of the scorecard would be a list of `ScorecardResult`s.
Should the plugin output be the full `ScorecardTest` object so that the `TypeMeta` fields are included?
Yes, the plugins should output a full ScorecardTest JSON. The scorecard can then make an array of the results from each plugin for the ScorecardResults section of the main scorecard output.
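Roughly like this sketch (simplified types; the real object would embed `metav1.TypeMeta` as discussed above):

```go
package scorecard

import "encoding/json"

// scorecardTest is a simplified stand-in for the full ScorecardTest object a
// plugin would print; only the fields needed here are shown.
type scorecardTest struct {
	Kind       string          `json:"kind"`
	APIVersion string          `json:"apiVersion"`
	Results    json.RawMessage `json:"results,omitempty"`
}

// collectPluginResults parses each plugin's stdout as a ScorecardTest and
// gathers the results for the scorecard's combined output. The kind and
// apiVersion fields give the scorecard a place to version the plugin
// interface and reject outputs it does not understand.
func collectPluginResults(outputs [][]byte) ([]json.RawMessage, error) {
	var all []json.RawMessage
	for _, out := range outputs {
		var t scorecardTest
		if err := json.Unmarshal(out, &t); err != nil {
			return nil, err
		}
		all = append(all, t.Results)
	}
	return all, nil
}
```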
LGTM after addressing one last question.
Description of the change: Make proposal for user defined scorecard tests
Motivation for the change: Allow for more useful functional tests in the scorecard