
Proposal: User Defined Tests for the Operator Scorecard #1049


Merged

Conversation

AlexNPavel (Contributor)

Description of the change: Make proposal for user defined scorecard tests

Motivation for the change: Allow for more useful functional tests in the scorecard

@AlexNPavel AlexNPavel added kind/feature Categorizes issue or PR as related to a new feature. scorecard Issue relates to the scorecard subcomponent labels Feb 1, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 1, 2019
@AlexNPavel AlexNPavel changed the title doc/proposals/scorecard-user-tests.md: make proposal Proposal: User Defined Tests for the Operator Scorecard Feb 1, 2019
// Modifications specifies a spec field to change in the CR with the expected results
type Modification struct {
// a map of the spec fields to modify
Spec map[string]interface{} `mapstructure:"spec"`
@joelanford (Member), Feb 12, 2019

There are probably tests that involve changes outside of the spec (e.g. changing annotations). Should we account for that type of thing?

Also, defining a modification this way has some drawbacks, but I think it works for most use cases.

A few examples of when it doesn't work well are for modifying slices and removing keys. When a modified value is a slice, the entire modified slice has to be included in the modification. Also, it isn't possible to remove a key, only set it to nil. Removing a key might be necessary when there's a semantic difference between not present and nil.

For non-simple use cases, we may want an alternative/additional way to specify modifications. Perhaps the YAML-equivalent of JSON patch. An implementation for YAML exists (https://github.com/krishicks/yaml-patch) if that's something we want to look into.

Somewhat tangentially, another benefit of JSON patch is that there's a test operation that may be useful for implementing the checks.
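
As a rough sketch of what a YAML-equivalent JSON patch could look like here (the field names and paths are hypothetical, not from the proposal), covering the annotation, slice, and removal cases mentioned above:

```yaml
# A modification expressed as a JSON patch written in YAML (the form yaml-patch accepts).
modifications:
# remove a key entirely, which map-style merging cannot express
- op: remove
  path: /metadata/annotations/legacy-flag
# replace a single slice element without restating the whole slice
- op: replace
  path: /spec/containers/1/image
  value: my-image
# the "test" op could double as an expectation check
- op: test
  path: /status/readyReplicas
  value: 3
```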

Contributor Author

Using a JSON patch for the modification sounds like a good idea. It's a standard and is quite simple to understand.

Do you think it would make sense to add a JSON patch var to the modification struct and support both using Spec and JSON patches, or do you think we should only support 1 method for simplicity?

Also, do you think it would be a good idea to add JSON patch test operation support to the expected_resource and expected_status types? That could make it easier for some test cases, especially ones involving arrays, where the current proposal would need to do something like this to get an item in the second element of an array:

spec:
  containers:
    -
    - image: my-image

vs

json_patch: |
  {"op":"test","path":"/spec/containers/1/image","value":"my-image"}

Contributor

+1 for supporting JSON patch fields to specify both modifications and the expected resources/status.

Member

The more I think about the current map-style modifications and expectations, the more I think we should support only JSON patch for both modifications and expected resources/status.

If we leave the map-style declarations, we would need to define what method we use to merge and compare lists, maps, strings, numbers, etc, and I think users would end up spending a decent chunk of time troubleshooting why certain things are merging or being compared in certain ways.

If we support only JSON patch, I think the semantics are much more obvious and easier to reason about, both from our perspective as maintainers and from the perspective of someone writing a test. And if we're using YAML, I'd suggest allowing the JSON patch to be defined in YAML.

Here's what's in my head. Feel free to throw darts.

userDefinedTests:
- name: Scaling MyApp
  description: Tests that MyApp properly scales from 1 to 3
  timeout: 1m
  setup:
    # In addition to `crPath`, maybe we could also allow `cr` 
    # and have the full CR embedded in the test? We could
    # always add that later.
    crPath: ./path/to/my/cr.yaml

    # Would we want an expected map here
    # to wait for the initial CR to be set up?
    expected:
      cr:
        tests:
        - op: test
          path: /status/conditions/0/type
          value: Ready
        - op: test
          path: /status/conditions/0/status
          value: true
      resources:
      - group: apps
        version: v1
        kind: Deployment
        name: my-example-deployment
        tests:
        - op: test
          path: /spec/replicas
          value: 1
  crModifications:
  - op: replace
    path: /spec/count
    value: 3
  expected:
    cr:
      tests:
      - op: test
        path: /status/conditions/0/type
        value: Ready
      - op: test
        path: /status/conditions/0/status
        value: true
    resources:
    - group: apps
      version: v1
      kind: Deployment
      name: my-example-deployment
      tests:
      - op: test
        path: /spec/replicas
        value: 3

Contributor Author

I don't think the map-style declaration would cause many issues when it comes to how we compare things. There are only 3 non-map/non-array values that you can have in go-yaml/json: number (which can be int/int64/float64; we can just convert any of these types to float64 before comparison), string, and bool. When it comes to maps and arrays, we just walk down them, not really any different from what a JSONPatch would do. The main benefit of JSONPatch is that we can more easily define array indices and we can also delete values.

The main benefit of keeping a map-style declaration is that we can have scorecard functions like the array length checker (another potential example would be a regex matcher). We can't implement that with JSONPatch. We also get to keep the same structure as the actual Kubernetes resources, which may be a bit clearer for users.
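
A minimal sketch of the comparison described above, normalizing numbers to float64 and walking maps and arrays, might look like this (illustrative only, assuming values are decoded as map[string]interface{}; not the scorecard's actual implementation):

```go
package main

import (
	"fmt"
	"reflect"
)

// normalize converts every numeric value to float64 so values decoded from
// YAML or JSON (which may be int, int64, or float64) compare consistently.
func normalize(v interface{}) interface{} {
	switch val := v.(type) {
	case int:
		return float64(val)
	case int32:
		return float64(val)
	case int64:
		return float64(val)
	case float32:
		return float64(val)
	case map[string]interface{}:
		out := make(map[string]interface{}, len(val))
		for k, item := range val {
			out[k] = normalize(item)
		}
		return out
	case []interface{}:
		out := make([]interface{}, len(val))
		for i, item := range val {
			out[i] = normalize(item)
		}
		return out
	default:
		return val // float64, string, bool, nil
	}
}

// matches walks the expected map and checks that every field it declares is
// present in actual with an equal (normalized) value; slices and scalars are
// compared for full equality.
func matches(expected, actual interface{}) bool {
	expected, actual = normalize(expected), normalize(actual)
	if expMap, ok := expected.(map[string]interface{}); ok {
		actMap, ok := actual.(map[string]interface{})
		if !ok {
			return false
		}
		for k, v := range expMap {
			if !matches(v, actMap[k]) {
				return false
			}
		}
		return true
	}
	return reflect.DeepEqual(expected, actual)
}

func main() {
	expected := map[string]interface{}{"spec": map[string]interface{}{"size": 3}}
	actual := map[string]interface{}{"spec": map[string]interface{}{"size": float64(3), "image": "memcached"}}
	fmt.Println(matches(expected, actual)) // true: int 3 equals float64 3 after normalization
}
```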

@AlexNPavel (Contributor Author)

After some discussion, we have decided that the user-defined scorecard tests will be run as plugins rather than through a built-in testing system based on the YAML definitions.

The new User-Defined Tests for Scorecard will instead allow users to run their own scripts or binaries that will take standard input (or environment variables; this is an implementation detail that I will investigate more later) and produce a standard JSON output that can be parsed by the scorecard. This will allow users to run tests written with the Go test framework, Ansible Molecule, or any other framework that users want. This will simplify the actual implementation in the scorecard while allowing more flexibility for users.

The YAML-based testing described in this proposal will still be implemented, but will be available as a plugin that the scorecard can use. It will be a simple testing method and a good starting point for operator writers before they start to design more complex end-to-end tests. Scorecard integration support will also be added to the existing Go test framework.

I will be updating the proposal later today (possibly tomorrow) to reflect this new design decision.

@lilic (Member) left a comment

I like the idea of the plugin system, but I'm curious why we can't just have the plugin system alone so the user can omit the script? Just an idea; sorry if this question was already answered, I wasn't in the meeting. Thanks!

Couple of questions, but overall sgtm 👍


In order to increase the flexibility of the user defined tests and allow users to implement more complex E2E style tests for scorecard,
the user-defined tests will be implemented via a plugin system. Users will specify the path of the script they wish to run as well
as environment variables to set for the command. The command would then print out the result as JSON to stdout. Here is an example:
Member

specify the path of the script they wish to run as well

Would this be done via a CLI flag? Or what would that look like if I want to run this via the plugin system?

Contributor Author

This would be specified through the scorecard's config file. We would just be adding another section to the config file (as specified below) that we can read with viper.


### Basic YAML Defined Test

A new basic testing system would be added where a user can simply define various aspects of a test. For example, this definition runs a similar test to the memcached-operator scale test from the SDK's e2e test:
Member

Would the YAML file below need to be in a predefined directory, or would it be passed to the tests?

Contributor Author

The config path would be set via an environment variable (see the user_defined_tests config example below)


```yaml
user_defined_tests:
- path: "scorecard/simple-scorecard.sh"
Member

What would be an example of the script?

Contributor Author

Assuming we have a command called simple-scorecard, we could do:

#!/bin/bash

simple-scorecard --config $CONFIG_FILE

For the second example I have here, which would be a modified version of our test framework's tests to support the JSON output, we could do something like this:

#!/bin/bash

operator-sdk test local $TEST_DIR --enable-scorecard $ENABLE_SCORECARD --namespaced-manifest $NAMESPACED_MANIFEST --go-test-flags $GO_TEST_FLAGS

The scripts are intended to be just simple wrappers, but they could do more complex things based on the environment variables if we want them to.

Member

Thanks!


Below is an example of what the JSON output of a test would look like:

```json
Member

I would like this to cover a couple more things:

  1. Hard failures: did the test fail in a way that nothing else should be considered?
  2. I assume that one plugin could have more than one test. I would like to see a summary struct and a list of test results for the individual runs. I think this will be more flexible.

I would also like this to be the output of the scorecard test. I prefer computer-readable structured data to the current output. Maybe have a -h option that prints a human-readable format, but defaulting to computer-readable output makes more sense to me.

Contributor Author

For 1: What would be a good way to handle that? Add another field to the JSON output? Or maybe check the return code (if the command returned 0, use the JSON output; if it returned non-zero, assume a hard fail)?

For 2: In this example JSON I'm showing 2 tests from a single plugin: "actions_reflected_in_status" and "my_custom_tests".

I think someone was asking if we could have JSON output for the scorecard itself. I thought I made an issue for that, but it looks like I didn't. I'll create a new issue to track that. It'll probably be pretty much the same output as the example plugin output I have here (the plugin output example is just a JSONified array of TestResult objects)

Member

I would consider a different structure. I think we want something that gives someone reading this output an easy path to see if everything is good, but that also lets them dive deep into the results if something goes wrong.

Contributor

So to recap for the updated proposal: to indicate a hard failure (e.g. an error associated with t.Fatal()), the test result would have state: error and an error message appended to errors.

And if there are just regular errors, like with t.Error(), the output of a test result would be state: failed or state: partial_pass with error messages appended to errors.

@shawn-hurley Do you think that's clear enough to distinguish between a fatal/hard failure and regular errors?

Member

Yeah, that makes sense for the fatal/hard failure.

I was also thinking that you would want to give the denominator, like the number of tests run or possible points. I would also suggest that you note that total_score is a percentage; that is not super clear.

@brianwcook can you have your team take a look at this format to make sure that it works for your team?

Contributor Author

@shawn-hurley Updated proposal with your suggestions.

Member

I think we need to also account for the possibility that the plugin executable exits with a non-zero return code and/or with output that is not what we expect. And describe how operator-sdk handles that situation.


In order to increase the flexibility of the user defined tests and allow users to implement more complex E2E style tests for scorecard,
the user-defined tests will be implemented via a plugin system. Users will specify the path of the script they wish to run as well
as environment variables to set for the command. The command would then print out the result as JSON to stdout. Here is an example:
Member

Is this suggesting that you pass a YAML file into the command and it runs things that way?

I would like to see maybe a directory that contains the runnables (binaries/scripts) that it just runs. This will 1. make creating the image version of this easier, and 2. allow us to create system-wide tests and project-specific tests like the YAML file does, but using convention rather than another YAML file.

Thoughts?

Contributor Author

The way I was thinking this would work is by just adding a new section to the scorecard's config file that we can read with viper. That section, which I put an example of below, would specify where the scripts are and what environment variables should be set for configuration purposes. Can you give an example of what you would like to change?

Member

+1 on @shawn-hurley's comment.

I would propose that we have the scorecard command just run every executable in <projectRoot>/test/scorecard/bin (or something similar).

And since we support shell scripts, I don't see the need for a separate declaration of environment variables, since those could be set up within the executable itself.

On a related note, are there environment variables that would be helpful to set for ALL scorecard tests?

Lastly, do we have opinions on how users should handle the plugin executables? Check them into git? Use a Makefile that can download or build them? How will we distribute our plugins (e.g. simple-scorecard)?

Contributor Author

Instead of adding a new section to the config file, I could just add one new flag/config option pointing to the directory where the scorecard scripts are and treat each top-level file ending in .sh as a test. And I guess you're right about not needing the environment variables. One env var that may make sense to set for all scripts would be $KUBECONFIG, to make sure all scripts run against the correct cluster.

When it comes to how users should handle the executables, it depends on the user. Probably the most ideal would be the script downloading the executable, similar to how we download dep during CI. A user could also include it in their repo if they feel like that would be better (we do this for the marker executable, since it is quite tiny but would take a long time to compile during CI). As long as the script can run on a generic Linux system using bash, it should be fine.

@shawn-hurley (Member), Mar 25, 2019

Why not just run all the top-level files that are executable?

I would also suggest that we let folks build these plugins and then decide how we want to manage them. I think that will help us determine whether the plugins folks are writing are generic or super specific, and whether they are tied to a particular project or apply to all operators. We can start to see patterns that might help us make better decisions here. I don't think we should wait super long, but we should get feedback before making this decision IMO.

Contributor Author

@shawn-hurley Yeah, I'll update this to run top-level files that are executable instead of just scripts.
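
A rough sketch of what such a plugin runner could look like, running every top-level executable in a directory, passing KUBECONFIG through, and treating a non-zero exit or unparseable output as a hard failure (the directory name and result fields here are illustrative assumptions, not the final design):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// PluginResult is an illustrative stand-in for a plugin's JSON output.
type PluginResult struct {
	Name          string   `json:"name"`
	State         string   `json:"state"` // pass, partial_pass, fail, or error
	EarnedPoints  int      `json:"earned_points"`
	MaximumPoints int      `json:"maximum_points"`
	Errors        []string `json:"errors"`
}

// runPlugins executes every top-level executable file in dir, sets KUBECONFIG
// so each plugin runs against the same cluster, and parses its stdout as JSON.
// A non-zero exit code or unexpected output is recorded as an error state.
func runPlugins(dir, kubeconfig string) ([]PluginResult, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var results []PluginResult
	for _, e := range entries {
		info, err := e.Info()
		if err != nil || e.IsDir() || info.Mode()&0111 == 0 {
			continue // skip directories and non-executable files
		}
		cmd := exec.Command(filepath.Join(dir, e.Name()))
		cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfig)
		var stdout bytes.Buffer
		cmd.Stdout = &stdout

		var res PluginResult
		if runErr := cmd.Run(); runErr != nil || json.Unmarshal(stdout.Bytes(), &res) != nil {
			res = PluginResult{
				Name:   e.Name(),
				State:  "error",
				Errors: []string{fmt.Sprintf("plugin failed or returned unexpected output: %v", runErr)},
			}
		}
		results = append(results, res)
	}
	return results, nil
}

func main() {
	results, err := runPlugins("test/scorecard/bin", os.Getenv("KUBECONFIG"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, _ := json.MarshalIndent(results, "", "  ")
	fmt.Println(string(out))
}
```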

@hasbro17 (Contributor) left a comment

Overall SGTM

My only nit is that we should support using JSON patch in our basic test plugin to modify the spec and check resource/status fields.

@AlexNPavel (Contributor Author)

/cc @shawn-hurley @joelanford

@joelanford (Member) left a comment

Overall sounds pretty good to me. A few more questions, comments, and suggestions.

to parse and integrate with the other tests. The above JSON design is based on the `TestResult` type with the addition
of a `state` type, which can be `pass` (earned == max), `partial_pass` (0 < earned < max), `fail` (earned == 0), or `error`
(fatal error; disregard output score). We also print the number of tests in each state as well as the total score to make
it easier for users to quickly see what states the tests are in.
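
For illustration, plugin output following that description might look roughly like this (the field names beyond TestResult and state, and the exact layout, are assumptions rather than the final schema):

```json
{
  "results": [
    {
      "name": "scaling_test",
      "state": "pass",
      "earned_points": 1,
      "maximum_points": 1,
      "errors": []
    },
    {
      "name": "actions_reflected_in_status",
      "state": "partial_pass",
      "earned_points": 1,
      "maximum_points": 2,
      "errors": ["modification of spec/count was not reflected in status"]
    }
  ],
  "pass": 1,
  "partial_pass": 1,
  "fail": 0,
  "error": 0,
  "total_score": "67%"
}
```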
Member

Are there any good examples of other projects with a similar plugin system, specifically that expect the plugin to return data in a specific format? I ask because I wonder if we should version the interface so that we can make changes to it in the future without breaking old plugins.

Maybe a totally off the wall idea, but would it make sense to re-use the kubernetes API for this? Similar to how the admission webhook request/response API works? (btw, not suggesting we change to an HTTP interface or anything, just marshal things as k8s-compliant API objects)

Maybe something like:

type ScorecardTest struct {
	metav1.TypeMeta `json:",inline"`
	// Spec describes the attributes for the test
	Spec *ScorecardTestSpec `json:"spec"`

	// Results describes the results of running the test.
	// +optional
	Results *ScorecardTestResults `json:"results,omitempty"`
}

And maybe even crazier, perhaps we could write an operator that could execute these tests in-cluster. Food for thought 🙂

Contributor Author

Hmm. I like the idea because it allows for versioning, and this will most likely go through a couple of changes over time as we use this more and figure out what else we may need from the output. And reusing some of the other features of the API could be quite useful here.

Running the scorecard tests in-cluster shouldn't be too difficult either, assuming we have a lot of permissions and can actually create all the resources we need. So maybe that could be a runtime mode in the future (kind of like how we have test local and test cluster).

@AlexNPavel (Contributor Author)

UPDATE: This PR will be split into 2 PRs, as it has essentially become 2 separate proposals (a plugin system for user-defined tests and an actual plugin for the new system).

This section is now a separate PR

This JSON output would make it simple for others to create scorecard plugins while keeping it simple for the scorecard
to parse and integrate with the other tests. Each plugin would be considered a separate suite, and the full result of the scorecard
would be a list of `ScorecardResult`s.
Member

Should the plugin output be the full ScorecardTest object so that the TypeMeta fields are included?

Contributor Author

Yes, the plugins should output a full ScorecardTest JSON. The scorecard can then make an array of the results from each plugin for the ScorecardResults section of the main scorecard output.
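
As a purely hypothetical illustration of that (the group/version, kind wrapping, and result field names are made up for this sketch and are not defined by the proposal), a plugin's full output could look something like:

```json
{
  "apiVersion": "scorecard.example.com/v1alpha1",
  "kind": "ScorecardTest",
  "spec": {
    "name": "Basic YAML Defined Test"
  },
  "results": {
    "totalScore": "100%",
    "tests": [
      {
        "name": "scaling_test",
        "state": "pass",
        "earnedPoints": 1,
        "maximumPoints": 1,
        "errors": []
      }
    ]
  }
}
```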

@joelanford (Member) left a comment

LGTM after addressing one last question.

@AlexNPavel AlexNPavel force-pushed the scorecard-usertest-proposal branch from dee945a to 9f37330 Compare March 28, 2019 21:32
@AlexNPavel AlexNPavel merged commit e2aec8e into operator-framework:master Mar 29, 2019
@AlexNPavel AlexNPavel deleted the scorecard-usertest-proposal branch March 29, 2019 18:37