Proposal: User Defined Tests for the Operator Scorecard #1049


Merged

Commits (20)
- `97afccb` doc/proposals/scorecard-user-tests.md: make proposal (AlexNPavel, Jan 31, 2019)
- `b3aa841` doc/proposals: fix formatting and marshalling directives (AlexNPavel, Feb 1, 2019)
- `bb93788` doc/proposals: don't use periods as separators (AlexNPavel, Feb 1, 2019)
- `99d7eb2` doc/proposals: slightly change yaml spec (AlexNPavel, Feb 12, 2019)
- `47648de` doc/proposals: add namespace and apiversion (AlexNPavel, Feb 12, 2019)
- `326ffa6` doc/proposals: create unified `expected` block (AlexNPavel, Feb 14, 2019)
- `9923ce0` doc/proposals: typo (AlexNPavel, Feb 14, 2019)
- `b214e3c` doc/proposals: remove a stutter (AlexNPavel, Feb 15, 2019)
- `120bc71` doc/proposals: slight update to some of the yaml vals (AlexNPavel, Feb 15, 2019)
- `c136f20` doc/proposals: update yaml config design (AlexNPavel, Feb 15, 2019)
- `0492ef8` doc/proposals: don't make `spec` an array (AlexNPavel, Feb 19, 2019)
- `e84b754` Merge branch 'master' into scorecard-usertest-proposal (AlexNPavel, Feb 28, 2019)
- `a83f448` doc/proposals: update to reflect new design decisions (AlexNPavel, Mar 1, 2019)
- `623d639` Merge branch 'master' into scorecard-usertest-proposal (AlexNPavel, Mar 1, 2019)
- `f718511` proposals/scorecard: add another env to config file (AlexNPavel, Mar 7, 2019)
- `355bad4` doc/proposals: update JSON design (AlexNPavel, Mar 11, 2019)
- `2677ba2` doc/proposals: slightly adjust json (AlexNPavel, Mar 20, 2019)
- `3e52652` update plugin system section (AlexNPavel, Mar 25, 2019)
- `550aff0` doc/proposals: remove YAML Defined Tests section (AlexNPavel, Mar 25, 2019)
- `9f37330` doc/proposals: slightly update proposal name/description (AlexNPavel, Mar 28, 2019)
153 changes: 153 additions & 0 deletions doc/proposals/scorecard-user-tests.md
@@ -0,0 +1,153 @@
# User Defined Tests for the Operator Scorecard

Implementation Owner: AlexNPavel

Status: Draft

[Background](#background)

[Goals](#goals)

[Design overview](#design-overview)

[User facing usage](#user-facing-usage)

[Observations and open questions](#observations-and-open-questions)

## Background

The operator scorecard is intended to allow users to run a generic set of tests on their operators. Some simple checks can be performed generically, but more
complicated tests that verify the operator actually works cannot be. As a result, some of the functional tests in the current scorecard implementation are
inaccurate or too limited to be useful. For more useful functional scorecard tests, we need to allow some basic user input for tests.

## Goals

- Implement user-defined scorecard tests
- Replace existing "Operator actions are reflected in status" and "Writing into CRs has effects" tests with user-defined tests

## Design overview

### Basic YAML Defined Test

A new basic testing system would be added where a user can simply define various aspects of a test. For example, the definition below runs a test similar to the memcached-operator scale test from the SDK's e2e tests.

**Member:** Would the YAML file below need to be in a predefined directory, or would it be passed to the tests?

**Contributor Author:** The config path would be set via an environment variable (see the `user_defined_tests` config example below).


```yaml
functional_tests:
- cr: "deploy/crds/cache_v1alpha1_memcached_cr.yaml"
  expected:
    resources:
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: example-memcached
      status:
        readyReplicas: 3
      spec:
        template:
          spec:
            containers:
            - image: memcached:1.4.36-alpine
    status:
      scorecard_function_length:
        nodes: 3
  modifications:
  - spec:
      size: 4
    expected:
      resources:
      - kind: Deployment
        name: "example_memcached"
        status:
          readyReplicas: 4
      status:
        scorecard_function_length:
          nodes: 4
```

This is what the Go structs would look like, including comments describing each field:

```go
// UserDefinedTest contains a user-defined test. Users pass tests as an array using the `functional_tests` viper config
type UserDefinedTest struct {
	// Path to the CR to be used for testing
	CRPath string `mapstructure:"cr"`
	// Expected resources and status
	Expected Expected `mapstructure:"expected"`
	// Sub-tests modifying a few fields with expected changes
	Modifications []Modification `mapstructure:"modifications"`
}

type Expected struct {
	// Resources expected to be created after the operator reacts to the CR
	Resources []map[string]interface{} `mapstructure:"resources"`
	// Expected values in the CR's status after the operator reacts to the CR
	Status map[string]interface{} `mapstructure:"status"`
}

// Modification specifies spec fields to change in the CR along with the expected results
type Modification struct {
	// A map of the spec fields to modify
	Spec map[string]interface{} `mapstructure:"spec"`
	// Expected resources and status
	Expected Expected `mapstructure:"expected"`
}
```
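As a rough illustration, the structs above could be populated from the scorecard config with viper, assuming the config file has already been read; the helper name below is hypothetical:

```go
package scorecard

import (
	"github.com/spf13/viper"
)

// loadUserDefinedTests decodes the functional_tests array from the scorecard
// config file into the structs defined above (viper uses the mapstructure tags).
func loadUserDefinedTests() ([]UserDefinedTest, error) {
	var tests []UserDefinedTest
	if err := viper.UnmarshalKey("functional_tests", &tests); err != nil {
		return nil, err
	}
	return tests, nil
}
```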
**Member (@joelanford, Feb 12, 2019):** There are probably tests that involve changes outside of the spec (e.g. changing annotations). Should we account for that type of thing?

Also, defining a modification this way has some drawbacks, but I think it works for most use cases.

A few examples of when it doesn't work well are modifying slices and removing keys. When a modified value is a slice, the entire modified slice has to be included in the modification. Also, it isn't possible to remove a key, only set it to nil. Removing a key might be necessary when there's a semantic difference between not present and nil.

For non-simple use cases, we may want an alternative/additional way to specify modifications, perhaps the YAML equivalent of JSON patch. An implementation for YAML exists (https://github.com/krishicks/yaml-patch) if that's something we want to look into.

Somewhat tangentially, another benefit of JSON patch is that there's a test operation that may be useful for implementing the checks.

**Contributor Author:** Using a JSON patch for the modification sounds like a good idea. It's a standard and is quite simple to understand.

Do you think it would make sense to add a JSON patch var to the modification struct and support both Spec and JSON patches, or do you think we should only support one method for simplicity?

Also, do you think it would be a good idea to add JSON patch test operation support to the `expected_resource` and `expected_status` types? That could make it easier for some test cases, especially ones involving arrays, where the current proposal would need to do something like this to get an item in the second element of an array:

```yaml
spec:
  containers:
    -
    - image: my-image
```

vs

```yaml
json_patch: |
  {"op":"test","path":"/spec/containers/1/image","value":"my-image"}
```

**Contributor:** +1 for supporting JSON patch fields to specify both modifications and the expected resources/status.

**Member:** The more I think about the current map-style modifications and expectations, the more I think we should support only JSON patch for both modifications and expected resources/status.

If we leave the map-style declarations, we would need to define what method we use to merge and compare lists, maps, strings, numbers, etc., and I think users would end up spending a decent chunk of time troubleshooting why certain things are merging or being compared in certain ways.

If we support only JSON patch, I think the semantics are much more obvious and easier to reason about, both from our perspective as maintainers and from the perspective of someone writing a test. And if we're using YAML, I'd suggest allowing the JSON patch to be defined in YAML.

Here's what's in my head. Feel free to throw darts.

```yaml
userDefinedTests:
- name: Scaling MyApp
  description: Tests that MyApp properly scales from 1 to 3
  timeout: 1m
  setup:
    # In addition to `crPath`, maybe we could also allow `cr`
    # and have the full CR embedded in the test? We could
    # always add that later.
    crPath: ./path/to/my/cr.yaml

    # Would we want an expected map here
    # to wait for the initial CR to be set up?
    expected:
      cr:
        tests:
        - op: test
          path: /status/conditions[0]/type
          value: Ready
        - op: test
          path: /status/conditions[0]/status
          value: true
      resources:
      - group: apps
        version: v1
        kind: Deployment
        name: my-example-deployment
        tests:
        - op: test
          path: /spec/replicas
          value: 1
  crModifications:
  - op: replace
    path: spec/count
    value: 3
  expected:
    cr:
      tests:
      - op: test
        path: /status/conditions[0]/type
        value: Ready
      - op: test
        path: /status/conditions[0]/status
        value: true
    resources:
    - group: apps
      version: v1
      kind: Deployment
      name: my-example-deployment
      tests:
      - op: test
        path: /spec/replicas
        value: 3
```

**Contributor Author:** I don't think the map-style declaration would cause many issues when it comes to how we compare things. There are only 3 non-map/non-array values that you can have in go-yaml/JSON: number (which can be int/int64/float64; we can just convert any of these types to float64 before comparison), string, and bool. When it comes to maps and arrays, we just walk down them, not really any different than a JSON patch would. The main benefit of JSON patch is that we can more easily specify array indices and we can also delete values.

The main benefit of keeping a map-style declaration is that we can have the scorecard functions like the array length checker (another potential example would be a regex matcher). We can't implement that with JSON patch. We also get to keep the same structure as the actual Kubernetes resources, which may be a bit clearer for users.


For the `Status` and `Resources` fields, we can implement a bit of extra computation instead of simple string checking. For instance,
in the memcached-operator test, we expect the `nodes` field (which is an array) to have a certain length. To implement checks like
these, we can create functions that are prefixed with `scorecard_function_` and take an array of objects. For instance, in the above
example, `scorecard_function_length` would check that each field listed under it has the specified length (like `nodes: 4`). If the YAML key does not
start with `scorecard_function_`, we do a simple match (like `status/readyReplicas: 4`).
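A minimal sketch of how this dispatch and matching could work, assuming the expected block has been decoded into nested `map[string]interface{}` values; the function and helper names here are illustrative, not part of the proposal:

```go
package scorecard

import (
	"fmt"
	"reflect"
	"strings"
)

// checkExpected walks an expected map against the actual object's decoded fields.
// Keys prefixed with "scorecard_function_" dispatch to a named check; all other
// keys are matched directly, recursing into nested maps. Slices fall through to
// DeepEqual here, so partial list matching is not handled in this sketch.
func checkExpected(expected, actual map[string]interface{}) error {
	for key, want := range expected {
		if fn := strings.TrimPrefix(key, "scorecard_function_"); fn != key {
			if fn != "length" {
				return fmt.Errorf("unknown scorecard function %q", fn)
			}
			fields, ok := want.(map[string]interface{})
			if !ok {
				return fmt.Errorf("%s must map field names to expected lengths", key)
			}
			for field, wantLen := range fields {
				arr, isArr := actual[field].([]interface{})
				wl, isNum := toFloat64(wantLen)
				if !isArr || !isNum || float64(len(arr)) != wl {
					return fmt.Errorf("field %q: expected length %v", field, wantLen)
				}
			}
			continue
		}
		if wantMap, ok := want.(map[string]interface{}); ok {
			actMap, ok := actual[key].(map[string]interface{})
			if !ok {
				return fmt.Errorf("field %q missing or not a map", key)
			}
			if err := checkExpected(wantMap, actMap); err != nil {
				return err
			}
		} else if !valuesEqual(want, actual[key]) {
			return fmt.Errorf("field %q: expected %v, got %v", key, want, actual[key])
		}
	}
	return nil
}

// valuesEqual compares scalar values, normalizing decoded numbers to float64
// first so that a YAML int matches a JSON float64.
func valuesEqual(want, got interface{}) bool {
	wf, wok := toFloat64(want)
	gf, gok := toFloat64(got)
	if wok && gok {
		return wf == gf
	}
	return reflect.DeepEqual(want, got)
}

func toFloat64(v interface{}) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case int64:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}
```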

This design would allow us to replace the old "Operator actions are reflected in status" (which would be tested by the `expected/status` check) and
"Writing into CRs has an effect" (which would be tested by the `expected/resources` check) tests.

### Plugin System

In order to increase the flexibility of the user-defined tests and allow users to implement more complex e2e-style tests for the scorecard,
the user-defined tests will be implemented via a plugin system. Users will specify the path of the script they wish to run as well
as environment variables to set for the command. The command would then print the result as JSON to stdout.
**Member:**

> specify the path of the script they wish to run

Would this be done via a CLI flag? How would that look if I want to run this via the plugin system?

**Contributor Author:** This would be specified through the scorecard's config file. We would just be adding another section to the config file (as specified below) that we can read with viper.

**Member:** Is this suggesting that you pass a yaml file into the command and it runs things that way?

I would like to see maybe a directory that contains the runnables (binaries/scripts) that it just runs. This would 1) make creating the image version of this easier, and 2) allow us to create system-wide tests and project-specific tests like the yaml file, but using convention rather than another yaml file.

Thoughts?

**Contributor Author:** The way I was thinking of having this work would be to just add a new section to the scorecard's config file that we can read with viper. That section, an example of which I put below, would specify where the scripts are and what environment variables should be set for configuration purposes. Can you give an example of what you would like to change?

**Member:** +1 on @shawn-hurley's comment.

I would propose that we have the scorecard command just run every executable in `<projectRoot>/test/scorecard/bin` (or something similar).

And since we support shell scripts, I don't see the need for a separate declaration of environment variables, since those could be set up within the executable itself.

On a related note, are there environment variables that would be helpful to set for ALL scorecard tests?

Lastly, do we have opinions on how users should handle the plugin executables? Check them into git? Use a Makefile that can download or build them? How will we distribute our plugins (e.g. simple-scorecard)?

**Contributor Author:** Instead of adding a new section to the config file, I could just add one new flag/config option pointing to the directory where the scorecard scripts are and treat each top-level file ending in .sh as a test. And I guess you're right about not needing the environment variables. One env var that may make sense to set for all scripts would be $KUBECONFIG, to make sure all scripts run against the correct cluster.

When it comes to how users should handle the executables, it depends on the user. The most ideal approach would probably be for the script to download the executable, similar to how we download dep during CI. A user could also include it in their repo if they feel that would be better (we do this for the marker executable, since it is quite tiny but would take a long time to compile during CI). As long as the script can run on a generic Linux system using bash, it should be fine.

**Member (@shawn-hurley, Mar 25, 2019):** Why not just run all the top-level files that are executable?

I would also suggest that we let folks build these plugins and then decide how we want to manage them. I think that will help us determine whether the plugins folks are writing are generic or super specific, tied to a particular project or applicable to all operators. We can start to see patterns that might help us make better decisions here. I don't think we should wait super long, but we should get feedback before making this decision, IMO.

**Contributor Author:** @shawn-hurley Yeah, I'll update this to run top-level files that are executable instead of just scripts.

Here is an example:


```yaml
user_defined_tests:
- path: "scorecard/simple-scorecard.sh"
  env:
  - CONFIG_FILE: "scorecard/simple-scorecard.yaml"
- path: "scorecard/go-test.sh"
  env:
  - ENABLE_SCORECARD: true
  - NAMESPACED_MANIFEST: "deploy/namespaced_init.yaml"
  - GO_TEST_FLAGS: "-parallel=1"
```

**Member:** What would be an example of the script?

**Contributor Author:** Assuming we have a command called simple-scorecard, we could do:

```bash
#!/bin/bash

simple-scorecard --config $CONFIG_FILE
```

For the second example I have here, which would be a modified version of our test framework's tests that supports the JSON output, we could do something like this:

```bash
#!/bin/bash

operator-sdk test local $TEST_DIR --enable-scorecard $ENABLE_SCORECARD --namespaced-manifest $NAMESPACED_MANIFEST --go-test-flags $GO_TEST_FLAGS
```

The scripts are intended to be just simple wrappers, but they could do more complex things based on the environment variables if we want them to.

**Member:** Thanks!

The above is an example of a user-defined test suite with two tests: the new simple scorecard tests described above and a test
built using the Operator SDK's test framework, modified to output results in the standardized JSON format.
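A minimal sketch of how the scorecard could execute such plugins, assuming the `user_defined_tests` entries above and the per-test JSON output shown below; the type and function names are illustrative, and `env` is flattened to a single map for brevity:

```go
package scorecard

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// pluginConfig mirrors one entry of the user_defined_tests config block above.
type pluginConfig struct {
	Path string
	Env  map[string]string
}

// runPlugin executes one plugin and decodes the JSON it prints to stdout.
// The output is the map of test name -> result shown in the example below.
func runPlugin(p pluginConfig) (map[string]json.RawMessage, error) {
	cmd := exec.Command(p.Path)
	cmd.Env = os.Environ()
	for k, v := range p.Env {
		cmd.Env = append(cmd.Env, fmt.Sprintf("%s=%s", k, v))
	}
	out, err := cmd.Output()
	if err != nil {
		// a non-zero exit code could be treated as a hard failure of the plugin
		return nil, fmt.Errorf("plugin %s failed: %v", p.Path, err)
	}
	results := map[string]json.RawMessage{}
	if err := json.Unmarshal(out, &results); err != nil {
		return nil, fmt.Errorf("plugin %s produced invalid JSON: %v", p.Path, err)
	}
	return results, nil
}
```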

Below is an example of what the JSON output of a test would look like:

```json
{
  "actions_reflected_in_status": {
    "description": "The operator correctly updates the CR's status",
    "earned": 2,
    "maximum": 3,
    "suggestions": [
      {
        "message": "Expected 4 items in status/nodes but found 2"
      }
    ],
    "errors": []
  },
  "my_custom_tests": {
    "description": "This test verifies that the created service is running correctly",
    "earned": 1,
    "maximum": 1,
    "suggestions": [],
    "errors": []
  }
}
```

**Member:** I would like this to cover a couple more things:

1. Hard failures: did the test fail in a way that means nothing else should be considered?
2. I assume that one plugin could have more than one test. I would like to see a summary struct and a list of test results for the individual runs. I think this will be more flexible.

I would also like this to be the output of the scorecard test. I prefer computer-readable structured data to the current output. Maybe have a -h option that prints a human-readable format, but having the default be computer-readable makes more sense to me.

**Contributor Author:** For 1: What would be a good way to handle that? Add another field to the JSON output? Or maybe check the return code (if the command returns 0, use the JSON output; if it returns non-zero, assume a hard fail)?

For 2: In this example JSON I'm showing two tests from a single plugin: "actions_reflected_in_status" and "my_custom_tests".

I think someone was asking if we could have JSON output for the scorecard itself. I thought I made an issue for that, but it looks like I didn't; I'll create a new issue to track that. It'll probably be pretty much the same output as the example plugin output I have here (the plugin output example is just a JSONified array of TestResult objects).

**Member:** I would consider a different structure: something that gives someone reading this output an easy path to see if everything is good, but that also lets them dive deep into the results if something goes wrong.

**Contributor:** So to recap for the updated proposal: to indicate a hard failure (e.g. an error associated with t.Fatal()), the test result would have `state: error` and an error message appended to `errors`.

And if it's just regular errors, as with t.Error(), the output of a test result would be `state: failed` or `state: partial_pass` with error messages appended to `errors`.

@shawn-hurley Do you think that's clear enough to distinguish between a fatal/hard failure and regular errors?

**Member:** Yeah, that makes sense for the fatal/hard failure.

I was also thinking that you would want to give the denominator, like the number of tests run or the possible points. I would also suggest that you note that total_score is a percentage; that is not super clear.

@brianwcook can you have your team take a look at this format to make sure that it works for your team?

**Contributor Author:** @shawn-hurley Updated the proposal with your suggestions.

**Member:** I think we also need to account for the possibility that the plugin executable exits with a non-zero return code and/or with output that is not what we expect, and describe how operator-sdk handles that situation.

This JSON output would make it simple for others to create scorecard plugins while remaining easy for the scorecard
to parse and integrate with the other tests. The above JSON design is based on the `TestResult` type that will be implemented
by PR [#994](https://github.com/operator-framework/operator-sdk/pull/994).
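For illustration only, a struct matching the example output above might look roughly like this; the field names are inferred from the JSON example, not taken from PR #994, and the element type of `errors` is an assumption:

```go
package scorecard

// pluginTestResult mirrors one entry of the example plugin output above.
// The real TestResult type is being added in PR #994; these fields only
// reflect what is shown in the JSON example.
type pluginTestResult struct {
	Description string       `json:"description"`
	Earned      int          `json:"earned"`
	Maximum     int          `json:"maximum"`
	Suggestions []suggestion `json:"suggestions"`
	Errors      []string     `json:"errors"` // element type assumed; the example shows an empty array
}

// suggestion is a human-readable hint attached to a partially passing test.
type suggestion struct {
	Message string `json:"message"`
}
```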