-
Notifications
You must be signed in to change notification settings - Fork 1k
End 2 End tests speedup #1180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End 2 End tests speedup #1180
Conversation
* Lazy upgrade now properly covered with eventual and waiting for pod start * patching config now updates deployment, patching annotation, allowing to trace change step * run.sh no takes NOCLEANUP to stop kind from being deleted * if kind config is present, run will not install kind * Fast e2e local execution now possible once kind is up
e2e/tests/test_e2e.py
Outdated
# HACK operator must register CRD and/or Sync existing PG clusters after start up | ||
# for local execution ~ 10 seconds suffices | ||
time.sleep(60) | ||
self. wait_for_pod_start("name=postgres-operator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And how do we know then at the line 109 that it is safe to create PG clusters ? By re-creating a PG cluster until it succeeds ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you worry the operator is not running? and then initial manifest create is not picked up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might happen that the PG CRD is not yet registered. There is a very short period of time when an operator pod has already started by the CRD is not yet registered. That sometimed led to the failure of cluster creation during the initial development of e2e tests
* Allow Docker image to take parameters to overwrite unittest execution * Add documentation for running individual tests * Fixed String encoding in Patorni state check and error case
…bject diffs to text diff. Enabled padding for log level.
@@ -199,7 +199,7 @@ type Config struct { | |||
|
|||
// MustMarshal marshals the config or panics | |||
func (c Config) MustMarshal() string { | |||
b, err := json.MarshalIndent(c, "", "\t") | |||
b, err := json.MarshalIndent(c, "", " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why ? imo \t
is more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\t gets serialized to \t i think so makes it not readable, you can try after PR to change it.
// limitations under the License. | ||
|
||
// Package diff implements a linewise diff algorithm. | ||
package nicediff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it godebug copied verbatim ?
|
||
// disabling the sending of events also to the logoutput | ||
// the operator currently duplicates a lot of log entries with this setup | ||
// eventBroadcaster.StartLogging(logger.Infof) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code ?
pkg/cluster/util.go
Outdated
n, errn := json.MarshalIndent(new, "", " ") | ||
|
||
if erro != nil || errn != nil { | ||
panic("could not marschal API objects, should not happen") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a typo: marsChal
is actually marshal
|
||
cmp := c.compareStatefulSetWith(desiredSS) | ||
if !cmp.match { | ||
if cmp.rollingUpdate && !podsRollingUpdateRequired { | ||
podsRollingUpdateRequired = true | ||
c.setRollingUpdateFlagForStatefulSet(desiredSS, podsRollingUpdateRequired) | ||
c.setRollingUpdateFlagForStatefulSet(desiredSS, podsRollingUpdateRequired, "statefulset changes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"statefulset changes" is not particularly descriptive message. Rolling upgrade happens exactly because something in the stateful set template changes. What is intended to be described here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is one of three places i added just the reason, the other 2 were cache or existing annotation. Yes could be better message.
|
||
After having executed a normal E2E run with `NOCLEANUP=True` Kind still continues to run, allowing you subsequent test runs. | ||
|
||
To run an individual test, run the following command in the `e2e` directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may or may not work depending on the state in which the last test to run leaves the cluster.
It is assumed that individual unit tests will properly clean up after themselves, but that is not given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are correct. This does not work when stuff is broken. But we cant fix it all now. Future work to have e2e tests not be so mixed up and depending on each other.
Most are easy, bit more annoying to fix are those nodes label and toleration changes.
@@ -0,0 +1,2 @@ | |||
#!/bin/bash | |||
kubectl logs $(kubectl get pods -l name=postgres-operator --field-selector status.phase=Running -o jsonpath='{.items..metadata.name}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can that command be wrapped into a function so that
source e2e/scripts/get_logs.sh
would work ?
kubectl delete statefulsets -l application=spilo,cluster-name=acid-minimal-cluster | ||
kubectl delete services -l application=spilo,cluster-name=acid-minimal-cluster | ||
kubectl delete configmap postgres-operator | ||
kubectl delete deployment postgres-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be subject to a race condition where an operator is deleted earlier that it completes the clean up. That happened earlier both in this e2e pipeline and the run_operator_locally
script.
@@ -0,0 +1,14 @@ | |||
#!/bin/bash | |||
|
|||
export cluster_name="postgres-operator-e2e-tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does a docker run
wrapper export anything to the env ?
pkg/cluster/cluster.go
Outdated
@@ -595,7 +595,7 @@ func (c *Cluster) enforceMinResourceLimits(spec *acidv1.PostgresSpec) error { | |||
return fmt.Errorf("could not compare defined memory limit %s with configured minimum value %s: %v", memoryLimit, minMemoryLimit, err) | |||
} | |||
if isSmaller { | |||
c.logger.Warningf("defined memory limit %s is below required minimum %s and will be set to it", memoryLimit, minMemoryLimit) | |||
c.logger.Warningf("defined memory limit %s is below required minimum %s and will be increase", memoryLimit, minMemoryLimit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a typo: increaseD
👍 |
👍 |
1 similar comment
👍 |
* Improving end 2 end tests, especially speed of execution and error, by implementing proper eventual asserts and timeouts. * Add documentation for running individual tests * Fixed String encoding in Patorni state check and error case * Printing config as multi log line entity, makes it readable and grepable on startup * Cosmetic changes to logs. Removed quotes from diff. Move all object diffs to text diff. Enabled padding for log level. * Mount script with tools for easy logaccess and watching objects. * Set proper update strategy for Postgres operator deployment. * Move long running test to end. Move pooler test to new functions. * Remove quote from valid K8s identifiers.
This reverts commit 125f6dd.
The current implementation of our end to end tests is very slow due to a lot of hard coded waits and default waits.
This PR introduces
eventual
wrappers aroundassertEqual
andassertTrue
to speedup the watch process for operator actions. These functions repeat the requests multiple times and either succeed once the result is correct or fail after max intervals.This PR also increases resource limits for pods, in case an environment applies CPU limits.
This PR also sets a discovery folder for
unittest
totests
.More work needs to be done in the future to assess all end 2 end test cases to properly wait/retry.