Deployment error handling was reworked #272

Open: istalker2 wants to merge 4 commits into master from unblock-deployment

@istalker2 istalker2 commented May 29, 2017

In some cases a deployment could hang forever:

  • if a graph vertex depends on a vertex that will never be created
    (because of a timeout or a permanent error)
  • if a graph vertex was set to be created only if its parent fails,
    but the parent didn't fail

Because the deployment algorithm waits for all vertexes to be created,
if any one of them remains blocked, Deploy() runs forever and blocks
the AC process from handling other deployment tasks.

Also, on-error processing could be triggered by an intermediate resource
status. For example, this could happen if the resource status was obtained
prior to the resource.Create() call. Another case is a resource set to
have several deployment attempts: if the first attempt fails, the on-error
dependency becomes activated, even though the deployment may succeed on
the second attempt.

This commit reworks error handling:

  • Resources that cannot be created or that time out are marked with
    an error.
  • Resources that depend on failed resources also fail.
  • Thus all graph vertexes eventually become unblocked and the
    deployment finishes.
  • on-error handling is based on the final resource status/error.
  • Deploy() now returns true if the deployment succeeded. The deployment
    fails if any resources (and their dependents) went into the failed
    state, except where they were skipped because all their dependencies
    had on-error meta and the parent resource didn't fail.

Also:

  • e2e tests were updated so that most of them wait for the deployment
    to finish rather than just waiting for a resource status. Thus they
    now also test that the deployment doesn't hang.
  • The graph vertex type (ScheduledResource) and its fields are no
    longer exported. The same goes for some of the dependency graph
    methods.
  • The wait() method no longer creates unnecessary goroutines and
    channels.


Stan Lagun added 4 commits June 13, 2017 16:10
This change adds the ability to replicate a dependency with index
parameters iterated over an arbitrary number of lists.

For each dependency it is now possible to specify a map of
indexVariableName -> listExpression

listExpression := range|item + [, listExpression]
range := number '..' number
item := STRING

For example, for "i: 1..3" the dependency is replicated into 3 clones,
each of them having the argument i set to a value in the range [1, 3].

This also makes it possible to consume N flow replicas by replicating the
dependency that leads to the consumed flow.
For a sequential flow, each next replica is attached to the leaves
of the previous one so that they are deployed sequentially.
* stopChan is now passed to the graph finalizers so that the deployment
  can be canceled in the final stages
* never write to stopChan. The only correct way to cancel a deployment is
  to close the channel
* pass nil instead of a real channel in unit tests that do not cancel
  the deployment
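
The reasoning behind these rules follows from standard Go channel semantics: a value sent on a channel is received by only one goroutine, while closing the channel releases every receiver, and a receive from a nil channel never becomes ready. The sketch below demonstrates both points; releaseAll and stopped are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// stopped reports whether stopChan has been closed, without blocking.
// For a nil channel the receive case is never ready, so it reports false.
func stopped(stopChan <-chan struct{}) bool {
	select {
	case <-stopChan:
		return true
	default:
		return false
	}
}

// releaseAll starts n goroutines that block on one stop channel and then
// closes it once; the result records which goroutines were released.
func releaseAll(n int) []bool {
	stopChan := make(chan struct{})
	released := make([]bool, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-stopChan // unblocks for every receiver once the channel is closed
			released[i] = true
		}(i)
	}
	close(stopChan) // a single close cancels all waiters
	wg.Wait()
	return released
}

func main() {
	fmt.Println(releaseAll(3)) // [true true true]
	fmt.Println(stopped(nil))  // false
}
```
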
@istalker2 istalker2 force-pushed the unblock-deployment branch from 5730a94 to 06a4167 Compare June 13, 2017 23:24
2 participants