Commit b08037c

Node Agent Configuration design and design template (#1101)
New configuration for the file system backup & restore design that will be used by OADP and allows choosing the Restic or Kopia uploader type. `_template.md` is borrowed from the upstream vmware-tanzu/velero/main/design/_template.md. Signed-off-by: Michal Pryc <[email protected]>
1 parent b5c4404 commit b08037c

2 files changed: +278 -0 lines changed

docs/design/_template.md

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
# Design proposal template `<replace with your proposal's title>`
2+
3+
_Note_: The preferred style for design documents is one sentence per line.
4+
*Do not wrap lines*.
5+
This aids in review of the document, as changes to a line are not obscured by the reflowing those changes caused, and has the side effect of avoiding debate about one or two spaces after a period.
6+
7+
_Note_: The name of the file should follow the name pattern `<short meaningful words joined by '-'>_design.md`, e.g.:
8+
`listener-design.md`.
9+
10+
## Abstract
11+
One to two sentences that describe the goal of this proposal and the problem being solved by the proposed change.
12+
The reader should be able to tell by the title, and the opening paragraph, if this document is relevant to them.
13+
14+
## Background
15+
One to two paragraphs of exposition to set the context for this proposal.
16+
17+
## Goals
18+
- A short list of things which will be accomplished by implementing this proposal.
19+
- Two things is ok.
20+
- Three is pushing it.
21+
- More than three goals suggests that the proposal's scope is too large.
22+
23+
## Non Goals
24+
- A short list of items which are:
25+
- a. out of scope
26+
- b. follow on items which are deliberately excluded from this proposal.
27+
28+
29+
## High-Level Design
30+
One to two paragraphs that describe the high level changes that will be made to implement this proposal.
31+
32+
## Detailed Design
33+
A detailed design describing how the changes to the product should be made.
34+
35+
The names of types, fields, interfaces, and methods should be agreed on here, not debated in code review.
36+
The same applies to changes in CRDs, YAML examples, and so on.
37+
38+
Ideally the changes should be made in sequence so that the work required to implement this design can be done incrementally, possibly in parallel.
39+
40+
## Alternatives Considered
41+
If there are alternative high level or detailed designs that were not pursued they should be called out here with a brief explanation of why they were not pursued.
42+
43+
## Security Considerations
44+
If this proposal has an impact on the security of the product, its users, or data stored or transmitted via the product, that impact must be addressed here.
45+
46+
## Compatibility
47+
A discussion of any compatibility issues that need to be considered.
48+
49+
## Implementation
50+
A description of the implementation, timelines, and any resources that have agreed to contribute.
51+
52+
## Open Issues
53+
A discussion of issues relating to this proposal for which the author does not know the solution. This section may be omitted if there are none.
Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
1+
# NodeAgentConfig configuration with restic/kopia
2+
Date: 2023-07-25
3+
4+
## Abstract
5+
6+
New configuration for file system backup & restore that will be used by OADP and allows choosing the Restic or Kopia uploader type.
7+
8+
## Background
9+
10+
For file system backup and restore, Velero may use Restic or Kopia as the uploader mechanism. The Velero community [performed](https://velero.io/docs/main/performance-guidance/) a number of tests to compare those mechanisms. In many cases the Kopia uploader performs much better, and as such it should be added to the OADP operator.
11+
12+
Please refer to the upstream [kopia uploader integration design](https://github.com/vmware-tanzu/velero/blob/main/design/unified-repo-and-kopia-integration/unified-repo-and-kopia-integration.md) for the underlying Velero design and the backup & restore workflow.
13+
14+
The upstream [kopia](https://github.com/kopia/kopia) project is used by Velero and is configured by OADP as this design proposes.
15+
16+
## Goals
17+
18+
- A new option `nodeAgentConfig` to allow configuration of restic or kopia uploader
19+
- Backwards compatibility with the current OADP configuration schema options, namely `restic`
20+
- Preparation for the future deprecation of the current `restic` configuration option (from OADP 1.4+)
21+
- Allow new schema(s) to be used by the datamover node agent
22+
- Enablement of the datamover node agent
23+
- Deprecation of the `restic` configuration option
24+
- New environment option `FS_PV_HOSTPATH` that is used as a replacement for `RESTIC_PV_HOSTPATH`. See [Compatibility](#compatibility) section for more details.
25+
- Removal of the `restic-restore-action-config` ConfigMap with direct replacement by `fs-restore-action-config`. See [Compatibility](#compatibility) section for more details.
26+
27+
## Non Goals
28+
- Removal of the `restic` configuration option
29+
- Removal of the `RESTIC_PV_HOSTPATH` environment option
30+
- Support for downgrading the OADP operator with the new configuration options
31+
- E2E tests for the `kopia` or `restic` uploader. They should be added in the near future to cover DPA testing of the new fields; backup/restore E2E tests that exercise both kopia and restic (and eventually datamover) using this new struct are also needed.
32+
33+
## High-Level Design
34+
35+
Since the new `nodeAgentConfig` configuration option is a sibling of the `restic` one, a new common structure, `NodeAgentCommonFields`, will be created. It will be exactly the same data structure as the current `ResticConfig` and will be used by both `Restic` and the new `NodeAgentConfig`. The only difference between `NodeAgentConfig` and `Restic` is the addition of one `UploaderType` option to `NodeAgentConfig`, which will be either `kopia` or `restic`.
36+
37+
When `nodeAgentConfig` is used, the `UploaderType` option is required, so the user has to select either `kopia` or `restic`.
38+
39+
## Detailed Design
40+
41+
42+
### New data structures
43+
44+
A new structure will be added that embeds `NodeAgentCommonFields` inline and extends it with one new parameter, `UploaderType`:
45+
46+
```go
type NodeAgentConfig struct {
	// Embedding NodeAgentCommonFields
	// +optional
	NodeAgentCommonFields `json:",inline"`

	// The type of uploader to transfer the data of pod volumes, the supported values are 'restic' or 'kopia'
	// +kubebuilder:validation:Enum=restic;kopia
	// +kubebuilder:validation:Required
	UploaderType string `json:"uploaderType"`
}
```
58+
59+
The `NodeAgentCommonFields` structure maps one-to-one to the current `Restic` structure.
60+
61+
```go
type NodeAgentCommonFields struct {
	// enable defines a boolean pointer whether we want the daemonset to
	// exist or not
	// +optional
	Enable *bool `json:"enable,omitempty"`
	// supplementalGroups defines the linux groups to be applied to the NodeAgent Pod
	// +optional
	SupplementalGroups []int64 `json:"supplementalGroups,omitempty"`
	// timeout defines the NodeAgent timeout, default value is 1h
	// +optional
	Timeout string `json:"timeout,omitempty"`
	// Pod specific configuration
	PodConfig *PodConfig `json:"podConfig,omitempty"`
}
```
77+
78+
The current `Restic` structure will embed `NodeAgentCommonFields` without any additional options or changes.
79+
80+
The above `NodeAgentConfig` is a member of `ApplicationConfig`, which already includes `ResticConfig`. We do not replace `ResticConfig`; instead, for backwards compatibility, we add a new `NodeAgentConfig` parameter:
81+
82+
```go
// ApplicationConfig defines the configuration for the Data Protection Application
type ApplicationConfig struct {
	Velero *VeleroConfig `json:"velero,omitempty"`
	// (deprecation warning) ResticConfig is the configuration for restic server.
	// Restic is for backwards compatibility and is replaced by the nodeAgentConfig
	// Restic will be removed with the OADP 1.4
	// +kubebuilder:deprecatedversion:warning=1.3
	// +optional
	Restic *ResticConfig `json:"restic,omitempty"`

	// NodeAgentConfig is needed to allow selection between kopia or restic
	// +kubebuilder:validation:Optional
	NodeAgent *NodeAgentConfig `json:"nodeAgentConfig,omitempty"`
}
```
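To illustrate how the embedded common fields surface in the serialized configuration, here is a minimal sketch using only the Go standard library. The structs are trimmed copies of the ones above (`SupplementalGroups` and `PodConfig` are left out for brevity), and `marshalNodeAgent` is a hypothetical helper, not part of OADP. It relies on `encoding/json` flattening an anonymous struct field whose tag has no name, so `uploaderType` appears next to the common fields rather than in a nested object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeAgentCommonFields is a trimmed copy of the structure from this design.
type NodeAgentCommonFields struct {
	Enable  *bool  `json:"enable,omitempty"`
	Timeout string `json:"timeout,omitempty"`
}

// NodeAgentConfig embeds the common fields and adds the uploader type.
type NodeAgentConfig struct {
	NodeAgentCommonFields `json:",inline"`
	UploaderType          string `json:"uploaderType"`
}

// marshalNodeAgent is a hypothetical helper that renders the config as JSON.
func marshalNodeAgent(cfg NodeAgentConfig) (string, error) {
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	enable := true
	out, err := marshalNodeAgent(NodeAgentConfig{
		NodeAgentCommonFields: NodeAgentCommonFields{Enable: &enable, Timeout: "1h"},
		UploaderType:          "kopia",
	})
	if err != nil {
		panic(err)
	}
	// The embedded fields are flattened into the same JSON object.
	fmt.Println(out) // {"enable":true,"timeout":"1h","uploaderType":"kopia"}
}
```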
98+
99+
### Configuration YAML options
100+
101+
The `nodeAgentConfig` and `restic` configuration options will be exactly the same and will contain the same options as the current `restic` schema, with an additional `uploaderType` field under `nodeAgentConfig`. The part of the YAML that presents the additions:
102+
103+
```yaml
configuration:
  description: configuration is used to configure the data protection application's server config
  properties:

    nodeAgentConfig:
      description: NodeAgent is needed to allow selection between kopia or restic
      properties:

        [...] // <Same as all current restic configuration options>

        uploaderType:
          description: The type of uploader to transfer the data of pod volumes, the supported values are 'restic' or 'kopia'
          enum:
          - restic
          - kopia
          type: string
      type: object

    restic:
      description: (deprecation warning) Restic is for backwards compatibility and will be removed with the OADP 1.4+. Use nodeAgentConfig instead.
      properties:

        [...] // <Same as all current restic configuration options>

```
129+
130+
### Validations
131+
132+
It is important to disallow the user from using both the `restic` and `nodeAgentConfig` options together in OADP 1.3 (error state).
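A minimal sketch of such a check, assuming it runs against the `ApplicationConfig` structure from this design. The function name `validateNodeAgentOptions`, the error text, and the stripped-down struct definitions are hypothetical and only serve the illustration; the actual validation in OADP may be placed and worded differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Stripped-down stand-ins for the structures in this design.
type ResticConfig struct{}

type NodeAgentConfig struct {
	UploaderType string
}

type ApplicationConfig struct {
	Restic    *ResticConfig
	NodeAgent *NodeAgentConfig
}

// errBothNodeAgentOptions is a hypothetical error for the error state above.
var errBothNodeAgentOptions = errors.New("restic and nodeAgentConfig cannot be used together")

// validateNodeAgentOptions rejects a configuration that sets both options.
// Setting only one of them, or neither, is accepted.
func validateNodeAgentOptions(cfg ApplicationConfig) error {
	if cfg.Restic != nil && cfg.NodeAgent != nil {
		return errBothNodeAgentOptions
	}
	return nil
}

func main() {
	both := ApplicationConfig{
		Restic:    &ResticConfig{},
		NodeAgent: &NodeAgentConfig{UploaderType: "kopia"},
	}
	fmt.Println(validateNodeAgentOptions(both))

	onlyNew := ApplicationConfig{NodeAgent: &NodeAgentConfig{UploaderType: "kopia"}}
	fmt.Println(validateNodeAgentOptions(onlyNew))
}
```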
133+
134+
### Deprecation of the `restic` configuration option in OADP 1.3 and its future removal
135+
136+
The `restic` configuration option will be deprecated in OADP 1.3 and removed in a future OADP release. Restic backup functionality is still fully supported, but restic users are encouraged to use the new `nodeAgentConfig` struct instead so that they won't be impacted on upgrade when the legacy struct is removed in a future release.
137+
138+
A few alternatives were considered (see [Alternatives Considered](#alternatives-considered)). The deprecation information will be presented to the user in the following places:
139+
1. The description of the `restic` property will have the deprecation warning.
140+
2. If `restic` is used, the DPA `Events` will contain a `warning` message informing the user that `restic` is deprecated.
141+
3. If `restic` is used, the application log will have a `warning` message informing the user that `restic` is deprecated.
142+
4. Release notes for OADP 1.3 will contain information about the new configuration option `nodeAgentConfig` that should be used instead of `restic`.
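The runtime warnings in the list above (DPA events and the application log) could be driven from a single check. The following is a rough sketch under stated assumptions: the `warner` interface, the `memoryWarner` type, the `warnIfResticUsed` function, and the message wording are all hypothetical stand-ins. A real implementation would plug in the operator's event recorder and logger instead.

```go
package main

import "fmt"

// resticDeprecationMessage uses hypothetical wording for the warning.
const resticDeprecationMessage = "the restic configuration option is deprecated, use nodeAgentConfig instead"

// warner is a hypothetical abstraction over a warning destination,
// for example the DPA event stream or the application log.
type warner interface {
	Warn(msg string)
}

// memoryWarner collects warnings in memory; it stands in for a real sink.
type memoryWarner struct {
	messages []string
}

func (m *memoryWarner) Warn(msg string) {
	m.messages = append(m.messages, msg)
}

// warnIfResticUsed sends the deprecation message to every sink,
// but only when the legacy restic option is set.
func warnIfResticUsed(resticSet bool, sinks ...warner) {
	if !resticSet {
		return
	}
	for _, s := range sinks {
		s.Warn(resticDeprecationMessage)
	}
}

func main() {
	events := &memoryWarner{}
	logs := &memoryWarner{}
	warnIfResticUsed(true, events, logs)
	fmt.Println(len(events.messages), len(logs.messages))
}
```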
143+
144+
145+
## Alternatives Considered
146+
147+
- leave `restic` as the only option and do not allow the use of `kopia`
148+
- remove the `restic` and use `kopia` as the default and only available option
149+
150+
### Schema structure
151+
152+
Alternative schema structures for `Restic` and the new `NodeAgentConfig` were considered. However, because the structures may be used directly by other `go` applications outside of the API schema, the decision was to use the structure described in the [New data structures](#new-data-structures) section. The alternatives were:
153+
- Keeping `Restic` as is and including inline into `NodeAgentConfig`
154+
- Duplicating all the fields from `Restic` within `NodeAgentConfig`
155+
- Moving all the fields from `Restic` to `NodeAgentConfig` and ignoring the new field, which would then appear in `Restic`
156+
157+
### Deprecation warnings
158+
To inform the user about the deprecation of `restic`, we considered a few additional options:
159+
- Embedding a `warning` within the OpenShift console. This would be the nicest way, as push notifications would appear in the main OCP console; however, it would require a bigger implementation effort.
160+
161+
- Creating custom information on the DPA object itself, so that whenever a user describes a created object that uses `restic`, the deprecation warning appears in the `events` section. This option also requires publishing custom `events` from the reconcile function, and it still requires the user to pull that information rather than having it pushed.
162+
163+
- Using kubebuilder annotation
164+
```
// +kubebuilder:deprecatedversion:warning=<string>
```
167+
This option does not apply to one particular configuration field but to the entire CRD, which does not fit our needs: we do not want to deprecate the entire CRD and create a new version of it, only rename one field.
168+
169+
- Adding additional Reconcile condition with warning message
170+
Currently there are two `Reasons` that the main DPA reconcile status may have:
171+
```
const ReconciledReasonComplete = "Complete"
const ReconciledReasonError = "Error"
```
175+
176+
There is also a message attached to each of these `Reasons`. For the `Error` reason, the error message is propagated, and for the `Complete` reason there is one string:
177+
```
const ReconcileCompleteMessage = "Reconcile complete"
```
180+
181+
An additional Reconcile reason with its own message could be added. It would be similar to `Complete`, meaning the operator is functioning properly, but with some messages that require user attention.
182+
```
const ReconciledReasonWarning = "Warning"
```
185+
186+
This, however, would require refactoring the `Reconcile` functions, which would require a separate design doc and a separate implementation.
187+
188+
## Security Considerations
189+
190+
The enablement of an extra upload mechanism could potentially introduce security implications due to the integration of a new backend, which might contain vulnerabilities.
191+
192+
## Compatibility
193+
194+
### Current `restic` configuration option
195+
196+
This design does not change the current configuration options; however, it proposes adding a deprecation warning to the current `Restic` schema. The `Restic` configuration option will be removed from OADP 1.4.
197+
198+
### RESTIC_PV_HOSTPATH environment option
199+
200+
This design does not remove `RESTIC_PV_HOSTPATH`, which is used to redefine the default `/var/lib/kubelet/pods`; however, it adds the environment variable `FS_PV_HOSTPATH`, which may be used the same way as `RESTIC_PV_HOSTPATH`.
201+
202+
`RESTIC_PV_HOSTPATH` takes precedence over `FS_PV_HOSTPATH`.
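The precedence rule can be sketched as a small lookup function. This is an illustration only: `resolvePVHostPath` is a hypothetical name, and treating an empty variable the same as an unset one is an assumption that the design does not state. The lookup function is passed in so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// defaultPVHostPath is the default named in this design.
const defaultPVHostPath = "/var/lib/kubelet/pods"

// resolvePVHostPath returns the host path for pod volumes.
// RESTIC_PV_HOSTPATH wins over FS_PV_HOSTPATH; if neither is set
// (or both are empty, by assumption), the default is used.
func resolvePVHostPath(getenv func(string) string) string {
	if v := getenv("RESTIC_PV_HOSTPATH"); v != "" {
		return v
	}
	if v := getenv("FS_PV_HOSTPATH"); v != "" {
		return v
	}
	return defaultPVHostPath
}

func main() {
	// In the operator this would read the real process environment.
	fmt.Println(resolvePVHostPath(os.Getenv))
}
```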
203+
204+
### Replacing `restic-restore-action-config` with `fs-restore-action-config`
205+
206+
The `restic-restore-action-config` was [removed](https://github.com/vmware-tanzu/helm-charts/commit/7484d8e8365ab91da698e5b5d6346153cde30af4) in Velero v1.10.0. It should also be removed in OADP and replaced with the [fs-restore-action-config](https://github.com/vmware-tanzu/helm-charts/blob/c15d13a916c255c93ef48e8bf9c59a3ba198d5ca/charts/velero/templates/NOTES.txt#L62).
207+
208+
## Implementation
209+
210+
Implementation for the new data structure will follow the [New data structures](#new-data-structures) design.
211+
212+
There are two uses of the additional `--uploader-type` flag, which specifies the restic or kopia uploader type.
213+
214+
- The first is within the call to the method from the `github.com/vmware-tanzu/velero` package:
215+
```go
func Deployment(namespace string, opts ...podTemplateOption) *appsv1.Deployment
```
218+
219+
- The second is within the OADP `pkg/velero/server/args.go`, which is used for testing and development purposes.
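As a rough illustration of how the selected uploader could be turned into the `--uploader-type` argument, consider the sketch below. The helper name `uploaderTypeArg` and the `--flag=value` formatting are assumptions made for this example; they do not describe the actual code in `pkg/velero/server/args.go` or in the Velero package. The check against the two supported values mirrors the enum on `UploaderType`.

```go
package main

import "fmt"

// uploaderTypeArg is a hypothetical helper that builds the server argument
// for the selected uploader. Only the two values allowed by the
// UploaderType enum in this design are accepted.
func uploaderTypeArg(uploaderType string) (string, error) {
	switch uploaderType {
	case "restic", "kopia":
		return "--uploader-type=" + uploaderType, nil
	default:
		return "", fmt.Errorf("unsupported uploaderType %q", uploaderType)
	}
}

func main() {
	arg, err := uploaderTypeArg("kopia")
	if err != nil {
		panic(err)
	}
	fmt.Println(arg) // --uploader-type=kopia
}
```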
220+
221+
We will also rename the `restic.go` module to `nodeagent.go`, along with the relevant functions within `restic.go`, so that they are more generic.
222+
223+
## Open Issues
224+
225+
- n/a
