Commit 379b59d

Describing steps to support out-of-tree
1 parent 1d98289 commit 379b59d

1 file changed: +160 -0 lines changed

@@ -0,0 +1,160 @@
---
title: neat-enhancement-idea
authors:
- "@Danil-Grigorev"
reviewers:
- "@enxebre"
- "@elmiko"
- "@mfedosin"
approvers:
- "@enxebre"
- "@elmiko"
- TBD
creation-date: 2020-08-31
last-updated: 2020-09-01
status: implementable
see-also:
- "/enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md"
replaces:
superseded-by:
---

# Add support for out-of-tree cloud provider integration

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Open Questions

TBD

## Summary

This enhancement proposal describes the steps needed to add out-of-tree cloud provider support for the cloud platforms currently supported by OpenShift: AWS, Azure, GCP, vSphere and IBM. The OpenStack provider proposal is described in its own dedicated document.

## Motivation

Upstream is moving towards removing the in-tree cloud provider code from the `k/k` repository and replacing it with out-of-tree providers. This makes the providers independent of `k/k` releases and simplifies development and e2e testing of those plugins. Previously each cloud provider lived in [`legacy-cloud-providers`](https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/legacy-cloud-providers); support and main development now happen in the dedicated cloud-provider [repositories](https://github.com/kubernetes?q=cloud-provider-). Upstream plans to remove in-tree support completely in a future release, approximately in [v1.21](https://kubernetes.slack.com/archives/CPPU7NY8Y/p1589289501020600).

### Goals

- Prepare OpenShift components to accommodate the out-of-tree plugins and remove support for the in-tree implementation.
- Add an installer option to select an `external` cloud provider.
- Deploy cluster infrastructure according to the selected `external` cloud provider option, and allow `cloud-controller` component registration via the provider-specific upstream implementation.

### Non-Goals

- Force immediate exclusion of in-tree support for currently used providers as their out-of-tree counterparts are added.
- Introduce other approaches to rolling out an IPI cluster with `kind` or `cluster-api`.

## Proposal

Add out-of-tree support on top of the current IPI, and provide a seamless transition for all currently supported providers.

This change will be rolled out gradually across all supported cloud providers: AWS, Azure, GCP, vSphere and IBM.

### User Stories

As a cloud developer, I’d like to improve support for cloud features in OpenShift.

#### Story 1:

We’d like to deploy control-plane nodes on spot instances, allowing customers to further reduce the cost of their clusters. To introduce and test this feature, we have to implement it in the kubernetes repository first, then wait for the next release to get a “free” rebase onto the feature. We are also bound to the kubernetes release cycle, which makes the goal hard to achieve if upstream is already in feature freeze. Starting downstream is not an option: the implementation could be rejected upstream, or receive less attention than it needs to land in the release quickly, and our fork would diverge from the source.

#### Story 2:

As an OpenShift developer, I’m assigned a release-blocker BZ concerning the upstream implementation. Proving the value of merging the fix, and making the bug a release blocker upstream, requires careful communication with upstream maintainers and explaining its value to people who are not necessarily familiar with the internal details. Their opinion could differ, delaying the merge of an important fix.

#### Story 3:

We want to extend our team’s responsibilities for upstream cloud providers in the future and gain weight in promoting features upstream. Gaining such weight in the kubernetes repository is harder for both us and the maintainers, as it would mean being granted approval and merge rights for parts of the kubernetes project that do not necessarily fall within our responsibilities or area of knowledge.

#### Story 4:

We’d like to discuss technical details of a specific cloud in a SIG meeting with people who are also involved in development in this domain. That way we gain useful insight into the cloud infrastructure, improve the overall quality of our features, stay on top of new features, and improve relations with maintainers outside our company who nevertheless share a common goal with us.

### Implementation Details/Notes/Constraints

The `external CCM` implementation will be hosted in the OpenShift repository https://github.com/openshift/cluster-cloud-controller-manager-operator. That repository will manage `CCM` provisioning for the cloud providers mentioned above, as well as the `OpenStack` out-of-tree provider implementation.

`Cloud-controller` will be provisioned in the `openshift-cloud-controller` namespace.

Preparation steps common to all providers:
- Add support for recognizing the `external` option in library-go: https://github.com/openshift/library-go/blob/master/pkg/operator/configobserver/cloudprovider/observe_cloudprovider.go#L154. This will disable config parsing and add an `external-cloud-config` option to the `Infrastructure` resource, which will only be used for external providers.
- Port the `cloud-controller` templates from the upstream repositories into the proposed https://github.com/openshift/cluster-cloud-controller-manager-operator repository. Those templates will belong to the newly created `openshift-cloud-controller` namespace.
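
The observer change described in the first preparation step could behave roughly as in the sketch below. This is an illustration only, not library-go code (which is written in Go): the function name `observe_cloud_provider`, the plain-dict representation of the `Infrastructure` resource, the placement of `external-cloud-config` under `spec`, and the simplified `cloud-config` value are all assumptions.

```python
from typing import Dict

# In-tree provider names keyed by platform type (illustrative subset).
IN_TREE_PROVIDERS = {"AWS": "aws", "Azure": "azure", "GCP": "gce", "VSphere": "vsphere"}


def observe_cloud_provider(infrastructure: dict) -> Dict[str, str]:
    """Return the cloud provider settings a component should observe.

    Hypothetical helper: `infrastructure` is a simplified dict view of the
    `Infrastructure` resource, not the real API type.
    """
    spec = infrastructure.get("spec", {})
    status = infrastructure.get("status", {})

    # Assumption: a non-empty `external-cloud-config` selects out-of-tree
    # mode, in which case in-tree config parsing is skipped entirely.
    if spec.get("external-cloud-config"):
        return {"cloud-provider": "external"}

    provider = IN_TREE_PROVIDERS.get(status.get("platform", ""))
    if provider is None:
        # Platform without an in-tree provider: nothing to observe.
        return {}

    observed = {"cloud-provider": provider}
    config_name = spec.get("cloudConfig", {}).get("name")
    if config_name:
        # Simplified: the real observer resolves this to a file path.
        observed["cloud-config"] = config_name
    return observed
```

In this sketch the in-tree path is left untouched, so clusters that never set `external-cloud-config` keep their current behaviour.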

Each provider implementation will need to perform the following minimal set of actions to make the out-of-tree implementation work:
- Add the `external` flag to the `kube-controller-manager` pod in place of the previously used provider. Remove the `cloud-config` option from the template.
- Add the `external` flag to `kubelet`.
- Deploy a `cloud-controller` `DaemonSet` in the `openshift-cloud-controller` namespace.
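
The flag changes in the first two steps can be sketched as a small argument rewrite. The helper `to_external_args` and its `drop_cloud_config` parameter are hypothetical names used only for illustration; `--cloud-provider` and `--cloud-config` are the existing `kube-controller-manager` and `kubelet` flags. How the arguments are actually templated in each operator is not covered here.

```python
from typing import List


def to_external_args(args: List[str], drop_cloud_config: bool = True) -> List[str]:
    """Rewrite component arguments for out-of-tree mode (hypothetical helper).

    Replaces any in-tree `--cloud-provider=<name>` with
    `--cloud-provider=external` and, by default, drops `--cloud-config=...`.
    """
    kept = []
    for arg in args:
        if arg.startswith("--cloud-provider="):
            # The in-tree provider selection is replaced below.
            continue
        if drop_cloud_config and arg.startswith("--cloud-config="):
            # The cloud config is no longer consumed by this component.
            continue
        kept.append(arg)
    kept.append("--cloud-provider=external")
    return kept


# kube-controller-manager: switch the provider and drop the cloud config.
kcm_args = to_external_args(
    ["--cloud-provider=aws", "--cloud-config=/etc/cloud.conf", "--v=2"]
)

# kubelet: the steps above only require the `external` flag, so the
# cloud config argument is left in place in this sketch.
kubelet_args = to_external_args(
    ["--cloud-provider=aws", "--cloud-config=/etc/cloud.conf"],
    drop_cloud_config=False,
)
```

The rewrite is idempotent, so applying it to arguments that are already in `external` mode leaves them unchanged.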

## Design Details

#### Upgrade/Downgrade strategy

The upstream [proposal](https://github.com/kubernetes/enhancements/blob/473d6a094f07e399079d1ce4698d6e52ddcc1567/keps/sig-cloud-provider/20190422-cloud-controller-manager-migration.md#motivation) describes the key points of migrating between in-tree and out-of-tree for HA clusters. It proposes an implementation of cross-namespace leadership delegation, which would help OpenShift preserve HA during upgrades.

If the cluster is not required to be HA, upgrading under CVO management is not a problem, even though it will at some point lead to two replicas of `cloud-controller` running simultaneously.

#### Examples

##### AWS

Main repository: https://github.com/kubernetes/cloud-provider-aws/
Sample manifests: https://github.com/kubernetes/cloud-provider-aws/tree/master/manifests

##### Azure

Main repository: https://github.com/kubernetes-sigs/cloud-provider-azure
Sample manifests: https://github.com/kubernetes-sigs/cloud-provider-azure/tree/master/examples/out-of-tree

##### GCP

Main repository: https://github.com/kubernetes/cloud-provider-gcp
Sample manifests: https://github.com/kubernetes/cloud-provider-gcp/tree/master/deploy

##### vSphere

Main repository: https://github.com/kubernetes/cloud-provider-vsphere
Sample manifests: https://github.com/kubernetes/cloud-provider-vsphere/tree/master/manifests/controller-manager

##### Demo

- Running vSphere `cloud-controller` out-of-tree: TBD

### Test Plan

- Each provider should have a working CI implementation at the time of migration, which will assist in testing provisioning with out-of-tree support enabled. This requires vSphere CI to be up and running, and upgrade support for the other providers.
- Ensure that the transition between in-tree and out-of-tree is handled by setting the `Infrastructure` `external-cloud-config` field.
- TBD

##### Removing a deprecated feature

Upstream currently plans to remove support for in-tree providers in kubernetes 1.21 (OCP 4.8), based on the Slack [conversation](https://kubernetes.slack.com/archives/CPPU7NY8Y/p1589289501020600). The upstream enhancement [describes](https://github.com/kubernetes/enhancements/blob/473d6a094f07e399079d1ce4698d6e52ddcc1567/keps/sig-cloud-provider/20190125-removing-in-tree-providers.md) the steps of this process. Our goal is to have a working alternative to the in-tree providers, on par with the old implementation, ready before the upstream release that removes this feature from the main `k/k` repository.

## Timeline

Follow upstream discussions and implementation/releases.

### Kubernetes 1.18

- vSphere support graduated to beta: https://github.com/kubernetes/enhancements/issues/670

### Kubernetes 1.19

- Azure support goes into beta: https://github.com/kubernetes/enhancements/issues/667

## Infrastructure Needed

(Possibly) Forks of the out-of-tree provider repositories:
- [AWS](https://github.com/kubernetes/cloud-provider-aws)
- [GCP](https://github.com/kubernetes/cloud-provider-gcp)
- [Azure](https://github.com/kubernetes-sigs/cloud-provider-azure)
- [vSphere](https://github.com/kubernetes/cloud-provider-vsphere)
