- Setting up
- Developing
- Install go
  - Get the latest patch version of go v1.16.
- Install jq
  - `brew install jq` on macOS.
  - `chocolatey install jq` on Windows.
  - `sudo apt install jq` on Ubuntu Linux.
- Install gettext package
  - `brew install gettext && brew link --force gettext` on macOS.
  - Follow the install instructions on Windows.
  - `sudo apt install gettext` on Ubuntu Linux.
- Install KIND
  - `GO111MODULE="on" go get sigs.k8s.io/kind@<version>`.
- Install Kustomize
  - `brew install kustomize` on macOS.
  - `choco install kustomize` on Windows.
  - Follow the install instructions on Linux.
- Install Python 3.x or 2.7.x, if neither is already installed.
- Install make.
- Install timeout
  - `brew install coreutils` on macOS.
```shell
go get -d sigs.k8s.io/cluster-api-provider-azure
cd "$(go env GOPATH)/src/sigs.k8s.io/cluster-api-provider-azure"
```
This provider is modeled after the upstream Cluster API project. To get familiar with Cluster API resources, concepts and conventions (such as CAPI and CAPZ), refer to the Cluster API Book.
Part of running cluster-api-provider-azure is generating manifests to run. Generating dev manifests allows you to test dev images instead of the default releases.
Any public container registry can be leveraged for storing cluster-api-provider-azure container images.
Change some code!
This repository uses Go Modules to track and vendor dependencies.
To pin a new dependency:
- Run `go get <repository>@<version>`.
- (Optional) Add a `replace` statement in `go.mod`.
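For example, a hypothetical pin (the module path, version, and fork below are illustrative):

```shell
go get github.com/pkg/[email protected]
# Optionally, point the module at a fork by adding to go.mod:
#   replace github.com/pkg/errors => github.com/myfork/errors v0.9.1
```

Afterwards, run `make modules` so `go.mod` and `go.sum` stay consistent.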
Makefile targets and scripts are offered to work with go modules:
- `make verify-modules` checks whether go module files are out of date.
- `make modules` runs `go mod tidy` to ensure proper vendoring.
- `hack/ensure-go.sh` checks that the Go version and environment variables are properly set.
Your environment must have the Azure credentials as outlined in the getting started prerequisites section.
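For example, assuming a service principal, those credentials are typically exported as follows (all values are placeholders):

```shell
export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"
export AZURE_TENANT_ID="<Tenant>"
export AZURE_CLIENT_ID="<AppId>"
export AZURE_CLIENT_SECRET="<Password>"
```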
Install Tilt:
- `brew install tilt-dev/tap/tilt` on macOS or Linux
- `scoop bucket add tilt-dev https://github.com/tilt-dev/scoop-bucket` & `scoop install tilt` on Windows
- For alternatives, you can follow the installation instructions for macOS, Linux or Windows.

After the installation is done, verify that you have installed it correctly with: `tilt version`
Install Helm:
- `brew install helm` on macOS
- `choco install kubernetes-helm` on Windows
- Follow the install instructions on Linux.

Helm is required to successfully set up Tilt.
Both of the Tilt setups below will get you started developing CAPZ in a local kind cluster. The main difference is the number of components you will build from source and the scope of the changes you'd like to make. If you only want to make changes in CAPZ, then follow CAPZ instructions. This will save you from having to build all of the images for CAPI, which can take a while. If the scope of your development will span both CAPZ and CAPI, then follow the CAPI and CAPZ instructions.
If you want to develop in CAPZ and get a local development cluster working quickly, this is the path for you.
From the root of the CAPZ repository and after configuring the environment variables, you can run the following to generate your `tilt-settings.json` file:
```shell
cat <<EOF > tilt-settings.json
{
  "kustomize_substitutions": {
    "AZURE_SUBSCRIPTION_ID_B64": "$(echo "${AZURE_SUBSCRIPTION_ID}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_TENANT_ID_B64": "$(echo "${AZURE_TENANT_ID}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_CLIENT_SECRET_B64": "$(echo "${AZURE_CLIENT_SECRET}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_CLIENT_ID_B64": "$(echo "${AZURE_CLIENT_ID}" | tr -d '\n' | base64 | tr -d '\n')"
  }
}
EOF
```
To build a kind cluster and start Tilt, just run:
```shell
make tilt-up
```
By default, the Cluster API components deployed by Tilt have experimental features turned off. If you would like to enable these features, add `extra_args` as specified in The Cluster API Book.
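As a sketch of the shape (the provider key and flag below are illustrative; the book lists the supported arguments), `extra_args` maps a provider name to additional command-line arguments for its controller:

```json
{
  "kustomize_substitutions": {
    "AZURE_SUBSCRIPTION_ID_B64": "<base64-encoded-subscription-id>"
  },
  "extra_args": {
    "azure": ["--v=4"]
  }
}
```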
Once your kind management cluster is up and running, you can deploy a workload cluster.
You can also deploy a flavor cluster as a local tilt resource.
To tear down the kind cluster built by the command above, just run:
```shell
make kind-reset
```
If you want to develop in both CAPI and CAPZ at the same time, then this is the path for you.
To use Tilt for a simplified development workflow, follow the instructions in the cluster-api repo. The instructions will walk you through cloning the Cluster API (CAPI) repository and configuring Tilt to use kind
to deploy the cluster api management components.
You may wish to check out the correct version of CAPI to match the version used in CAPZ.
Note that `tilt up` will be run from the `cluster-api` repository directory and the `tilt-settings.json` file will point back to the `cluster-api-provider-azure` repository directory. Any changes you make to the source code in the `cluster-api` or `cluster-api-provider-azure` repositories will automatically be redeployed to the `kind` cluster.
After you have cloned both repositories, your folder structure should look like:
```
|-- src/cluster-api-provider-azure
|-- src/cluster-api (run `tilt up` here)
```
After configuring the environment variables, run the following to generate your `tilt-settings.json` file:
```shell
cat <<EOF > tilt-settings.json
{
  "default_registry": "${REGISTRY}",
  "provider_repos": ["../cluster-api-provider-azure"],
  "enable_providers": ["azure", "docker", "kubeadm-bootstrap", "kubeadm-control-plane"],
  "kustomize_substitutions": {
    "AZURE_SUBSCRIPTION_ID_B64": "$(echo "${AZURE_SUBSCRIPTION_ID}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_TENANT_ID_B64": "$(echo "${AZURE_TENANT_ID}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_CLIENT_SECRET_B64": "$(echo "${AZURE_CLIENT_SECRET}" | tr -d '\n' | base64 | tr -d '\n')",
    "AZURE_CLIENT_ID_B64": "$(echo "${AZURE_CLIENT_ID}" | tr -d '\n' | base64 | tr -d '\n')"
  }
}
EOF
```
`$REGISTRY` should be in the format `docker.io/<dockerhub-username>`.
The cluster-api management components that are deployed are configured in the `/config` folder of each repository respectively. Making changes to those files will trigger a redeploy of the management cluster components.
After your kind management cluster is up and running with Tilt, you can configure workload cluster settings and deploy a workload cluster with the following:
```shell
make create-workload-cluster
```
To delete the cluster:
```shell
make delete-workload-cluster
```
Check out the troubleshooting guide for common errors you might run into.
The CAPZ controller emits tracing and metrics data. When run in Tilt, the KinD management cluster is provisioned with development deployments of OpenTelemetry for collecting distributed traces, Jaeger for viewing traces, and Prometheus for scraping and visualizing metrics.
The OpenTelemetry, Jaeger, and Prometheus deployments are for development purposes only. These illustrate the hooks for tracing and metrics, but lack the robustness of production cluster deployments. For example, the Jaeger "all-in-one" component only keeps traces in memory, not in a persistent store.
To view traces in the Jaeger interface, wait until the Tilt cluster is fully initialized. Then open the Tilt web interface, select the "traces: jaeger-all-in-one" resource, and click "View traces" near the top of the screen. Or visit http://localhost:16686/ in your browser.
To view traces in App Insights, follow the tracing documentation before running `make tilt-up`. Then open the Azure Portal in your browser. Find the App Insights resource you specified in `AZURE_INSTRUMENTATION_KEY`, choose "Transaction search" on the left, and click "Refresh" to see recent trace data.
To view metrics in the Prometheus interface, open the Tilt web interface, select the "metrics: prometheus-operator" resource, and click "View metrics" near the top of the screen. Or visit http://localhost:9090/ in your browser.
The steps below are provided in a convenient script in `hack/create-dev-cluster.sh`. Be sure to set `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_SUBSCRIPTION_ID`, and `AZURE_TENANT_ID` before running. Optionally, you can override the different cluster configuration variables. For example, to override the workload cluster name:
```shell
CLUSTER_NAME=<my-capz-cluster-name> ./hack/create-dev-cluster.sh
```
NOTE: `CLUSTER_NAME` can only include letters, numbers, and hyphens, and can't be longer than 44 characters.
1. To build images with custom tags, run `make docker-build` as follows:

   ```shell
   export REGISTRY="<container-registry>"
   export MANAGER_IMAGE_TAG="<image-tag>" # optional - defaults to `dev`
   PULL_POLICY=IfNotPresent make docker-build
   ```

2. (Optional) Push your docker images:

   2.1. Log in to your container registry using `docker login`, e.g., `docker login quay.io`.

   2.2. Push to your custom image registry:

   ```shell
   REGISTRY="<container-registry>" MANAGER_IMAGE_TAG="<image-tag>" make docker-push
   ```

NOTE: `make create-cluster` will fetch the manager image locally and load it onto the kind cluster if it is present.
Here is a list of required configuration parameters (the full list is available in `templates/cluster-template.yaml`):
```shell
# Cluster settings.
export CLUSTER_NAME="capz-cluster"
export AZURE_VNET_NAME=${CLUSTER_NAME}-vnet

# Azure settings.
export AZURE_LOCATION="southcentralus"
export AZURE_RESOURCE_GROUP=${CLUSTER_NAME}
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

# Machine settings.
export CONTROL_PLANE_MACHINE_COUNT=3
export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"
export WORKER_MACHINE_COUNT=2
export KUBERNETES_VERSION="v1.21.2"

# Generate an SSH key.
# If you want to provide your own key, skip this step and set AZURE_SSH_PUBLIC_KEY_B64 to your existing file.
SSH_KEY_FILE=.sshkey
rm -f "${SSH_KEY_FILE}" 2>/dev/null
ssh-keygen -t rsa -b 2048 -f "${SSH_KEY_FILE}" -N '' 1>/dev/null
echo "Machine SSH key generated in ${SSH_KEY_FILE}"

# On Linux, the SSH key needs to be base64-encoded because we use the Azure API to set it.
# Windows doesn't support setting SSH keys, so we use cloudbase-init to set it, which doesn't require base64.
export AZURE_SSH_PUBLIC_KEY_B64=$(cat "${SSH_KEY_FILE}.pub" | base64 | tr -d '\r\n')
export AZURE_SSH_PUBLIC_KEY=$(cat "${SSH_KEY_FILE}.pub" | tr -d '\r\n')
```
The cluster template requires the use of `clusterctl` to create the cluster, or the use of `envsubst` to replace these values.
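For reference, a minimal sketch of what the substitution step looks like (not necessarily the exact command the make targets run, and assuming `envsubst` is on your PATH and the variables above are exported):

```shell
envsubst < templates/cluster-template.yaml | kubectl apply -f -
```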
Ensure dev environment has been reset:
```shell
make clean kind-reset
```
Create the cluster:
```shell
make create-cluster
```
Check out the troubleshooting guide for common errors you might run into.
Telemetry is the key to operational transparency. We strive to provide insight into the internal behavior of the system through observable traces and metrics.
Distributed tracing provides a hierarchical view of how and why an event occurred. CAPZ is instrumented to trace each controller reconcile loop. When the reconcile loop begins, a trace span begins and is stored in the loop's `context.Context`.
As the context is passed on to functions below, new spans are created, tied to the parent span by the parent span ID.
The spans form a hierarchical representation of the activities in the controller.
These spans can also be propagated across service boundaries: the span context can be passed on through metadata such as HTTP headers. Propagating span context creates a distributed, causal relationship between services and functions.
For tracing, we use OpenTelemetry.
Here is an example of starting a span at the beginning of a controller reconcile.
```go
ctx, span := tele.Tracer().Start(ctx, "controllers.AzureMachineReconciler.Reconcile",
    trace.WithAttributes(
        attribute.String("namespace", req.Namespace),
        attribute.String("name", req.Name),
        attribute.String("kind", "AzureMachine"),
    ))
defer span.End()
```
The code above creates a context with a new span stored in the `context.Context` value bag. If a span already existed in the `ctx` argument, then the new span takes on the parent ID of the existing span; otherwise, the new span becomes a "root span", one that does not have a parent. The span is also created with labels, or tags, which provide metadata about the span and can be used to query in many distributed tracing systems.
You should consider adding tracing if your func accepts a context.
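As a minimal sketch (the helper names below are illustrative, not actual CAPZ code), a child span in such a function looks like this:

```go
// reconcileTags is a hypothetical helper; any function that accepts a
// context.Context can start a child span tied to the caller's span.
func (r *AzureMachineReconciler) reconcileTags(ctx context.Context) error {
	ctx, span := tele.Tracer().Start(ctx, "controllers.AzureMachineReconciler.reconcileTags")
	defer span.End()

	// Work that uses ctx (API calls, nested helpers) continues the same
	// trace. ensureTags is likewise hypothetical.
	return r.ensureTags(ctx)
}
```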
Metrics provide quantitative data about the operations of the controller. This includes cumulative data like counters, single numerical values like gauges, and distributions of counts/samples like histograms and summaries.
In CAPZ we expose metrics using the Prometheus client. The Kubebuilder project provides a guide for metrics and for exposing new ones.
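As a hedged sketch of the kubebuilder pattern for exposing a new metric (the metric name and labels below are assumptions, not an existing CAPZ metric), a collector is registered with controller-runtime's global registry:

```go
package controllers

import (
	"github.com/prometheus/client_golang/prometheus"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

// reconcileErrors is an illustrative counter partitioned by controller.
var reconcileErrors = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "capz_reconcile_errors_total",
		Help: "Total number of reconcile errors, partitioned by controller.",
	},
	[]string{"controller"},
)

func init() {
	// metrics.Registry is controller-runtime's global Prometheus registry;
	// anything registered here is served on the manager's /metrics endpoint.
	metrics.Registry.MustRegister(reconcileErrors)
}
```

Incrementing it is then a one-liner, e.g. `reconcileErrors.WithLabelValues("azuremachine").Inc()`.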
Pull requests and issues are highly encouraged! If you're interested in submitting PRs to the project, please be sure to run some initial checks prior to submission:
```shell
make lint # Runs a suite of quick scripts to check code structure
make test # Runs tests on the Go code
```
`make test` executes the project's unit tests. These tests do not stand up a Kubernetes cluster, nor do they have external dependencies. Mocks for the services tests are generated using GoMock. To generate the mocks you can run `make generate-go`.
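For reference, a minimal sketch of how a generated mock is typically used in a test (`mock_mysvc` and `NewMockClient` stand in for a package and constructor produced by `make generate-go`; they are not real CAPZ identifiers):

```go
package controllers_test

import (
	"testing"

	"github.com/golang/mock/gomock"
)

func TestWithMocks(t *testing.T) {
	// The controller verifies that all declared expectations were met.
	mockCtrl := gomock.NewController(t)
	defer mockCtrl.Finish()

	// client := mock_mysvc.NewMockClient(mockCtrl)
	// client.EXPECT().Get(gomock.Any(), "my-rg", "my-vm").Return(nil, nil)
	// ...exercise the code under test with the mock...
}
```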
To run E2E locally, set `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_SUBSCRIPTION_ID`, and `AZURE_TENANT_ID`, and run:

```shell
./scripts/ci-e2e.sh
```
You can optionally set the following variables:
| Variable | Description | Default |
|---|---|---|
| `E2E_CONF_FILE` | The path of the E2E configuration file. | `${GOPATH}/src/sigs.k8s.io/cluster-api-provider-azure/test/e2e/config/azure-dev.yaml` |
| `SKIP_CLEANUP` | Set to `true` if you do not want the bootstrap and workload clusters to be cleaned up after running E2E tests. | `false` |
| `SKIP_CREATE_MGMT_CLUSTER` | Skip management cluster creation. If skipping management cluster creation, you must specify `KUBECONFIG` and `SKIP_CLEANUP`. | `false` |
| `LOCAL_ONLY` | Use the kind local registry and run the subset of tests which don't require a remotely pushed controller image. | `true` |
| `REGISTRY` | Registry to push the controller image to. | `capzci.azurecr.io/ci-e2e` |
| `CLUSTER_NAME` | Name of an existing workload cluster. Will run specs against the existing workload cluster. Use in conjunction with `SKIP_CREATE_MGMT_CLUSTER`, `GINKGO_FOCUS`, and `KUBECONFIG`. Must specify only one E2E spec to run against with `GINKGO_FOCUS`, such as `export GINKGO_FOCUS=Creating.a.VMSS.cluster.with.a.single.control.plane.node`. | |
| `KUBECONFIG` | Used with `SKIP_CREATE_MGMT_CLUSTER` set to `true`. Location of the kubeconfig for the management cluster you would like to use. Use `kind get kubeconfig --name capz-e2e > kubeconfig.capz-e2e` to get the capz e2e kind cluster config. | `~/.kube/config` |
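For example, to keep the bootstrap and workload clusters around for debugging after a local run:

```shell
SKIP_CLEANUP=true ./scripts/ci-e2e.sh
```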
You can also customize the configuration of the CAPZ cluster created by the E2E tests (except for `CLUSTER_NAME`, `AZURE_RESOURCE_GROUP`, `AZURE_VNET_NAME`, `CONTROL_PLANE_MACHINE_COUNT`, and `WORKER_MACHINE_COUNT`, since they are generated by individual test cases). See Customizing the cluster deployment for more details.
To run the Kubernetes Conformance test suite locally, you can run:

```shell
./scripts/ci-conformance.sh
```
Optional settings are:
| Environment Variable | Default Value | Description |
|---|---|---|
| `WINDOWS` | `false` | Run conformance against Windows nodes |
| `CONFORMANCE_NODES` | `1` | Number of parallel ginkgo nodes to run |
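For example, to run conformance with two parallel ginkgo nodes:

```shell
CONFORMANCE_NODES=2 ./scripts/ci-conformance.sh
```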
With the following environment variables defined, you can build a CAPZ cluster from the HEAD of Kubernetes main branch or release branch, and run the Conformance test suite against it. This is not enabled for Windows currently.
| Environment Variable | Value |
|---|---|
| `E2E_ARGS` | `-kubetest.use-ci-artifacts` |
| `KUBERNETES_VERSION` | `latest` - extract the Kubernetes version from https://dl.k8s.io/ci/latest.txt (main's HEAD)<br>`latest-1.21` - extract the Kubernetes version from https://dl.k8s.io/ci/latest-1.21.txt (release branch's HEAD) |
With the following environment variables defined, CAPZ runs `./scripts/ci-build-kubernetes.sh` as part of `./scripts/ci-conformance.sh`, which allows developers to build Kubernetes from source and run the Kubernetes Conformance test suite against a CAPZ cluster based on the custom build:
| Environment Variable | Value |
|---|---|
| `AZURE_STORAGE_ACCOUNT` | Your Azure storage account name |
| `AZURE_STORAGE_KEY` | Your Azure storage key |
| `JOB_NAME` | `test` (an environment variable used by CI; can be any non-empty string) |
| `LOCAL_ONLY` | `false` |
| `REGISTRY` | Your registry |
| `TEST_K8S` | `true` |
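Putting those together, a hedged sketch of such an invocation (the storage account, key, and registry values are placeholders):

```shell
export AZURE_STORAGE_ACCOUNT="<your-storage-account>"
export AZURE_STORAGE_KEY="<your-storage-key>"
export JOB_NAME="test"
export LOCAL_ONLY=false
export REGISTRY="<your-registry>"
export TEST_K8S=true
./scripts/ci-conformance.sh
```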
To run a custom test suite on a CAPZ cluster locally, set `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_SUBSCRIPTION_ID`, and `AZURE_TENANT_ID`, and run:

```shell
./scripts/ci-entrypoint.sh bash -c "cd ${GOPATH}/src/github.com/my-org/my-project && make e2e"
```
You can optionally set the following variables:
| Variable | Description |
|---|---|
| `AZURE_SSH_PUBLIC_KEY_FILE` | Use your own SSH key. |
| `SKIP_CLEANUP` | Skip deleting the cluster after the tests finish running. |
| `KUBECONFIG` | Provide your existing cluster kubeconfig filepath. If no kubeconfig is provided, `./kubeconfig` will be used. |
| `USE_CI_ARTIFACTS` | Use a CI version of Kubernetes, i.e. not a released version (e.g. `v1.19.0-alpha.1.426+0926c9c47677e9`) |
| `CI_VERSION` | Provide a custom CI version of Kubernetes. By default, the latest master commit will be used. |
| `TEST_CCM` | Build a cluster that uses custom versions of the Azure cloud-provider cloud-controller-manager and node-controller-manager images |
| `EXP_MACHINE_POOL` | Use Machine Pool for worker machines. |
| `REGISTRY` | Registry to push any custom k8s images or cloud provider images built. |
| `CLUSTER_TEMPLATE` | Use a custom cluster template. By default, the script will choose the appropriate cluster template based on existing environment variables. |
You can also customize the configuration of the CAPZ cluster (assuming that `SKIP_CREATE_WORKLOAD_CLUSTER` is not set). See Customizing the cluster deployment for more details.