[docs] Update documentation to cover some recent improvements

gyfora · gyfora · commit 80b13b52183e · 2023-05-16T09:18:28.000+02:00
diff --git a/README.md b/README.md
@@ -17,6 +17,7 @@ Check our [quick-start](https://nightlies.apache.org/flink/flink-kubernetes-oper
  - Upgrade, suspend and delete deployments
  - Full logging and metrics integration
  - Flexible deployments and native integration with Kubernetes tooling
+ - Flink Job Autoscaler
 
 For the complete feature-set please refer to our [documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/).
 
diff --git a/docs/content/docs/concepts/overview.md b/docs/content/docs/concepts/overview.md
@@ -36,7 +36,7 @@ Flink Kubernetes Operator aims to capture the responsibilities of a human operat
   - Stateful and stateless application upgrades
   - Triggering and managing savepoints
   - Handling errors, rolling-back broken upgrades
-- Multiple Flink version support: v1.13, v1.14, v1.15, v1.16
+- Multiple Flink version support: v1.13, v1.14, v1.15, v1.16, v1.17
 - [Deployment Modes]({{< ref "docs/custom-resource/overview#application-deployments" >}}):
   - Application cluster
   - Session cluster
@@ -52,6 +52,10 @@ Flink Kubernetes Operator aims to capture the responsibilities of a human operat
 - POD augmentation via [Pod Templates]({{< ref "docs/custom-resource/pod-template" >}})
   - Native Kubernetes POD definitions
   - Layering (Base/JobManager/TaskManager overrides)
+- [Job Autoscaler]({{< ref "docs/custom-resource/autoscaler" >}})
+  - Collect lag and utilization metrics
+  - Scale job vertices to the ideal parallelism
+  - Scale up and down as the load changes
 ### Operations
 - Operator [Metrics]({{< ref "docs/operations/metrics-logging#metrics" >}})
   - Utilizes the well-established [Flink Metric System](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics)
@@ -101,5 +105,5 @@ drwxr-xr-x 2 9999 9999 60 May 11 15:11 b6fb2a9c-d1cd-4e65-a9a1-e825c4b47543
 ```
 
 ### AuditUtils can log sensitive information present in the custom resources
-As reported in [FLINK-30306](https://issues.apache.org/jira/browse/FLINK-30306) when Flink custom resources change the operator logs the change, which could include sensitive information. We suggest ingesting secrets to Flink containers during runtime to mitigate this. 
-Also note that anyone who has access to the custom resources already had access to the potentially sensitive information in question, but folks who only have access to the logs could also see them now. We are planning to introduce redaction rules to AuditUtils to improve this in a later release.
+As reported in [FLINK-30306](https://issues.apache.org/jira/browse/FLINK-30306) when Flink custom resources change the operator logs the change, which could include sensitive information. We suggest ingesting secrets to Flink containers during runtime to mitigate this.
+Also note that anyone who has access to the custom resources already had access to the potentially sensitive information in question, but folks who only have access to the logs could also see them now. We are planning to introduce redaction rules to AuditUtils to improve this in a later release.
diff --git a/docs/content/docs/custom-resource/job-management.md b/docs/content/docs/custom-resource/job-management.md
@@ -98,7 +98,7 @@ The `upgradeMode` setting controls both the stop and restore mechanisms as detai
 The three upgrade modes are intended to support different scenarios:
 
  1. **stateless**: Stateless application upgrades from empty state
- 2. **last-state**: Quick upgrades in any application state (even for failing jobs), does not require a healthy job as it always uses the latest checkpoint information. Manual recovery may be necessary if HA metadata is lost.
+ 2. **last-state**: Quick upgrades in any application state (even for failing jobs), does not require a healthy job as it always uses the latest checkpoint information. Manual recovery may be necessary if HA metadata is lost. To limit how far back the job may fall when picking up the latest checkpoint, you can configure `kubernetes.operator.job.upgrade.last-state.max.allowed.checkpoint.age`. If the checkpoint is older than the configured value, a savepoint will be taken instead for healthy jobs.
  3. **savepoint**: Use savepoint for upgrade, providing maximal safety and possibility to serve as backup/fork point. The savepoint will be created during the upgrade process. Note that the Flink job needs to be running to allow the savepoint to get created. If the job is in an unhealthy state, the last checkpoint will be used (unless `kubernetes.operator.job.upgrade.last-state-fallback.enabled` is set to `false`). If the last checkpoint is not available, the job upgrade will fail.
 
 During stateful upgrades there are always cases which might require user intervention to preserve the consistency of the application. Please see the [manual Recovery section](#manual-recovery) for details.
@@ -214,6 +214,9 @@ Savepoint cleanup happens lazily and only when the application is running.
 It is therefore very likely that savepoints live beyond the max age configuration.  
 {{< /hint >}}
 
+To disable savepoint cleanup by the operator, you can set `kubernetes.operator.savepoint.cleanup.enabled: false`.
+When savepoint cleanup is disabled, the operator will still collect and populate the savepoint history but will not perform any dispose operations.
+
 ## Recovery of missing job deployments
 
 When HA is enabled, the operator can recover the Flink cluster deployments in cases when it was accidentally deleted
diff --git a/docs/content/docs/custom-resource/overview.md b/docs/content/docs/custom-resource/overview.md
@@ -87,7 +87,7 @@ Most deployments will define at least the following fields:
  - `image` : Docker used to run Flink job and task manager processes
  - `flinkVersion` : Flink version used in the image (`v1_13`, `v1_14`, `v1_15`, `v1_16` ...)
  - `serviceAccount` : Kubernetes service account used by the Flink pods
- - `taskManager, jobManager` : Job and Task manager pod resource specs (cpu, memory, etc.)
+ - `taskManager, jobManager` : Job and Task manager pod resource specs (cpu, memory, ephemeralStorage)
  - `flinkConfiguration` : Map of Flink configuration overrides such as HA and checkpointing configs
  - `job` : Job Spec for Application deployments
 
@@ -158,7 +158,7 @@ For standard Operator use running your own Flink Jobs Native mode is recommended
 
 Standalone cluster deployment simply uses Kubernetes as an orchestration platform that the Flink cluster is running on. Flink is unaware that it is running on Kubernetes and therefore all Kubernetes resources need to be managed externally, by the Kubernetes Operator.
 
-In Standalone mode the Flink cluster doesn't have access to the Kubernetes cluster so this can increase security. If unknown or external code is being ran on the Flink cluster then Standalone mode adds another layer of security. 
+In Standalone mode the Flink cluster doesn't have access to the Kubernetes cluster, which can increase security. If unknown or external code is being run on the Flink cluster, Standalone mode adds another layer of security.
 
 The deployment mode can be set using the `mode` field in the deployment spec.
 
@@ -169,7 +169,7 @@ kind: FlinkDeployment
 spec:
   ...
   mode: standalone
-    
+
 
 ```
 
@@ -212,12 +212,11 @@ COPY flink-hadoop-fs-1.15-SNAPSHOT.jar $FLINK_PLUGINS_DIR/hadoop-fs/
 
 ### Limitations
 
-- The LastState UpgradeMode have not been supported.
+- Last-state upgradeMode is currently not supported for FlinkSessionJobs
 
 ## Further information
 
  - [Job Management and Stateful upgrades]({{< ref "docs/custom-resource/job-management" >}})
  - [Deployment customization and pod templates]({{< ref "docs/custom-resource/pod-template" >}})
  - [Full Reference]({{< ref "docs/custom-resource/reference" >}})
  - [Examples](https://github.com/apache/flink-kubernetes-operator/tree/main/examples)
-
diff --git a/docs/content/docs/custom-resource/pod-template.md b/docs/content/docs/custom-resource/pod-template.md
@@ -104,3 +104,33 @@ spec:
 When using the operator with Flink native Kubernetes integration, please refer to [pod template field precedence](
 https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#fields-overwritten-by-flink).
 {{< /hint >}}
+
+## Array Merging Behaviour
+
+When layering pod templates (for example, defining both a top-level and a JobManager-specific pod template), the corresponding YAMLs are merged together.
+
+The default behaviour of the pod template mechanism is to merge arrays by merging the objects in the respective array positions.
+This requires that containers in the podTemplates are defined in the same order, otherwise the results may be undefined.
+
+Default behaviour (merge by position):
+
+```
+arr1: [{name: a, p1: v1}, {name: b, p1: v1}]
+arr2: [{name: a, p2: v2}, {name: c, p2: v2}]
+
+merged: [{name: a, p1: v1, p2: v2}, {name: c, p1: v1, p2: v2}]
+```
+
+The operator supports an alternative array merging mechanism that can be enabled by the `kubernetes.operator.pod-template.merge-arrays-by-name` flag.
+When true, instead of the default positional merging, object array elements that have a `name` property defined will be merged by their name and the resulting array will be a union of the two input arrays.
+
+Merge by name:
+
+```
+arr1: [{name: a, p1: v1}, {name: b, p1: v1}]
+arr2: [{name: a, p2: v2}, {name: c, p2: v2}]
+
+merged: [{name: a, p1: v1, p2: v2}, {name: b, p1: v1}, {name: c, p2: v2}]
+```
+
+Merging by name can be very convenient when merging container specs or when the base and override templates are not defined together.
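
The two merge strategies described in the added pod-template section can be sketched in Python. This is only an illustration of the documented semantics, not the operator's actual implementation: the function names `merge_by_position` and `merge_by_name` are hypothetical, the merge is shallow (nested objects are not merged recursively), and the handling of arrays of different lengths and of elements without a `name` is an assumption.

```python
def merge_by_position(base, override):
    """Merge two lists of dicts index by index (the documented default)."""
    merged = []
    for i in range(max(len(base), len(override))):
        left = base[i] if i < len(base) else {}
        right = override[i] if i < len(override) else {}
        # Keys from the override win on conflict, which is why `name: b`
        # is replaced by `name: c` in the documented example.
        merged.append({**left, **right})
    return merged


def merge_by_name(base, override):
    """Merge elements that share a `name`; the result is a union of both lists."""
    merged = [dict(item) for item in base]
    index = {item["name"]: i for i, item in enumerate(merged) if "name" in item}
    for item in override:
        name = item.get("name")
        if name is not None and name in index:
            merged[index[name]].update(item)
        else:
            # Assumption: unmatched or unnamed elements are appended as-is.
            merged.append(dict(item))
            if name is not None:
                index[name] = len(merged) - 1
    return merged


arr1 = [{"name": "a", "p1": "v1"}, {"name": "b", "p1": "v1"}]
arr2 = [{"name": "a", "p2": "v2"}, {"name": "c", "p2": "v2"}]

print(merge_by_position(arr1, arr2))
print(merge_by_name(arr1, arr2))
```

With the two input arrays from the documentation, the positional merge silently combines containers `b` and `c` into one element, while the by-name merge keeps all three. This is the main reason merging by name is safer when the templates are maintained separately.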
diff --git a/docs/content/docs/development/roadmap.md b/docs/content/docs/development/roadmap.md
@@ -31,6 +31,6 @@ It's not a comprehensive list and might be slightly outdated at any given time.
 
 ## What’s Next?
 
-- Standalone deployment mode support [FLIP-225](https://cwiki.apache.org/confluence/display/FLINK/FLIP-225%3A+Implement+standalone+mode+support+in+the+kubernetes+operator)
-- Improved scaling and autoscaling support
 - Improved rollback mechanism and stability conditions
+- Autoscaler hardening and improvements
+- Support for in-place job rescaling with Flink 1.18
diff --git a/docs/content/docs/operations/health.md b/docs/content/docs/operations/health.md
@@ -0,0 +1,69 @@
+---
+title: "Operator Health Monitoring"
+weight: 3
+type: docs
+aliases:
+- /operations/health.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Operator Health Monitoring
+
+## Health Probe
+
+The Flink Kubernetes Operator provides a built-in health endpoint that serves as the information source for Kubernetes liveness and startup probes.
+
+The liveness and startup probes are enabled by default in the Helm chart:
+
+```
+operatorHealth:
+  port: 8085
+  livenessProbe:
+    periodSeconds: 10
+    initialDelaySeconds: 30
+  startupProbe:
+    failureThreshold: 30
+    periodSeconds: 10
+```
+
+The health endpoint catches startup and informer errors that are exposed by the JOSDK framework. By default, if one of the watched namespaces becomes inaccessible, the health endpoint will report an error and the operator will restart.
+
+In some cases it is desirable to keep the operator running even if some namespaces are inaccessible. To allow the operator to start even if some namespaces cannot be watched, you can disable the `kubernetes.operator.startup.stop-on-informer-error` flag.
+
+## Canary Resources
+
+The canary resource feature allows users to deploy special dummy resources (canaries) into selected namespaces. The operator health probe will then monitor that these resources are reconciled in a timely manner. This allows the operator health probe to catch slowdowns and other general reconciliation issues that are not otherwise covered.
+
+Canary deployments are identified by a special label: `"flink.apache.org/canary": "true"`. These resources do not need to define a spec. They will not start any pods or consume other cluster resources; they exist purely to assert the operator's reconciliation functionality.
+
+Canary FlinkDeployment:
+
+```
+apiVersion: flink.apache.org/v1beta1
+kind: FlinkDeployment
+metadata:
+  name: canary
+  labels:
+    "flink.apache.org/canary": "true"
+```
+
+The default timeout for reconciling the canary resources is 1 minute, controlled by `kubernetes.operator.health.canary.resource.timeout`. If the operator cannot reconcile the canaries within this time limit, the operator is marked unhealthy and will be automatically restarted.
+
+Canaries can be deployed into multiple namespaces.
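
The canary timeout rule from the new health page can be illustrated with a small Python sketch. It is a rough model of the described behaviour only, not how the operator implements it: the functions `stale_canaries` and `is_healthy`, the `namespace/name` keys, and the idea of tracking a "pending since" timestamp per canary are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Default taken from the text; in the operator this value is controlled by
# kubernetes.operator.health.canary.resource.timeout.
CANARY_TIMEOUT = timedelta(minutes=1)


def stale_canaries(pending_since, now, timeout=CANARY_TIMEOUT):
    """Return canaries that have been waiting for reconciliation longer than timeout.

    pending_since maps a hypothetical "namespace/name" key to the time at
    which that canary started waiting to be reconciled.
    """
    return sorted(key for key, since in pending_since.items() if now - since > timeout)


def is_healthy(pending_since, now, timeout=CANARY_TIMEOUT):
    """The operator counts as healthy only if no canary exceeded the timeout."""
    return not stale_canaries(pending_since, now, timeout)


now = datetime(2023, 5, 16, 9, 0, 0, tzinfo=timezone.utc)
pending = {
    "ns-a/canary": now - timedelta(seconds=10),
    "ns-b/canary": now - timedelta(seconds=90),
}
print(stale_canaries(pending, now))
print(is_healthy(pending, now))
```

Because canaries can live in several namespaces, a single slow namespace is enough to mark the whole operator unhealthy in this model, which matches the restart behaviour described in the text.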