Skip to content

Make teamId in cluster name optional #2001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 24, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions charts/postgres-operator/crds/operatorconfigurations.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,9 @@ spec:
enable_spilo_wal_path_compat:
type: boolean
default: false
enable_team_id_clustername_prefix:
type: boolean
default: false
etcd_host:
type: string
default: ""
Expand Down
2 changes: 2 additions & 0 deletions charts/postgres-operator/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@ configGeneral:
enable_shm_volume: true
# enables backwards compatible path between Spilo 12 and Spilo 13+ images
enable_spilo_wal_path_compat: false
# operator will sync only clusters where name starts with teamId prefix
enable_team_id_clustername_prefix: false
# etcd connection string for Patroni. Empty uses K8s-native DCS.
etcd_host: ""
# Spilo docker image
Expand Down
3 changes: 1 addition & 2 deletions docs/reference/cluster_manifest.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,8 +53,7 @@ Those parameters are grouped under the `metadata` top-level key.
These parameters are grouped directly under the `spec` key in the manifest.

* **teamId**
name of the team the cluster belongs to. Changing it after the cluster
creation is not supported. Required field.
name of the team the cluster belongs to. Required field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should inform about settings and implications. it may still be impossible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's now possible to change it later. When enable_team_id_clustername_prefix is enabled you will only get an error on sync that cluster name does not match the { TEAM-ID }-{... format with status being set to inavlid.


* **numberOfInstances**
total number of instances for a given cluster. The operator parameters
Expand Down
5 changes: 5 additions & 0 deletions docs/reference/operator_parameters.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,11 @@ Those are top-level keys, containing both leaf keys and groups.
* **enable_spilo_wal_path_compat**
enables backwards compatible path between Spilo 12 and Spilo 13+ images. The default is `false`.

* **enable_team_id_clustername_prefix**
To lower the risk of name clashes between clusters of different teams you
can turn on this flag and the operator will sync only clusters where the
name starts with the `teamId` (from `spec`) plus `-`. Default is `false`.

* **etcd_host**
Etcd connection string for Patroni defined as `host:port`. Not required when
Patroni native Kubernetes support is used. The default is empty (use
Expand Down
11 changes: 6 additions & 5 deletions docs/user.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,11 +45,12 @@ Make sure, the `spec` section of the manifest contains at least a `teamId`, the
The minimum volume size to run the `postgresql` resource on Elastic Block
Storage (EBS) is `1Gi`.

Note, that the name of the cluster must start with the `teamId` and `-`. At
Zalando we use team IDs (nicknames) to lower the chance of duplicate cluster
names and colliding entities. The team ID would also be used to query an API to
get all members of a team and create [database roles](#teams-api-roles) for
them. Besides, the maximum cluster name length is 53 characters.
Note, that when `enable_team_id_clustername_prefix` is set to `true` the name
of the cluster must start with the `teamId` and `-`. At Zalando we use team IDs
(nicknames) to lower chances of duplicate cluster names and colliding entities.
The team ID would also be used to query an API to get all members of a team
and create [database roles](#teams-api-roles) for them. Besides, the maximum
cluster name length is 53 characters.

## Watch pods being created

Expand Down
1 change: 1 addition & 0 deletions manifests/configmap.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ data:
# enable_shm_volume: "true"
# enable_sidecars: "true"
enable_spilo_wal_path_compat: "true"
enable_team_id_clustername_prefix: "false"
enable_team_member_deprecation: "false"
# enable_team_superuser: "false"
enable_teams_api: "false"
Expand Down
3 changes: 3 additions & 0 deletions manifests/operatorconfiguration.crd.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,9 @@ spec:
enable_spilo_wal_path_compat:
type: boolean
default: false
enable_team_id_clustername_prefix:
type: boolean
default: false
etcd_host:
type: string
default: ""
Expand Down
1 change: 1 addition & 0 deletions manifests/postgresql-operator-default-configuration.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ configuration:
enable_pgversion_env_var: true
# enable_shm_volume: true
enable_spilo_wal_path_compat: false
enable_team_id_clustername_prefix: false
etcd_host: ""
# ignore_instance_limits_annotation_key: ""
# kubernetes_use_configmaps: false
Expand Down
3 changes: 3 additions & 0 deletions pkg/apis/acid.zalan.do/v1/crds.go
Original file line number Diff line number Diff line change
Expand Up @@ -1112,6 +1112,9 @@ var OperatorConfigCRDResourceValidation = apiextv1.CustomResourceValidation{
"enable_spilo_wal_path_compat": {
Type: "boolean",
},
"enable_team_id_clustername_prefix": {
Type: "boolean",
},
"etcd_host": {
Type: "string",
},
Expand Down
8 changes: 1 addition & 7 deletions pkg/apis/acid.zalan.do/v1/marshal.go
Original file line number Diff line number Diff line change
Expand Up @@ -110,15 +110,9 @@ func (p *Postgresql) UnmarshalJSON(data []byte) error {
}
tmp2 := Postgresql(tmp)

if clusterName, err := extractClusterName(tmp2.ObjectMeta.Name, tmp2.Spec.TeamID); err != nil {
tmp2.Error = err.Error()
tmp2.Status = PostgresStatus{PostgresClusterStatus: ClusterStatusInvalid}
} else if err := validateCloneClusterDescription(tmp2.Spec.Clone); err != nil {

if err := validateCloneClusterDescription(tmp2.Spec.Clone); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this related to clone? maybe better function name? or is this indeed about cloning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is a validation of the clone section. Existed also before.

tmp2.Error = err.Error()
tmp2.Status.PostgresClusterStatus = ClusterStatusInvalid
} else {
tmp2.Spec.ClusterName = clusterName
}

*p = tmp2
Expand Down
59 changes: 30 additions & 29 deletions pkg/apis/acid.zalan.do/v1/operator_configuration_type.go
Original file line number Diff line number Diff line change
Expand Up @@ -228,35 +228,36 @@ type OperatorLogicalBackupConfiguration struct {

// OperatorConfigurationData defines the operation config
type OperatorConfigurationData struct {
EnableCRDRegistration *bool `json:"enable_crd_registration,omitempty"`
EnableCRDValidation *bool `json:"enable_crd_validation,omitempty"`
CRDCategories []string `json:"crd_categories,omitempty"`
EnableLazySpiloUpgrade bool `json:"enable_lazy_spilo_upgrade,omitempty"`
EnablePgVersionEnvVar bool `json:"enable_pgversion_env_var,omitempty"`
EnableSpiloWalPathCompat bool `json:"enable_spilo_wal_path_compat,omitempty"`
EtcdHost string `json:"etcd_host,omitempty"`
KubernetesUseConfigMaps bool `json:"kubernetes_use_configmaps,omitempty"`
DockerImage string `json:"docker_image,omitempty"`
Workers uint32 `json:"workers,omitempty"`
ResyncPeriod Duration `json:"resync_period,omitempty"`
RepairPeriod Duration `json:"repair_period,omitempty"`
SetMemoryRequestToLimit bool `json:"set_memory_request_to_limit,omitempty"`
ShmVolume *bool `json:"enable_shm_volume,omitempty"`
SidecarImages map[string]string `json:"sidecar_docker_images,omitempty"` // deprecated in favour of SidecarContainers
SidecarContainers []v1.Container `json:"sidecars,omitempty"`
PostgresUsersConfiguration PostgresUsersConfiguration `json:"users"`
MajorVersionUpgrade MajorVersionUpgradeConfiguration `json:"major_version_upgrade"`
Kubernetes KubernetesMetaConfiguration `json:"kubernetes"`
PostgresPodResources PostgresPodResourcesDefaults `json:"postgres_pod_resources"`
Timeouts OperatorTimeouts `json:"timeouts"`
LoadBalancer LoadBalancerConfiguration `json:"load_balancer"`
AWSGCP AWSGCPConfiguration `json:"aws_or_gcp"`
OperatorDebug OperatorDebugConfiguration `json:"debug"`
TeamsAPI TeamsAPIConfiguration `json:"teams_api"`
LoggingRESTAPI LoggingRESTAPIConfiguration `json:"logging_rest_api"`
Scalyr ScalyrConfiguration `json:"scalyr"`
LogicalBackup OperatorLogicalBackupConfiguration `json:"logical_backup"`
ConnectionPooler ConnectionPoolerConfiguration `json:"connection_pooler"`
EnableCRDRegistration *bool `json:"enable_crd_registration,omitempty"`
EnableCRDValidation *bool `json:"enable_crd_validation,omitempty"`
CRDCategories []string `json:"crd_categories,omitempty"`
EnableLazySpiloUpgrade bool `json:"enable_lazy_spilo_upgrade,omitempty"`
EnablePgVersionEnvVar bool `json:"enable_pgversion_env_var,omitempty"`
EnableSpiloWalPathCompat bool `json:"enable_spilo_wal_path_compat,omitempty"`
EnableTeamIdClusternamePrefix bool `json:"enable_team_id_clustername_prefix,omitempty"`
EtcdHost string `json:"etcd_host,omitempty"`
KubernetesUseConfigMaps bool `json:"kubernetes_use_configmaps,omitempty"`
DockerImage string `json:"docker_image,omitempty"`
Workers uint32 `json:"workers,omitempty"`
ResyncPeriod Duration `json:"resync_period,omitempty"`
RepairPeriod Duration `json:"repair_period,omitempty"`
SetMemoryRequestToLimit bool `json:"set_memory_request_to_limit,omitempty"`
ShmVolume *bool `json:"enable_shm_volume,omitempty"`
SidecarImages map[string]string `json:"sidecar_docker_images,omitempty"` // deprecated in favour of SidecarContainers
SidecarContainers []v1.Container `json:"sidecars,omitempty"`
PostgresUsersConfiguration PostgresUsersConfiguration `json:"users"`
MajorVersionUpgrade MajorVersionUpgradeConfiguration `json:"major_version_upgrade"`
Kubernetes KubernetesMetaConfiguration `json:"kubernetes"`
PostgresPodResources PostgresPodResourcesDefaults `json:"postgres_pod_resources"`
Timeouts OperatorTimeouts `json:"timeouts"`
LoadBalancer LoadBalancerConfiguration `json:"load_balancer"`
AWSGCP AWSGCPConfiguration `json:"aws_or_gcp"`
OperatorDebug OperatorDebugConfiguration `json:"debug"`
TeamsAPI TeamsAPIConfiguration `json:"teams_api"`
LoggingRESTAPI LoggingRESTAPIConfiguration `json:"logging_rest_api"`
Scalyr ScalyrConfiguration `json:"scalyr"`
LogicalBackup OperatorLogicalBackupConfiguration `json:"logical_backup"`
ConnectionPooler ConnectionPoolerConfiguration `json:"connection_pooler"`

MinInstances int32 `json:"min_instances,omitempty"`
MaxInstances int32 `json:"max_instances,omitempty"`
Expand Down
2 changes: 1 addition & 1 deletion pkg/apis/acid.zalan.do/v1/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ func parseWeekday(s string) (time.Weekday, error) {
return time.Weekday(weekday), nil
}

func extractClusterName(clusterName string, teamName string) (string, error) {
func ExtractClusterName(clusterName string, teamName string) (string, error) {
teamNameLen := len(teamName)
if len(clusterName) < teamNameLen+2 {
return "", fmt.Errorf("cluster name must match {TEAM}-{NAME} format. Got cluster name '%v', team name '%v'", clusterName, teamName)
Expand Down
24 changes: 2 additions & 22 deletions pkg/apis/acid.zalan.do/v1/util_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -330,29 +330,11 @@ var unmarshalCluster = []struct {
Clone: &CloneDescription{
ClusterName: "acid-batman",
},
ClusterName: "testcluster1",
},
Error: "",
},
marshal: []byte(`{"kind":"Postgresql","apiVersion":"acid.zalan.do/v1","metadata":{"name":"acid-testcluster1","creationTimestamp":null},"spec":{"postgresql":{"version":"9.6","parameters":{"log_statement":"all","max_connections":"10","shared_buffers":"32MB"}},"pod_priority_class_name":"spilo-pod-priority","volume":{"size":"5Gi","storageClass":"SSD", "subPath": "subdir"},"enableShmVolume":false,"patroni":{"initdb":{"data-checksums":"true","encoding":"UTF8","locale":"en_US.UTF-8"},"pg_hba":["hostssl all all 0.0.0.0/0 md5","host all all 0.0.0.0/0 md5"],"ttl":30,"loop_wait":10,"retry_timeout":10,"maximum_lag_on_failover":33554432,"slots":{"permanent_logical_1":{"database":"foo","plugin":"pgoutput","type":"logical"}}},"resources":{"requests":{"cpu":"10m","memory":"50Mi"},"limits":{"cpu":"300m","memory":"3000Mi"}},"teamId":"acid","allowedSourceRanges":["127.0.0.1/32"],"numberOfInstances":2,"users":{"zalando":["superuser","createdb"]},"maintenanceWindows":["Mon:01:00-06:00","Sat:00:00-04:00","05:00-05:15"],"clone":{"cluster":"acid-batman"}},"status":{"PostgresClusterStatus":""}}`),
err: nil},
{
about: "example with teamId set in input",
in: []byte(`{"kind": "Postgresql","apiVersion": "acid.zalan.do/v1","metadata": {"name": "teapot-testcluster1"}, "spec": {"teamId": "acid"}}`),
out: Postgresql{
TypeMeta: metav1.TypeMeta{
Kind: "Postgresql",
APIVersion: "acid.zalan.do/v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "teapot-testcluster1",
},
Spec: PostgresSpec{TeamID: "acid"},
Status: PostgresStatus{PostgresClusterStatus: ClusterStatusInvalid},
Error: errors.New("name must match {TEAM}-{NAME} format").Error(),
},
marshal: []byte(`{"kind":"Postgresql","apiVersion":"acid.zalan.do/v1","metadata":{"name":"teapot-testcluster1","creationTimestamp":null},"spec":{"postgresql":{"version":"","parameters":null},"volume":{"size":"","storageClass":""},"patroni":{"initdb":null,"pg_hba":null,"ttl":0,"loop_wait":0,"retry_timeout":0,"maximum_lag_on_failover":0,"slots":null},"teamId":"acid","allowedSourceRanges":null,"numberOfInstances":0,"users":null,"clone":null},"status":{"PostgresClusterStatus":"Invalid"}}`),
err: nil},
{
about: "example with clone",
in: []byte(`{"kind": "Postgresql","apiVersion": "acid.zalan.do/v1","metadata": {"name": "acid-testcluster1"}, "spec": {"teamId": "acid", "clone": {"cluster": "team-batman"}}}`),
Expand All @@ -369,7 +351,6 @@ var unmarshalCluster = []struct {
Clone: &CloneDescription{
ClusterName: "team-batman",
},
ClusterName: "testcluster1",
},
Error: "",
},
Expand All @@ -391,7 +372,6 @@ var unmarshalCluster = []struct {
StandbyCluster: &StandbyDescription{
S3WalPath: "s3://custom/path/to/bucket/",
},
ClusterName: "testcluster1",
},
Error: "",
},
Expand Down Expand Up @@ -628,10 +608,10 @@ func TestServiceAnnotations(t *testing.T) {
func TestClusterName(t *testing.T) {
for _, tt := range clusterNames {
t.Run(tt.about, func(t *testing.T) {
name, err := extractClusterName(tt.in, tt.inTeam)
name, err := ExtractClusterName(tt.in, tt.inTeam)
if err != nil {
if tt.err == nil || err.Error() != tt.err.Error() {
t.Errorf("extractClusterName expected error: %v, got: %v", tt.err, err)
t.Errorf("ExtractClusterName expected error: %v, got: %v", tt.err, err)
}
return
} else if tt.err != nil {
Expand Down
7 changes: 6 additions & 1 deletion pkg/cluster/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -244,7 +244,12 @@ func getPostgresContainer(podSpec *v1.PodSpec) (pgContainer v1.Container) {
func (c *Cluster) getTeamMembers(teamID string) ([]string, error) {

if teamID == "" {
return nil, fmt.Errorf("no teamId specified")
msg := "no teamId specified"
if c.OpConfig.EnableTeamIdClusternamePrefix {
return nil, fmt.Errorf(msg)
}
c.logger.Warnf(msg)
return nil, nil
}

members := []string{}
Expand Down
1 change: 1 addition & 0 deletions pkg/controller/operator_config.go
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ func (c *Controller) importConfigurationFromCRD(fromCRD *acidv1.OperatorConfigur
result.EnableLazySpiloUpgrade = fromCRD.EnableLazySpiloUpgrade
result.EnablePgVersionEnvVar = fromCRD.EnablePgVersionEnvVar
result.EnableSpiloWalPathCompat = fromCRD.EnableSpiloWalPathCompat
result.EnableTeamIdClusternamePrefix = fromCRD.EnableTeamIdClusternamePrefix
result.EtcdHost = fromCRD.EtcdHost
result.KubernetesUseConfigMaps = fromCRD.KubernetesUseConfigMaps
result.DockerImage = util.Coalesce(fromCRD.DockerImage, "registry.opensource.zalan.do/acid/spilo-14:2.1-p6")
Expand Down
38 changes: 29 additions & 9 deletions pkg/controller/postgresql.go
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,15 @@ func (c *Controller) acquireInitialListOfClusters() error {
return nil
}

func (c *Controller) addCluster(lg *logrus.Entry, clusterName spec.NamespacedName, pgSpec *acidv1.Postgresql) *cluster.Cluster {
func (c *Controller) addCluster(lg *logrus.Entry, clusterName spec.NamespacedName, pgSpec *acidv1.Postgresql) (*cluster.Cluster, error) {

if c.opConfig.EnableTeamIdClusternamePrefix {
if _, err := acidv1.ExtractClusterName(clusterName.Name, pgSpec.Spec.TeamID); err != nil {
c.KubeClient.SetPostgresCRDStatus(clusterName, acidv1.ClusterStatusInvalid)
return nil, err
}
}

cl := cluster.New(c.makeClusterConfig(), c.KubeClient, *pgSpec, lg, c.eventRecorder)
cl.Run(c.stopCh)
teamName := strings.ToLower(cl.Spec.TeamID)
Expand All @@ -171,12 +179,13 @@ func (c *Controller) addCluster(lg *logrus.Entry, clusterName spec.NamespacedNam
c.clusterLogs[clusterName] = ringlog.New(c.opConfig.RingLogLines)
c.clusterHistory[clusterName] = ringlog.New(c.opConfig.ClusterHistoryEntries)

return cl
return cl, nil
}

func (c *Controller) processEvent(event ClusterEvent) {
var clusterName spec.NamespacedName
var clHistory ringlog.RingLogger
var err error

lg := c.logger.WithField("worker", event.WorkerID)

Expand Down Expand Up @@ -216,7 +225,7 @@ func (c *Controller) processEvent(event ClusterEvent) {
c.mergeDeprecatedPostgreSQLSpecParameters(&event.NewSpec.Spec)
}

if err := c.submitRBACCredentials(event); err != nil {
if err = c.submitRBACCredentials(event); err != nil {
c.logger.Warnf("pods and/or Patroni may misfunction due to the lack of permissions: %v", err)
}

Expand All @@ -231,15 +240,20 @@ func (c *Controller) processEvent(event ClusterEvent) {

lg.Infof("creating a new Postgres cluster")

cl = c.addCluster(lg, clusterName, event.NewSpec)
cl, err = c.addCluster(lg, clusterName, event.NewSpec)
if err != nil {
lg.Errorf("creation of cluster is blocked: %v", err)
return
}

c.curWorkerCluster.Store(event.WorkerID, cl)

if err := cl.Create(); err != nil {
err = cl.Create()
if err != nil {
cl.Status = acidv1.PostgresStatus{PostgresClusterStatus: acidv1.ClusterStatusInvalid}
cl.Error = fmt.Sprintf("could not create cluster: %v", err)
lg.Error(cl.Error)
c.eventRecorder.Eventf(cl.GetReference(), v1.EventTypeWarning, "Create", "%v", cl.Error)

return
}

Expand All @@ -252,7 +266,8 @@ func (c *Controller) processEvent(event ClusterEvent) {
return
}
c.curWorkerCluster.Store(event.WorkerID, cl)
if err := cl.Update(event.OldSpec, event.NewSpec); err != nil {
err = cl.Update(event.OldSpec, event.NewSpec)
if err != nil {
cl.Error = fmt.Sprintf("could not update cluster: %v", err)
lg.Error(cl.Error)

Expand Down Expand Up @@ -303,11 +318,16 @@ func (c *Controller) processEvent(event ClusterEvent) {

// no race condition because a cluster is always processed by single worker
if !clusterFound {
cl = c.addCluster(lg, clusterName, event.NewSpec)
cl, err = c.addCluster(lg, clusterName, event.NewSpec)
if err != nil {
lg.Errorf("syncing of cluster is blocked: %v", err)
return
}
}

c.curWorkerCluster.Store(event.WorkerID, cl)
if err := cl.Sync(event.NewSpec); err != nil {
err = cl.Sync(event.NewSpec)
if err != nil {
cl.Error = fmt.Sprintf("could not sync cluster: %v", err)
c.eventRecorder.Eventf(cl.GetReference(), v1.EventTypeWarning, "Sync", "%v", cl.Error)
lg.Error(cl.Error)
Expand Down
1 change: 1 addition & 0 deletions pkg/util/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -226,6 +226,7 @@ type Config struct {
EnableCrossNamespaceSecret bool `name:"enable_cross_namespace_secret" default:"false"`
EnablePgVersionEnvVar bool `name:"enable_pgversion_env_var" default:"true"`
EnableSpiloWalPathCompat bool `name:"enable_spilo_wal_path_compat" default:"false"`
EnableTeamIdClusternamePrefix bool `name:"enable_team_id_clustername_prefix" default:"false"`
MajorVersionUpgradeMode string `name:"major_version_upgrade_mode" default:"off"`
MajorVersionUpgradeTeamAllowList []string `name:"major_version_upgrade_team_allow_list" default:""`
MinimalMajorVersion string `name:"minimal_major_version" default:"9.6"`
Expand Down