Ansible playbooks supporting the deployment of YDB clusters onto VMs or bare-metal servers.
Currently, the playbooks support the following scenarios:
- the initial deployment of YDB static (storage) nodes;
- YDB database creation;
- the initial deployment of YDB dynamic (database) nodes;
- adding extra YDB dynamic nodes to the YDB cluster;
- updating the cluster configuration file and TLS certificates, with an automatic rolling restart.
The following scenarios are yet to be implemented (TODO):
- configuring extra storage devices within the existing YDB static nodes;
- adding extra YDB static nodes to the existing cluster;
- removing YDB dynamic nodes from the existing cluster.
Current limitations:
- the Python interpreter version on the managed servers must be >= 3.7;
- configuration file customization depends on the support of automatic actor system threads management, which requires YDB version 23.1.26.hotfix1 or later;
- the cluster configuration file has to be manually created;
- there are no examples for configuring the storage nodes with different disk layouts (it seems to be doable by defining different `ydb_disks` values for different host groups).
Supported operating systems:
- Ubuntu 20.04, 22.04, 24.04
- Debian 11.11, 12.7
- AstraLinux 1.7, 1.8
- AlmaLinux 8.9, 9.4, 9.5
- Altlinux 8.4, 10, 10.1, 10.2
- RedHat 9.3
- RedOS 7.3, 8
- CentOS 8
- SberLinux 9.0* (Special requirements for isolated install)
Documentation for the collection.
Default configuration settings are defined in the `group_vars/all` file as a set of Ansible variables. An example file is provided. Different playbook executions may require different variable values, which can be accomplished by specifying extra JSON-format files and passing those files on the command line.
The meaning and format of the variables used are specified in the table below.
Variable | Meaning |
---|---|
`ydb_libidn_archive` | Enable the installation of custom-built libidn for RHEL, AlmaLinux or Rocky Linux. |
`ydb_libidn_archive_unpack_options` | Extra flags to be passed to `tar` for unpacking the custom-built libidn package. Default value: `['--strip-component=1']` |
`ydb_archive` | YDB server binary package in `.tar.gz` format |
`ydb_archive_unpack_options` | Extra flags to be passed to `tar` for unpacking the YDB server binaries. Default value: `['--strip-component=1']` |
`ydb_config` | The name of the cluster configuration file within the `files` subdirectory (without the `actor_system_config` snippet!) |
`ydb_tls_dir` | Path to the local directory with the TLS certificates and keys, as generated by the sample script, or following the file name convention used by the sample script |
`ydb_domain` | The name of the root domain hosting the databases; the value `Root` is used in the YDB documentation |
`ydb_disks` | Disk layout of the storage nodes (hosts defined as `ydbd_static` in the hosts file). Defined as a list of structures with the following fields: `name` - the physical device name (like `/dev/sdb` or `/dev/vdb`); `label` - the desired YDB data partition label, as used in the cluster configuration file (like `ydb_disk_1`) |
`ydb_dynnodes` | Set of dynamic nodes to be run on each host listed as `ydbd_dynamic` in the hosts file. Defined as a list of structures with the following fields: `dbname` - the name of the YDB database handled by the corresponding dynamic node; `instance` - the dynamic node service instance name, allowing to distinguish between multiple dynamic nodes for the same database running on the same host; `offset` - an integer number `0-N`, used as the offset for the standard network port numbers (`0` means using the standard ports) |
`ydb_brokers` | The list of host names running the YDB static nodes; exactly 3 (three) host names must be specified |
`ydb_cores_static` | The number of cores to be used by the thread pools of the static nodes |
`ydb_cores_dynamic` | The number of cores to be used by the thread pools of the dynamic nodes |
`ydb_dbname` | The database name, used for database creation, dynamic node deployment and dynamic node rolling restart |
`ydb_pool_kind` | The YDB default storage pool kind, as specified in the static nodes configuration file in the `storage_pool_types.kind` field |
`ydb_database_groups` | The initial number of storage groups in the newly created database |
`ydb_dynnode_restart_sleep_seconds` | The number of seconds to sleep after the startup of each dynamic node during the rolling restart |
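For illustration, a `group_vars/all` fragment using these variables might look like the sketch below. All values (device names, host names, database name, pool kind, core counts) are placeholders and must be adapted to the actual cluster:

```yaml
# Illustrative group_vars/all fragment; every value below is a placeholder.
ydb_archive: ydbd-main-linux-amd64.tar.gz   # YDB server binary package
ydb_config: config.yaml                     # cluster config file in the files subdirectory
ydb_tls_dir: files/certs                    # local directory with TLS keys and certificates
ydb_domain: Root
ydb_disks:
  - name: /dev/vdb
    label: ydb_disk_1
ydb_dynnodes:
  - { dbname: testdb, instance: a, offset: 0 }
  - { dbname: testdb, instance: b, offset: 1 }
ydb_brokers:
  - static-node-01.example.com
  - static-node-02.example.com
  - static-node-03.example.com
ydb_cores_static: 8
ydb_cores_dynamic: 4
ydb_dbname: testdb
ydb_pool_kind: ssd
ydb_database_groups: 8
ydb_dynnode_restart_sleep_seconds: 30
```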
Overall installation is performed according to the official instructions, with several steps automated with Ansible. The steps below are adapted for the Ansible-based process:

- Review the system requirements, and prepare the YDB hosts. Ensure that SSH access and sudo-based root privileges are available.
- Prepare the TLS certificates; the provided sample script may be used to automate this step.
- Download the YDB server distribution. It is better to use the latest binary version available.
- Ensure that you have Python 3.8 or later installed on all hosts of the cluster.
- Configure passwordless SSH access to all hosts of the cluster.
- Configure privilege escalation on all hosts of the cluster, such as passwordless sudo for the user account with SSH access.
- Install `ansible-core` version 2.11-2.15. Ansible 2.10 or older is not supported.
- Install the required YDB Ansible collections from GitHub:

  ```bash
  ansible-galaxy collection install git+https://github.com/ydb-platform/ydb-ansible.git
  ```

  Alternatively, download the current releases of the Ansible collection for YDB. In addition, the Prometheus and Grafana collections can optionally be used to automatically deploy the monitoring services. Install the collections from the archives:

  ```bash
  ansible-galaxy collection install prometheus-prometheus-X.Y.Z.tar.gz
  ansible-galaxy collection install grafana-ansible-collection-X.Y.Z.tar.gz
  ansible-galaxy collection install ydb-ansible-X.Y.tar.gz
  ```

- In the new subdirectory, create the `ansible.cfg` file using the provided example.
- Create the `files` and `files/certs` directories, and put the TLS keys and certificates there. If the certificates were generated using the provided helper script, the `CA/certs/YYYY-MM-DD_hh-mm-ss` subdirectory should typically be copied as `files/certs`.
- Create the `inventory/50-inventory.yaml` and `inventory/99-inventory-vault.yaml` files. These files contain the host list, installation configuration and secrets to be used. The example files are provided: inventory.yaml, inventory-vault.yaml.
- Create the Ansible Vault password file as `ansible_vault_password_file`, with the password used to protect the sensitive secrets.
- Encrypt `inventory/99-inventory-vault.yaml` with the `ansible-vault encrypt inventory/99-inventory-vault.yaml` command. To edit this file, use the `ansible-vault edit inventory/99-inventory-vault.yaml` command.
- Prepare the cluster configuration file according to the instructions in the documentation, and save it as `files/config.yaml`. Omit the `actor_system_config` section - it will be added automatically.
- Create the setup playbook based on the provided example, and customize the required actions as needed (a minimal sketch is shown after this list).
- Deploy the YDB cluster by running the playbook with the following command:

  ```bash
  ansible-playbook ydb_platform.ydb.initial_setup
  ```
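As a rough illustration of the setup playbook step above: a minimal playbook might simply import the collection's `initial_setup` playbook. This is a hedged sketch, not the provided example from the repository; any site-specific plays or variable overrides would be added around it:

```yaml
# Hypothetical minimal setup playbook; the provided example in the repository
# should be used as the actual starting point.
- name: Deploy a YDB cluster using the ydb_platform.ydb collection
  ansible.builtin.import_playbook: ydb_platform.ydb.initial_setup
```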
To update the YDB cluster configuration files (`ydbd-config.yaml`, TLS certificates and keys) using the Ansible playbook, the following actions are necessary:

- Ensure that the `hosts` file contains the current list of YDB cluster nodes, both static and dynamic.
- Ensure that the configuration variable `ydbd_config` in the `group_vars/all` file points to the desired YDB server configuration file.
- Ensure that the configuration variable `ydbd_tls_dir` points to the directory containing the desired TLS key and certificate files for all the nodes within the YDB cluster.
- Apply the updated configuration to the cluster by running the `run-update-config.sh` script. Ensure that the playbook has completed successfully, and diagnose and fix execution errors if they happen.
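For reference, a sketch of how these variables might look in `group_vars/all`; the variable names follow this section, and both values are placeholders that must match the actual file layout:

```yaml
# Illustrative group_vars/all fragment for a configuration update; values are placeholders.
ydbd_config: config.yaml     # YDB server configuration file to be deployed
ydbd_tls_dir: files/certs    # local directory with the TLS keys and certificates for all nodes
```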
Notes:
- Please take into account that the rolling restart is performed node by node, and for a large cluster the process may consume a significant amount of time.
- For Certificate Authority (CA) certificate rotation, at least two separate configuration updates are needed:
  - first, to deploy the `ca.crt` file containing both the new and the old CA certificates;
  - second, to deploy the fresh server keys and certificates signed by the new CA certificate.
- libaio or libaio1 is installed, depending on the operating system
- chrony is installed and enabled to ensure time synchronization
- jq is installed to support some scripting logic used in the playbooks
- The YDB user group and user are created
- YDB installation directory is created
- YDB server software binary package is unpacked into the YDB installation directory
- YDB client package automatic update checks are disabled for the YDB user, to avoid extra messages from client commands.
- YDB TLS certificates and keys are copied to each server
- YDB cluster configuration file is copied to each server
- Transparent huge pages (THP) are enabled on each server, which is implemented by the creation, activation and start of the corresponding systemd service.
- The common installation actions described above are executed.
- Each configured disk is checked for existing YDB data. If no YDB data is found, the disk is completely re-partitioned and its previous contents are obliterated. If existing YDB data is found, no changes are made.
WARNING: the safety checks do not work for YDB disks using non-default encryption keys. DATA LOSS IS POSSIBLE if the encryption is actually used. An enhancement is probably needed to support specifying the encryption key as a deployment option.
- `ydbd-storage.service` is created and configured as a systemd service.
- `ydbd-storage.service` is started, and the playbook waits for the static nodes to come up.
- The YDB blobstorage configuration is applied with the `ydbd admin blobstorage init` command.
- The playbook waits for the completion of the YDB storage initialization.
- The initial password for the `root` user is configured according to the contents of the `files/secret` file.
- The common installation actions described above are executed.
- For each configured database, the YDB dynnode systemd services are created and configured.
- YDB dynnode services are started.
- YDB TLS certificates and keys are copied to each server.
- YDB cluster configuration file is copied to each server.
- Rolling restart is performed for YDB storage nodes, node by node, checking for the YDB storage cluster to become healthy after the restart of each node.
- Rolling restart is performed for YDB database nodes, server by server, restarting all dynamic nodes running on a single server at a time and waiting for the specified number of seconds after each server's nodes are restarted.
- ydb_platform.ydb.initial_setup - Install cluster from scratch
- ydb_platform.ydb.binaries_all - Install YDB binaries to all nodes
- ydb_platform.ydb.binaries_static - Install YDB binaries to all static/storage nodes
- ydb_platform.ydb.binaries_dynamic - Install YDB binaries to all dynamic nodes
- ydb_platform.ydb.restart - Restart static nodes (in weak mode) and, after that, dynamic nodes
- ydb_platform.ydb.rolling_restart_static - Restart static nodes in weak mode
- ydb_platform.ydb.rolling_restart_dynamic - Restart dynamic nodes
```mermaid
graph TD;
  binaries_all --all hosts--> install_ydb
  binaries_static --static--> install_ydb
  binaries_dynamic --dynamic--> install_ydb
  restart --1--> rolling_restart_static
  restart --2--> rolling_restart_dynamic
  subgraph install_ydb
    install_from_archive(use local archive with binaries)
    install_from_source_code(make it from source code)
    install_from_version(download from official site)
    install_from_binary(use local binaries)
  end
```
Isolated mode is a situation when the hosts are isolated from the Internet (an intranet or a secure environment). There are two possible ways to install:
- Use bastion / jump host
- Use internal preconfigured host
For SberLinux 9.0 the `libxcrypt-compat` package is required. It can be placed as `files/libxcrypt-compat.rpm`, or you can define your own URL to download it via the Ansible variable `package_libxcrypt_url`. Example:
`ansible-playbook ydb_platform.ydb.initial_setup --extra-vars "package_libxcrypt_url=https://localrepo/AppStream/x86_64/os/Packages/libxcrypt-compat-4.4.18-3.el9.x86_64.rpm"`
The installation procedure is the same as the common install, but there are some limitations and recommendations.
- Required settings in the inventory (`50-inventory.yaml`):

  ```yaml
  ansible_user: bastion_username
  ansible_ssh_common_args: "-o StrictHostKeyChecking=no -o User=node_username -A -J bastion_username@{{ lookup('env','JUMP_IP') }}"
  # This key must work with all nodes (bastion and YDB hosts)
  # Or you must specify a host-specific private key in ansible_ssh_common_args
  ansible_ssh_private_key_file: "~/.ssh/id_rsa"
  ```

- YDB Dstool must be installed from a binary (`50-inventory.yaml`):

  ```yaml
  ydb_dstool_binary: "{{ ansible_config_file | dirname }}/files/ydb-dstool"
  ```
```mermaid
graph LR
  subgraph host
    ansible
    ydbops
  end
  subgraph Internal Network
    jump
    node01
    node02
    node03
  end
  host --22/tcp--> jump
  jump --22/tcp--> node01
  jump --22/tcp--> node02
  jump --22/tcp--> node03
```
WARNING: cluster restart does not work through a bastion without direct access to the node FQDNs via 2135/tcp.
- Prepare binaries:
  - create a Docker image for Ansible using the Dockerfile (an Internet connection is required, for example: `docker build . -t ydb-ansible`) and save it as a binary file (`docker save ydb-ansible -o ydb-ansible.image`)
  - download the YDB archive or build the binaries from sources
  - download YDB Dstool (or build it from sources)
  - download YDBOps (https://github.com/ydb-platform/ydbops/releases)
- Prepare the host for Ansible:
  - install Docker (for example, `apt install docker.io`)
  - load the YDB Docker image (`docker load -i ydb-ansible.image`)
  - upload the binary files (ydb, ydb-dstool, ydbd, ydbops)
- Prepare the Ansible configuration as described above (TLS certificates, config, inventory). Required settings in the inventory (`50-inventory.yaml`):

  ```yaml
  ydb_tls_dir: "{{ ansible_config_file | dirname }}/TLS/CA/certs/2024-11-21_09-07-03"
  ydbd_binary: "{{ ansible_config_file | dirname }}/files/ydbd"
  ydb_cli_binary: "{{ ansible_config_file | dirname }}/files/ydb"
  ydb_version: "24.4.1"
  ydbops_binary: "{{ ansible_config_file | dirname }}/files/ydbops"
  ydb_dstool_binary: "{{ ansible_config_file | dirname }}/files/ydb-dstool"
  ```

  HINT: all file paths must be inside the ansible folder. This folder will be mounted as `/ansible` in the Docker container.
- Execute the playbook from the ansible folder with the configured files:

  ```bash
  sudo docker run -it --rm \
    -v $(pwd):/ansible \
    -v /home/ansible/.ssh:/root/.ssh \
    ydb-ansible ansible-playbook ydb_platform.ydb.initial_setup
  ```

  To get an interactive Ansible console, run:

  ```bash
  sudo docker run -it --rm \
    -v $(pwd):/ansible \
    -v /home/ansible/.ssh:/root/.ssh \
    ydb-ansible ansible-console ydb
  ```
You can download another version of the YDB Ansible collection, or get the official archive and change it in your own way:

```bash
git clone https://github.com/ydb-platform/ydb-ansible /home/ansible/ydb-ansible
cd /home/ansible/ydb-ansible
git checkout SOMEBRANCH
```

```bash
sudo docker run -it --rm \
  -v $(pwd):/ansible \
  -v /home/ansible/.ssh:/root/.ssh \
  -v /home/ansible/ydb-ansible:/root/.ansible/collections/ansible_collections/ydb_platform/ydb \
  ydb-ansible ansible-console ydb
```
It is possible to use separate networks for the YDB cluster:
- front-end network - for communication between YDB clients and the YDB cluster
- back-end network - for communication between the YDB cluster nodes
```mermaid
graph LR
  subgraph client
  end
  subgraph node01
    node01-front-fqdn
    node01-back-fqdn
  end
  subgraph node02
    node02-front-fqdn
    node02-back-fqdn
  end
  subgraph node03
    node03-front-fqdn
    node03-back-fqdn
  end
  node01-back-fqdn <--> node02-back-fqdn
  node01-back-fqdn <--> node03-back-fqdn
  node03-back-fqdn <--> node02-back-fqdn
  client --> node01-front-fqdn
  client --> node02-front-fqdn
  client --> node03-front-fqdn
```
First of all, the back-end network is the main network for the cluster. That is why the back-end FQDNs must be configured as the hostnames of the nodes.
The front-end FQDN must be defined via the `ydb_front` host variable. It is also possible to define the `NodeId` via the `ydb_back_number` variable.
The list of brokers is an important part for the dynamic nodes, and it must contain the back-end FQDNs.
Example. Inventory part for nodes:

```yaml
all:
  children:
    ydb:
      hosts:
        ydb-node01.back.ru-central1.internal:
          ydb_front: ydb-node01.front.ru-central1.internal
          ydb_back_number: 1
        ydb-node02.back.ru-central1.internal:
          ydb_front: ydb-node02.front.ru-central1.internal
          ydb_back_number: 2
        ydb-node03.back.ru-central1.internal:
          ydb_front: ydb-node03.front.ru-central1.internal
          ydb_back_number: 3
```
Example. Inventory part for brokers:

```yaml
ydb_brokers:
  - ydb-node01.back.ru-central1.internal
  - ydb-node02.back.ru-central1.internal
  - ydb-node03.back.ru-central1.internal
```
In the config, only the back-end FQDNs are used:

```yaml
hosts:
  - host: ydb-node01.back.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv001
      data_center: YDB1
      rack: RACK1
  - host: ydb-node02.back.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv002
      data_center: YDB1
      rack: RACK1
  ...
```
It is required to generate certificates for the FQDNs in both networks if GRPCS is used.
Example. `ydb-ca-nodes.txt` for generating certificates:

```
ydb-node01 ydb-node01.front.ru-central1.internal ydb-node01.back.ru-central1.internal
ydb-node02 ydb-node02.front.ru-central1.internal ydb-node02.back.ru-central1.internal
ydb-node03 ydb-node03.front.ru-central1.internal ydb-node03.back.ru-central1.internal
```
There are two possible ways to add new nodes to the cluster:
- Simple - use the `initial_setup` playbook
- Long - use several playbooks

The simple way:
- Update `config.yaml` - add the new nodes into the `hosts` section.
Example. Before changes:

```yaml
hosts:
  - host: ydb-node01.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv001
      data_center: YDB1
      rack: RACK1
  - host: ydb-node02.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv002
      data_center: YDB1
      rack: RACK1
  ...
```

Example. After changes:

```yaml
hosts:
  - host: ydb-node01.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv001
      data_center: YDB1
      rack: RACK1
  - host: ydb-node02.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv002
      data_center: YDB1
      rack: RACK1
  ...
  - host: ydb-node-NEW.ru-central1.internal
    host_config_id: 100
    location:
      unit: srv100
      data_center: YDB3
      rack: RACK10
```
- Generate SSL certificates for new nodes
- Update configs on the current nodes and restart cluster:

  ```bash
  ansible-playbook ydb_platform.ydb.update_config
  ```

- Add new nodes into inventory:

  ```yaml
  all:
    children:
      ydb:
        hosts:
          ydb-node01.ru-central1.internal:
          ydb-node02.ru-central1.internal:
          ydb-node03.ru-central1.internal:
          ydb-node-NEW.ru-central1.internal:
  ```

- Install YDB on new nodes and start them:

  ```bash
  ansible-playbook ydb_platform.ydb.initial_setup -l ydb-node-NEW.ru-central1.internal --skip-tags password,create_database
  ```

- Check the cluster
The long way:
- Update `config.yaml` - add the new nodes into the `hosts` section.
Example. Before changes:

```yaml
hosts:
  - host: ydb-node01.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv001
      data_center: YDB1
      rack: RACK1
  - host: ydb-node02.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv002
      data_center: YDB1
      rack: RACK1
  ...
```

Example. After changes:

```yaml
hosts:
  - host: ydb-node01.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv001
      data_center: YDB1
      rack: RACK1
  - host: ydb-node02.ru-central1.internal
    host_config_id: 1
    location:
      unit: srv002
      data_center: YDB1
      rack: RACK1
  ...
  - host: ydb-node-NEW.ru-central1.internal
    host_config_id: 100
    location:
      unit: srv100
      data_center: YDB3
      rack: RACK10
```
- Generate SSL certificates for new nodes
- Update configs on the current nodes and restart cluster:

  ```bash
  ansible-playbook ydb_platform.ydb.update_config
  ```

- Add new nodes into inventory:

  ```yaml
  all:
    children:
      ydb:
        hosts:
          ydb-node01.ru-central1.internal:
          ydb-node02.ru-central1.internal:
          ydb-node03.ru-central1.internal:
          ydb-node-NEW.ru-central1.internal:
  ```
- Prepare nodes for YDB:

  ```bash
  ydb_platform.ydb.prepare_host -l ydb-node-NEW.ru-central1.internal
  ```

- Install YDB on new static nodes and start them:

  ```bash
  ydb_platform.ydb.install_static -l ydb-node-NEW.ru-central1.internal --skip-tags password,create_database
  ```

- Install YDB on new dynamic nodes and start them:

  ```bash
  ydb_platform.ydb.install_dynamic -l ydb-node-NEW.ru-central1.internal --skip-tags password,create_database
  ```
- Check the cluster
Frequently asked questions:

- Q: How to install on Linux with kernel 5.15.0-1073-kvm, which does not contain the tcp_htcp module?
  - A1: define the empty variable `ydb_congestion_module` in the inventory (a sketch is shown after this FAQ)
  - A2: define the variable on the command line: `ansible-playbook ydb_platform.ydb.initial_setup --extra-vars "ydb_congestion_module="`
- Q: How to handle the error `aborting playbook execution. Stop running YDB instances`?
  - A1: manually stop the YDB instances on the hosts intended for the new YDB installation
  - A2: use other hosts without YDB
  - A3: use ansible-console to stop the YDB instances:

    ```bash
    ansible-console ydb
    $ sudo systemctl stop ydbd-storage
    $ sudo systemctl stop ydbd-database-a
    $ sudo systemctl stop ydbd-database-b
    ```
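A hedged sketch of the inventory override from A1. Placing it under `all.vars` in `inventory/50-inventory.yaml` is an assumption; any group or host variable scope that reaches the target hosts would work:

```yaml
# Assumed placement: inventory/50-inventory.yaml, group-level variables.
all:
  vars:
    ydb_congestion_module: ""   # empty value, matching --extra-vars "ydb_congestion_module=" from A2
```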