
Add tutorial on how to remove nodes from a cluster #159

Status: Closed · wants to merge 2 commits
360 changes: 360 additions & 0 deletions docs/clustering/downscaling.rst
@@ -0,0 +1,360 @@
.. _scaling-down:

===========
Downscaling
===========

In this how-to guide you will:

- Create a three node cluster which will run on a single host.
- Add some data to it.
- Scale it down to a single node cluster.

.. _scaling-down-starting-vanilla-cluster:

The vanilla cluster
===================

Our cluster is named ``vanilla``. It is composed of three CrateDB nodes, all running
on the same host and sharing the file system, the operating system's scheduler, the
memory manager, and the hardware.

This setup provides a local cluster environment when you only have one host, at the
cost of increased write latency: when operating as a cluster, the nodes must reach
consensus on each write operation.
Review comment on lines +22 to +24:
This sentence confuses me. What is the intention here? Expressing that there are no benefits to running a cluster of multiple nodes on just one host, but only downsides?
If so, I suggest rephrasing, using shorter sentences instead of one really long, confusing one.

Suggestion:

This setup provides a local-only cluster environment for demonstration purpose only.

.. Note:

     Running multiple nodes, a full cluster, on 1 host only won't result in any benefits like resiliency or load 
     balancing over a single-node cluster. Instead performance will decrease due to internal node-to-node 
     communication like discovery, write-consensus, coordinator-election, etc.


To set it up, I recommend any of our `deployment guides`_, using this template for each
of the nodes:

::

    cluster.name: vanilla
    node.name: <NODE_NAME>
    network.host: _local_
    node.max_local_storage_nodes: 3
Review comment (@seut, Aug 28, 2020):
As we'll probably backport elastic/elasticsearch#42428 sooner or later, which will remove this setting, I'd suggest not using this setting here anymore, but configuring a dedicated path.data for each node instead.

    stats.service.interval: 0
Review comment:
why is this setting relevant here?


    http.cors.enabled: true
    http.cors.allow-origin: "*"

    transport.tcp.port: <PORT>
Review comment:
why is it needed to define this setting? afaik the default works fine.
Otherwise the settings for discovery.seed_hosts and cluster.initial_master_nodes would have to be adjusted as well.

    gateway.expected_nodes: 3
    gateway.recover_after_nodes: 3
    discovery.seed_hosts:
      - 127.0.0.1:4301
      - 127.0.0.1:4302
      - 127.0.0.1:4303
    cluster.initial_master_nodes:
      - 127.0.0.1:4301
      - 127.0.0.1:4302
      - 127.0.0.1:4303

where ``<PORT>`` is one of ``[4301, 4302, 4303]``, and ``<NODE_NAME>`` one of ``[n1, n2, n3]``.
Review comment:
As said, no need for <PORT>.


The settings are explained in cluster-wide-settings_ and node-specific-settings_.


Installing from sources
=======================

An alternative way of installing CrateDB is from its sources. This script summarizes
the procedure:

::

    #!/bin/bash

    # Folder 'dist' will contain the CrateDB git clone 'crate-clone'.
    # The clone will be used to build a tarball, which will be
    # cracked open. A soft link 'crate' will be created to point
    # to the CrateDB distribution's root folder.

    if [ ! -d dist ]; then
        mkdir dist
    fi

    # get the clone
    if [ ! -d dist/crate-clone ]; then
        git clone https://github.com/crate/crate.git dist/crate-clone
    fi
    cd dist/crate-clone

    # delete any old tarballs
    find ./app/build/distributions -name 'crate-*.tar.gz' -exec rm -f {} \;

    # build the tarball
    git pull
    ./gradlew clean disTar

    # crack open the tarball
    latest_tar_ball=$(find ./app/build/distributions -name 'crate-*.tar.gz')
    cp $latest_tar_ball ..
    cd ..
    name=$(basename $latest_tar_ball)
    tar -xzvf $name
    rm -f $name
    name=${name/.tar.gz/}   # strip the extension (bash string substitution)
    cd ..

    # create the symlink to the distribution folder
    rm -f crate
    ln -s dist/$name crate
    echo "Crate $name has been installed."
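
Assuming you saved the script as ``build-crate.sh`` (a file name chosen here for
illustration), the whole build-and-install procedure is then:

::

    bash build-crate.sh
    # on success, the last line printed is: Crate <dist-name> has been installed.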


Once CrateDB is installed, you can create a folder ``conf`` at the same level as
the ``crate`` soft link. This folder should contain a subfolder per node, named
the same as the node ``[n1, n2, n3]``, where you can place the ``crate.yml`` file,
as well as the ``log4j2.properties`` logger configuration file (you will find a
copy inside the ``config`` folder within the CrateDB installation).
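
A sketch of the resulting layout (the ``conf`` and ``data`` folder names are
conventions assumed by this guide; adjust them to your environment):

::

    # one config folder per node, next to the 'crate' symlink
    mkdir -p conf/n1 conf/n2 conf/n3
    for n in n1 n2 n3; do
        cp crate/config/log4j2.properties conf/$n/
        # place a crate.yml in conf/$n/, filling in <NODE_NAME>
        # and <PORT> from the template above
    done

    # one data folder per node, passed via -Cpath.data on startup
    mkdir -p data/n1 data/n2 data/n3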

To start each node ``[n1, n2, n3]`` you can run a script (once per node) similar to:

::

    node_name=$1
Review comment:
I think the shebang should be added; people will probably copy-paste it and then it won't work. Also, it clarifies which language this script is written in.
Additionally, maybe use the related script RST highlighter?


    path_conf="$(pwd)/conf/$node_name"
Review comment:
why $(pwd)/conf/$node_name? where was it mentioned that these folders were created for each node?

    path_home=$(pwd)/crate
Review comment:
This would maybe apply for a source installation, but not if installed via packages; e.g. on Debian, crate's home is /usr/share/crate.

    path_data=$(pwd)/data/$node_name
Review comment:
Again, this requires an existing $(pwd)/data folder. I would rather put it inside $(pwd)/crate or name it more concretely, like $(pwd)/cratedb_data, as I would not create a generic data folder e.g. inside my home dir.

    path_snapshots=$(pwd)/data/$node_name/snapshots
Review comment:
In this tutorial no snapshots are used at all, so this is not needed.


    export CRATE_HOME=$path_home
    if [ -z "$CRATE_HEAP_SIZE" ]; then
        export CRATE_HEAP_SIZE="8G"
    fi

    ./crate/bin/crate -Cpath.conf=$path_conf \
        -Cpath.data=$path_data \
Review comment:
As written above, this setting should be part of the config and thus can be removed here.

        -Cpath.repo=$path_snapshots \
Review comment:
as said, not needed as no snapshots are used

        -Cnode.name=<NODE_NAME> \
        -Ctransport.tcp.port=<PORT>
Review comment on lines +130 to +131:
both settings are defined inside the crate.yml template above and thus not needed.


Review comment:
Maybe worth adding an example how to create the cluster using that script?
Do people understand that they have to provide a node name as the first argument?
Maybe one example is enough.


Running the script once per node will form the ``vanilla`` cluster, electing a master.
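
For example, assuming the start script above was saved as ``start-node.sh`` (a
hypothetical name) and the remaining ``<NODE_NAME>`` and ``<PORT>`` placeholders
were filled in per node, you would run, in three separate terminals:

::

    bash start-node.sh n1
    bash start-node.sh n2
    bash start-node.sh n3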

You can interact with the cluster by opening a browser and pointing it to
*http://localhost:4200*, CrateDB's `Admin UI`_.
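
From the command line, the same can be checked over CrateDB's HTTP endpoint by
querying the ``sys.nodes`` system table (a sketch; it assumes the default HTTP
port 4200):

::

    curl -sS -H 'Content-Type: application/json' \
         -X POST 'http://localhost:4200/_sql' \
         -d '{"stmt": "SELECT name FROM sys.nodes ORDER BY name"}'
    # expect the three node names: n1, n2 and n3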


.. _scaling-down-adding-data:

Adding some data
================

If you would like to add some data, I recommend following the `generate time series data`_
tutorial, which will give you more tools and experience with CrateDB.

As an alternative, you can produce a CSV_ file **logs.csv** with a script such as:

::

    import random
Review comment:
what language is this script written in? adding a shebang would help, same arguments as above.

    import string
    import ipaddress
    import time


    # to achieve log lines as in:
    # 2012-01-01T00:00:00Z,25.152.171.147,/crate/Five_Easy_Pieces.html,200,280278
    # -> timestamp
    # -> random ip address
    # -> random request (a path)
    # -> random status code
    # -> random object size


    def timestamp_range(start, end, format):
        st = int(time.mktime(time.strptime(start, format)))
        et = int(time.mktime(time.strptime(end, format)))
        dt = 1  # 1 sec
        fmt = lambda x: time.strftime(format, time.localtime(x))
        return (fmt(x) for x in range(st, et, dt))


    def rand_ip():
        return str(ipaddress.IPv4Address(random.getrandbits(32)))


    def rand_request():
        rand = lambda src: src[random.randint(0, len(src) - 1)]
        path = lambda: "/".join((rand(("usr", "bin", "workspace", "temp", "home", "crate"))) for _ in range(4))
        name = lambda: ''.join(random.sample(string.ascii_lowercase, 7))
        ext = lambda: rand(("html", "pdf", "log", "gif", "jpeg", "js"))
        return "{}/{}.{}".format(path(), name(), ext())


    def rand_object_size():
        return str(random.randint(0, 1024))


    def rand_status_code():
        return str(random.randint(100, 500))


    if __name__ == "__main__":
        print("log_time,client_ip,request,status_code,object_size")
        for ts in timestamp_range("2019-01-01T00:00:00Z", "2019-01-01T01:00:00Z", '%Y-%m-%dT%H:%M:%SZ'):
            print(",".join([ts, rand_ip(), rand_request(), rand_status_code(), rand_object_size()]))


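Assuming the script is saved as ``mklogs.py`` (a name chosen here for illustration),
generate one hour of per-second log lines with:

::

    python3 mklogs.py > logs.csv
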
Loading the file requires the presence of a ``logs`` table, which you can create in the `Admin UI`_:
Review comment:
What to do in the AdminUI? Where to copy-paste this snippet?


::
Review comment:
Could we use the RST SQL highlighter?


    CREATE TABLE logs (log_time timestamp NOT NULL,
                       client_ip ip NOT NULL,
                       request string NOT NULL,
                       status_code short NOT NULL,
                       object_size long NOT NULL);

    COPY logs FROM 'file:///logs.csv';
Review comment:
The console of the AdminUI does not support multiple SQL commands.

    REFRESH TABLE logs;
    select * from logs order by log_time limit 10800;
Review comment:
Suggested change
select * from logs order by log_time limit 10800;
SELECT * FROM logs ORDER BY log_time LIMIT 10800;


All three nodes perform the copy operation (remember, we are operating as a cluster),
so you should expect to see 3600 * 3 rows inserted, looking like "repeated" data.
Because no primary key was defined, CrateDB created the default *_id* primary key
for each row, and this was done at each node. The result is that each node inserted
a row per line in the CSV file, each with a cluster-wide unique default *_id*, which
is perceived as a triplication of the data. If you do not want to see triplication,
you can define a primary key, as sketched below.
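
A sketch of such a definition (the choice of key columns here is an illustration,
not part of the original tutorial); with a primary key derived from the data itself,
the three concurrent copies collide on the same keys instead of generating three
distinct ``_id`` values, so the table ends up with one row per CSV line:

::

    CREATE TABLE logs (log_time timestamp NOT NULL,
                       client_ip ip NOT NULL,
                       request string NOT NULL,
                       status_code short NOT NULL,
                       object_size long NOT NULL,
                       PRIMARY KEY (log_time, client_ip, request));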
Review comment:
Using a table definition with a PK would make this section obsolete. The PK issue is out of scope of this tutorial and just confuses.


.. _scaling-down-exploring-the-data:

Exploring the Data
==================

Using the `Admin UI`_, shards view on the left:
Review comment:
Suggested change
Using the `Admin UI`_, shards view on the left:
Using the `Admin UI`_ shards view on the left:


.. image:: shards-view.png

You can see the three nodes, each holding a number of shards, like so:

+-------+---+---+---+---+---+---+
| Shard | 0 | 1 | 2 | 3 | 4 | 5 |
+=======+===+===+===+===+===+===+
| n1 | . | . | . | | . | |
+-------+---+---+---+---+---+---+
| n2 | . | . | | . | | . |
+-------+---+---+---+---+---+---+
| n3 | | | . | . | . | . |
+-------+---+---+---+---+---+---+

Thus, in this cluster setup one node can crash and the data in the cluster will still
remain fully available, because any two nodes together have access to all the shards
when they work to fulfill query requests. A SQL table is a composite of shards (six in
our case). When a query is executed, the planner will define steps for accessing all
the shards of the table. By adding nodes to the cluster, shards to the table, or both,
the data is spread over more nodes, so that the computation is parallelized.
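
The same distribution can be read with SQL from the ``sys.shards`` system table (a
sketch, to be run in the Admin UI console):

::

    SELECT node['name'] AS node_name, id AS shard, "primary"
    FROM sys.shards
    WHERE table_name = 'logs'
    ORDER BY node_name, shard;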

Having a look at the setup for table *logs*:

::

    SHOW CREATE TABLE logs;
Review comment:
ok I'm having a look.... where shall I look? ;-)

Adding a comment that this shall be executed inside e.g. the AdminUI console would clarify it.
Another approach would be to mention at the top of this document that all SQL statements in this tutorial should be pasted and executed in the AdminUI's SQL console.

Also RST SQL highlighting is missing.


Will return:

::

    CREATE TABLE IF NOT EXISTS "doc"."logs" (
        "log_time" TIMESTAMP WITH TIME ZONE NOT NULL,
        "client_ip" IP NOT NULL,
        "request" TEXT NOT NULL,
        "status_code" SMALLINT NOT NULL,
        "object_size" BIGINT NOT NULL
    )
    CLUSTERED INTO 6 SHARDS
    WITH (
        number_of_replicas = '0-1'
    )

The default ``number_of_replicas`` value of ``'0-1'`` is a range: each of our six
shards has a minimum of zero and a maximum of one replica. A replica is simply a
copy of a shard.
Review comment:
You have a default min number of replicas of zero, and a max of one for each of our six shards.

what does that mean? I'm confused...

I'd suggest either eliminating this point by defining a concrete number of replicas on table creation, or at least linking to the documentation which explains what 0-1 means.



.. _scaling-down-downscaling:

Downscaling to a single node cluster
====================================

Scaling down to a single node cluster is the extreme example. In general,
downscaling is achieved by making sure the surviving nodes of the cluster have
access to all the shards, even when the other nodes are missing.

The first step is to ensure that the number of replicas matches the number of
nodes, so that all nodes have access to all the shards:

::

    ALTER TABLE logs SET (number_of_replicas = '1-all');
Review comment:
Suggested change
ALTER TABLE logs SET (number_of_replicas = '1-all');
ALTER TABLE logs SET (number_of_replicas = 2);


In the `Admin UI`_, you can follow the progress of replication.
Review comment:
How can I do that?
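
One way to follow it with SQL (a sketch; it assumes the ``recovery`` column of
``sys.shards``, present in recent CrateDB versions) is to poll until every shard
copy reports the *STARTED* state:

::

    SELECT id AS shard, "primary", state, recovery['size']['percent'] AS recovered
    FROM sys.shards
    WHERE table_name = 'logs'
    ORDER BY shard, "primary" DESC;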


When replication is completed, you need to take down all the nodes in the cluster,
as you are going to externally affect its state by means of the crate-node-tool_. You
will first detach nodes [n2, n3], and then bootstrap node [n1]. For convenience,
here is how each operation is invoked from the command line:

::

    ./crate/bin/crate-node detach-cluster -Cpath.home=$(pwd)/crate \
        -Cpath.conf=$path_conf \
        -Cpath.data=$path_data \
Review comment:
This setting must point to the concrete data path of each node, this should be explained/highlighted.

        -Cpath.repo=$path_snapshots \
        -Cnode.name=<NODE_NAME> \
        -Ctransport.tcp.port=<PORT>
Review comment on lines +307 to +309:
These settings are not needed by the crate-node tool (and also defined inside the template, or not needed at all like the repo.path)



    ./crate/bin/crate-node unsafe-bootstrap -Cpath.home=$(pwd)/crate \
        -Cpath.conf=$path_conf \
        -Cpath.data=$path_data \
        -Cpath.repo=$path_snapshots \
        -Cnode.name=<NODE_NAME> \
        -Ctransport.tcp.port=<PORT>
Review comment:
Why not add here the concrete steps/ shell commands to execute to scale down to 1 node?
Like:

  • detach node 3 from the cluster:
    ... detach-cluster ...
  • detach node 2 from the cluster:
    ... detach-cluster ...
  • tell node 1 that it should forget about node 2 and 3:
    ... unsafe-bootstrap ...

I think this is one of the most interesting section for this tutorial



The best practice is to select the node that was the master in the old cluster to be
the single node in the new cluster, because we know it had the latest version of the
cluster state.
Review comment:
How do I know which node was the master? An example/description how to gather that information would be very helpful.

For this tutorial, we are running on a single host, so the cluster state is more or
less guaranteed to be consistent across all nodes. In principle, however, the cluster
could be running across multiple hosts, and then we would want the master node to
become the new single node cluster. You can find the master with the query below.
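
A sketch using the ``sys.cluster`` and ``sys.nodes`` system tables (run it before
taking the nodes down):

::

    SELECT name
    FROM sys.nodes
    WHERE id = (SELECT master_node FROM sys.cluster);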

The new configuration for the single node:

::

    cluster.name: simple # don't really need to change this
Review comment:
why is the cluster name now different and why don't I need to change it?

Suggested change
cluster.name: simple # don't really need to change this
cluster.name: vanilla

    node.name: n1
    stats.service.interval: 0
Review comment:
again, this setting is not relevant, so I'd suggest to drop it.

    network.host: _local_
    node.max_local_storage_nodes: 1

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    transport.tcp.port: 4301
Review comment:
as said, not needed.



Now you can start **n1**. The new single node cluster forms and transitions to the
*YELLOW* state. You can sort that out with:

::

    ALTER TABLE logs SET (number_of_replicas = '0-1');
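
To confirm that the cluster is back to *GREEN*, you can check the ``sys.health``
table (a sketch; this table is assumed to be available in your CrateDB version):

::

    SELECT table_name, health, underreplicated_shards, missing_shards
    FROM sys.health
    WHERE table_name = 'logs';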


.. _crate-howtos: https://github.com/crate/crate-howtos
.. _GitHub: https://github.com/crate/crate.git
.. _cluster-wide-settings: https://crate.io/docs/crate/reference/en/latest/config/cluster.html
.. _node-specific-settings: https://crate.io/docs/crate/reference/en/latest/config/node.html
.. _`Admin UI`: http://localhost:4200
.. _crate-node: https://crate.io/docs/crate/reference/en/latest/cli-tools.html#cli-crate-node
.. _CSV: https://en.wikipedia.org/wiki/Comma-separated_values
.. _crate-node-tool: https://crate.io/docs/crate/guide/en/latest/best-practices/crate-node.html
.. _`deployment guides`: https://crate.io/docs/crate/howtos/en/latest/deployment/index.html
.. _`generate time series data`: https://crate.io/docs/crate/tutorials/en/latest/generate-time-series/index.html
1 change: 1 addition & 0 deletions docs/clustering/index.rst
@@ -17,3 +17,4 @@ This section of the documentation shows you how to cluster and scale CrateDB.
   multi-node-setup
   multi-zone-setup
   kubernetes
   downscaling
Binary file added docs/clustering/shards-view.png