.. _scaling-down:

===========
Downscaling
===========

In this how-to guide you will:

- Create a three-node cluster which will run on a single host.
- Add some data to it.
- Scale it down to a single-node cluster.

.. _scaling-down-starting-vanilla-cluster:

The vanilla cluster
===================

Our cluster is named ``vanilla``. It is composed of three CrateDB nodes, all running
on the same host and sharing the file system, the operating system's scheduler, the
memory manager, and the hardware.

This setup provides a local cluster environment when you only have one host, at the
cost of increased write latency, because the nodes must reach consensus on each
write operation when operating as a cluster.

To set it up, follow any of our `deployment guides`_, using this template for each
of the nodes:

::

    cluster.name: vanilla
    node.name: <NODE_NAME>
    network.host: _local_
    node.max_local_storage_nodes: 3
    stats.service.interval: 0

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    transport.tcp.port: <PORT>

    gateway.expected_nodes: 3
    gateway.recover_after_nodes: 3
    discovery.seed_hosts:
      - 127.0.0.1:4301
      - 127.0.0.1:4302
      - 127.0.0.1:4303
    cluster.initial_master_nodes:
      - 127.0.0.1:4301
      - 127.0.0.1:4302
      - 127.0.0.1:4303

where ``<PORT>`` is one of ``[4301, 4302, 4303]`` and ``<NODE_NAME>`` is the
matching one of ``[n1, n2, n3]``.

The settings are explained in cluster-wide-settings_ and node-specific-settings_.

Installing from sources
=======================

An alternative way of installing CrateDB is to build it from source. This script
summarizes the procedure:

::

    #!/bin/bash

    # Folder 'dist' will contain the CrateDB git clone 'crate-clone'.
    # The clone is used to build a tarball, which is then unpacked.
    # A soft link 'crate' is created to point to the root folder of
    # the CrateDB distribution.

    if [ ! -d dist ]; then
        mkdir dist
    fi

    # get the clone
    if [ ! -d dist/crate-clone ]; then
        git clone https://github.com/crate/crate.git dist/crate-clone
    fi
    cd dist/crate-clone

    # delete any old tarballs
    find ./app/build/distributions -name 'crate-*.tar.gz' -exec rm -f {} \;

    # build the tarball
    git pull
    ./gradlew clean distTar

    # unpack the tarball
    latest_tar_ball=$(find ./app/build/distributions -name 'crate-*.tar.gz')
    cp $latest_tar_ball ..
    cd ..
    name=$(basename $latest_tar_ball)
    tar -xzvf $name
    rm -f $name
    name=${name/.tar.gz/}
    cd ..

    # create the symlink to the distribution folder
    rm -f crate
    ln -s dist/$name crate
    echo "CrateDB $name has been installed."

Once CrateDB is installed, create a folder ``conf`` at the same level as the
``crate`` soft link. This folder should contain one subfolder per node, named after
the node (``n1``, ``n2``, ``n3``), in which you place that node's ``crate.yml``
file, as well as the ``log4j2.properties`` logger configuration file (you will find
a copy inside the ``config`` folder of the CrateDB installation).

To start each node ``[n1, n2, n3]`` you can run a script (once per node) similar to
the following. It takes the node name and the transport port as arguments and
assumes the ``conf`` and ``data`` directory layout described above:

::

    #!/bin/sh

    # Usage: <script> <NODE_NAME> <PORT>
    # e.g.:  <script> n1 4301

    node_name=$1
    port=$2

    path_conf="$(pwd)/conf/$node_name"
    path_home="$(pwd)/crate"
    path_data="$(pwd)/data/$node_name"
    path_snapshots="$(pwd)/data/$node_name/snapshots"

    export CRATE_HOME=$path_home
    if [ -z "$CRATE_HEAP_SIZE" ]; then
        export CRATE_HEAP_SIZE="8G"
    fi

    ./crate/bin/crate -Cpath.conf=$path_conf \
                      -Cpath.data=$path_data \
                      -Cpath.repo=$path_snapshots \
                      -Cnode.name=$node_name \
                      -Ctransport.tcp.port=$port

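
If you save the script as, for example, ``start-node.sh`` (the file name is only
illustrative), you can bring up the whole cluster by running it once per node:

::

    # run each command in its own terminal, or append & to background it
    ./start-node.sh n1 4301
    ./start-node.sh n2 4302
    ./start-node.sh n3 4303
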
The three nodes will discover each other, form the ``vanilla`` cluster, and elect a
master.

You can interact with the cluster by opening a browser and pointing it to
*http://localhost:4200*, CrateDB's `Admin UI`_.

.. _scaling-down-adding-data:

Adding some data
================

If you would like to add some data, we recommend following the
`generate time series data`_ tutorial, which will give you more tools and experience
with CrateDB.

As an alternative, you can produce a CSV_ file **logs.csv** with a Python script
such as:

::

    #!/usr/bin/env python3

    import ipaddress
    import random
    import string
    import time


    # to achieve log lines as in:
    # 2012-01-01T00:00:00Z,25.152.171.147,/crate/Five_Easy_Pieces.html,200,280278
    # -> timestamp
    # -> random ip address
    # -> random request (a path)
    # -> random status code
    # -> random object size


    def timestamp_range(start, end, format):
        st = int(time.mktime(time.strptime(start, format)))
        et = int(time.mktime(time.strptime(end, format)))
        dt = 1  # 1 sec
        fmt = lambda x: time.strftime(format, time.localtime(x))
        return (fmt(x) for x in range(st, et, dt))


    def rand_ip():
        return str(ipaddress.IPv4Address(random.getrandbits(32)))


    def rand_request():
        rand = lambda src: src[random.randint(0, len(src) - 1)]
        path = lambda: "/".join(rand(("usr", "bin", "workspace", "temp", "home", "crate")) for _ in range(4))
        name = lambda: ''.join(random.sample(string.ascii_lowercase, 7))
        ext = lambda: rand(("html", "pdf", "log", "gif", "jpeg", "js"))
        return "{}/{}.{}".format(path(), name(), ext())


    def rand_object_size():
        return str(random.randint(0, 1024))


    def rand_status_code():
        return str(random.randint(100, 500))


    if __name__ == "__main__":
        print("log_time,client_ip,request,status_code,object_size")
        for ts in timestamp_range("2019-01-01T00:00:00Z", "2019-01-01T01:00:00Z", '%Y-%m-%dT%H:%M:%SZ'):
            print(",".join([ts, rand_ip(), rand_request(), rand_status_code(), rand_object_size()]))

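
If you save that script as, say, ``make_logs.py`` (the name is arbitrary),
generating the file is a single command:

::

    python3 make_logs.py > logs.csv
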
To load the file you need a ``logs`` table. In the `Admin UI`_ console, run the
following statements one at a time (the console executes a single statement per
request). Adjust the ``file://`` URI so it points to wherever you generated
**logs.csv**; the path must be readable by every node:

::

    CREATE TABLE logs (log_time timestamp NOT NULL,
                       client_ip ip NOT NULL,
                       request string NOT NULL,
                       status_code short NOT NULL,
                       object_size long NOT NULL);

    COPY logs FROM 'file:///logs.csv';

    REFRESH TABLE logs;

    SELECT * FROM logs ORDER BY log_time LIMIT 10800;

All three nodes perform the copy operation (remember, we are operating as a
cluster), so you should expect to see 3600 * 3 = 10800 rows inserted, looking like
"repeated" data. Because no primary key was defined, CrateDB created the default
*_id* primary key for each row, and this was done on each node. The result is that
every node inserted one row per line in the CSV file, each with a cluster-wide
unique default *_id*, which is perceived as a triplication of the data. If you do
not want this triplication, define a primary key.

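
You can check the effect with a quick count:

::

    SELECT count(*) AS row_count FROM logs;
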
.. _scaling-down-exploring-the-data:

Exploring the Data
==================

In the `Admin UI`_, open the shards view (in the navigation bar on the left):

.. image:: shards-view.png

You can see the three nodes, with each having a number of shards, like so:

+-------+---+---+---+---+---+---+
| Shard | 0 | 1 | 2 | 3 | 4 | 5 |
+=======+===+===+===+===+===+===+
| n1    | . | . | . |   | . |   |
+-------+---+---+---+---+---+---+
| n2    | . | . |   | . |   | . |
+-------+---+---+---+---+---+---+
| n3    |   |   | . | . | . | . |
+-------+---+---+---+---+---+---+

Thus, in this cluster setup one node can crash, yet the data will remain fully
available, because any two nodes have access to all the shards when they work
together to fulfill query requests. A SQL table is a composite of shards (six in our
case). When a query is executed, the planner defines steps for accessing all the
shards of the table. By adding nodes to the cluster, or shards to the table, or
both, the data is spread over more nodes and the computation is parallelized.

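
The same distribution can be inspected with SQL instead of the Admin UI; a minimal
sketch of a query against the ``sys.shards`` system table (run it in the Admin UI
console):

::

    SELECT node['name'] AS node_name, id AS shard_id, "primary", state
    FROM sys.shards
    WHERE table_name = 'logs'
    ORDER BY shard_id, node_name;
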
Have a look at the definition of the *logs* table (again in the Admin UI console):

::

    SHOW CREATE TABLE logs;

This returns something like:

::

    CREATE TABLE IF NOT EXISTS "doc"."logs" (
       "log_time" TIMESTAMP WITH TIME ZONE NOT NULL,
       "client_ip" IP NOT NULL,
       "request" TEXT NOT NULL,
       "status_code" SMALLINT NOT NULL,
       "object_size" BIGINT NOT NULL
    )
    CLUSTERED INTO 6 SHARDS
    WITH (
       ...
       number_of_replicas = '0-1',
       ...
    )

The table was created with the default ``number_of_replicas`` of ``'0-1'``: an
expandable range meaning a minimum of zero and a maximum of one replica for each of
our six shards. A replica is simply a copy of a shard.

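
The current shard and replica settings can also be read from ``information_schema``:

::

    SELECT number_of_shards, number_of_replicas
    FROM information_schema.tables
    WHERE table_name = 'logs';
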
.. _scaling-down-downscaling:

Downscaling to a single node cluster
====================================

Scaling down to a single node cluster is the extreme case. In general, downscaling
is achieved by making sure that the surviving nodes of the cluster have access to
all the shards, even when the other nodes are gone.

The first step is to ensure that the number of replicas matches the number of nodes,
so that every node holds a copy of every shard:

::

    ALTER TABLE logs SET (number_of_replicas = '1-all');

In the `Admin UI`_ you can follow the progress of the replication: the shards view
shows the new replica shards as they are created on each node.

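
You can also watch the progress with a query; a sketch using ``sys.shards``, whose
``recovery`` column reports how much of each shard copy has been built:

::

    SELECT node['name'] AS node_name, id AS shard_id, state,
           recovery['size']['percent'] AS recovered_pct
    FROM sys.shards
    WHERE table_name = 'logs'
    ORDER BY shard_id, node_name;
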
When the replication has completed, you need to take down all the nodes in the
cluster, as you are going to change its state externally by means of the
crate-node-tool_. You will first detach nodes ``n2`` and ``n3`` from the cluster,
and then bootstrap node ``n1``. For convenience, here is how each operation is
invoked from the command line; note that ``path.conf`` and ``path.data`` must point
to the configuration and data directories of the specific node you are operating on:

::

    ./crate/bin/crate-node detach-cluster -Cpath.home=$(pwd)/crate \
                                          -Cpath.conf=$path_conf \
                                          -Cpath.data=$path_data \
                                          -Cpath.repo=$path_snapshots \
                                          -Cnode.name=<NODE_NAME> \
                                          -Ctransport.tcp.port=<PORT>

    ./crate/bin/crate-node unsafe-bootstrap -Cpath.home=$(pwd)/crate \
                                            -Cpath.conf=$path_conf \
                                            -Cpath.data=$path_data \
                                            -Cpath.repo=$path_snapshots \
                                            -Cnode.name=<NODE_NAME> \
                                            -Ctransport.tcp.port=<PORT>

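
As a concrete sketch for the directory layout used in this guide (``conf/<node>``
and ``data/<node>`` under the current directory; the exact paths are an assumption
of this example), the whole sequence could look like this:

::

    # stop all three CrateDB nodes first

    # detach n2 and n3 from the vanilla cluster
    ./crate/bin/crate-node detach-cluster -Cpath.home=$(pwd)/crate \
        -Cpath.conf=$(pwd)/conf/n2 -Cpath.data=$(pwd)/data/n2
    ./crate/bin/crate-node detach-cluster -Cpath.home=$(pwd)/crate \
        -Cpath.conf=$(pwd)/conf/n3 -Cpath.data=$(pwd)/data/n3

    # bootstrap n1 as the new single-node cluster
    ./crate/bin/crate-node unsafe-bootstrap -Cpath.home=$(pwd)/crate \
        -Cpath.conf=$(pwd)/conf/n1 -Cpath.data=$(pwd)/data/n1
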
The best practice is to select the node that was the master to be the single node of
the new cluster, as it is known to have the latest version of the cluster state. For
this tutorial we are running on a single host, so the cluster state is more or less
guaranteed to be consistent across all nodes. In principle, however, the cluster
could be running across multiple hosts, and then you would want the master node to
become the new single-node cluster.

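
If you are unsure which node is currently the master, you can ask the cluster before
shutting it down:

::

    SELECT name
    FROM sys.nodes
    WHERE id = (SELECT master_node FROM sys.cluster);
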
The new configuration for the single node:

::

    cluster.name: simple      # renaming the cluster is optional
    node.name: n1
    stats.service.interval: 0
    network.host: _local_
    node.max_local_storage_nodes: 1

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    transport.tcp.port: 4301

Now you can start **n1**. The new single node cluster forms and transitions to a
*YELLOW* state, because the ``'1-all'`` replica setting can no longer be satisfied
by a single node. You sort that out by lowering the replica range again:

::

    ALTER TABLE logs SET (number_of_replicas = '0-1');

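
After that the table should report as healthy again, and the data imported earlier
is still available in the downsized cluster; a sketch of a check using the
``sys.health`` table (available in recent CrateDB versions):

::

    SELECT table_name, health, missing_shards, underreplicated_shards
    FROM sys.health
    WHERE table_name = 'logs';
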
.. _crate-howtos: https://github.com/crate/crate-howtos
.. _GitHub: https://github.com/crate/crate.git
.. _cluster-wide-settings: https://crate.io/docs/crate/reference/en/latest/config/cluster.html
.. _node-specific-settings: https://crate.io/docs/crate/reference/en/latest/config/node.html
.. _`Admin UI`: http://localhost:4200
.. _crate-node: https://crate.io/docs/crate/reference/en/latest/cli-tools.html#cli-crate-node
.. _CSV: https://en.wikipedia.org/wiki/Comma-separated_values
.. _crate-node-tool: https://crate.io/docs/crate/guide/en/latest/best-practices/crate-node.html
.. _`deployment guides`: https://crate.io/docs/crate/howtos/en/latest/deployment/index.html
.. _`generate time series data`: https://crate.io/docs/crate/tutorials/en/latest/generate-time-series/index.html