.. _scaling-down:

===========
Downscaling
===========

In this how-to guide you will:

- Create a three-node cluster that runs on a single host.
- Add some data to it.
- Scale it down to a single-node cluster.

.. _scaling-down-starting-vanilla-cluster:

The vanilla cluster
===================

Our cluster is named ``vanilla``. It is composed of three CrateDB nodes, all running
on the same host and sharing the file system, the operating system's scheduler and
memory manager, and the hardware.

This setup provides a local cluster environment when you only have one host, at the
cost of increased write latency: when operating as a cluster, the nodes must reach
consensus on each write operation.

To set it up, I recommend any of our `deployment guides`_, using this template for
each of the nodes:

::

  cluster.name: vanilla
  node.name: <NODE_NAME>
  network.host: _local_
  node.max_local_storage_nodes: 3
  stats.service.interval: 0

  http.cors.enabled: true
  http.cors.allow-origin: "*"

  transport.tcp.port: <PORT>
  gateway.expected_nodes: 3
  gateway.recover_after_nodes: 3
  discovery.seed_hosts:
    - 127.0.0.1:4301
    - 127.0.0.1:4302
    - 127.0.0.1:4303
  cluster.initial_master_nodes:
    - 127.0.0.1:4301
    - 127.0.0.1:4302
    - 127.0.0.1:4303

where ``<PORT>`` is one of ``[4301, 4302, 4303]`` and ``<NODE_NAME>`` the matching
one of ``[n1, n2, n3]``.

The settings are explained in cluster-wide-settings_ and node-specific-settings_.

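Rather than editing the template by hand three times, you can render the three
files with a short script. This is only a convenience sketch: the
``render_configs`` helper and the name/port pairing are assumptions of mine,
matching the template above.

```python
# Render a crate.yml per node from the template in the text.
# Assumed pairing: n1 -> 4301, n2 -> 4302, n3 -> 4303.

TEMPLATE = """\
cluster.name: vanilla
node.name: {node_name}
network.host: _local_
node.max_local_storage_nodes: 3
stats.service.interval: 0

http.cors.enabled: true
http.cors.allow-origin: "*"

transport.tcp.port: {port}
gateway.expected_nodes: 3
gateway.recover_after_nodes: 3
discovery.seed_hosts:
  - 127.0.0.1:4301
  - 127.0.0.1:4302
  - 127.0.0.1:4303
cluster.initial_master_nodes:
  - 127.0.0.1:4301
  - 127.0.0.1:4302
  - 127.0.0.1:4303
"""


def render_configs():
    """Return a dict mapping each node name to its rendered crate.yml text."""
    return {
        name: TEMPLATE.format(node_name=name, port=port)
        for name, port in (("n1", 4301), ("n2", 4302), ("n3", 4303))
    }
```

Each rendered string can then be written to that node's ``crate.yml``.
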
Installing from sources
=======================

An alternative way of installing CrateDB is from its sources. This script summarizes
the procedure:
::

  #!/bin/sh

  # Folder 'dist' will contain the CrateDB git clone 'crate-clone'.
  # The clone will be used to build a tarball, which will be
  # cracked open. A soft link `crate` will be created to point
  # to the CrateDB distribution's root folder.

  if [ ! -d dist ]; then
      mkdir dist
  fi

  # get the clone
  if [ ! -d dist/crate-clone ]; then
      git clone https://github.com/crate/crate.git dist/crate-clone
  fi
  cd dist/crate-clone

  # delete any old tarballs
  find ./app/build/distributions -name 'crate-*.tar.gz' -exec rm -f {} \;

  # build the tarball
  git pull
  ./gradlew clean disTar

  # crack open the tarball
  latest_tar_ball=$(find ./app/build/distributions -name 'crate-*.tar.gz')
  cp "$latest_tar_ball" ..
  cd ..
  name=$(basename "$latest_tar_ball")
  tar -xzvf "$name"
  rm -f "$name"
  name=${name%.tar.gz}
  cd ..

  # create the symlink to the distribution folder
  rm -f crate
  ln -s "dist/$name" crate
  echo "Crate $name has been installed."

Once CrateDB is installed, create a folder ``conf`` at the same level as the
``crate`` soft link. This folder should contain one subfolder per node, named
after the node (``n1``, ``n2``, ``n3``), where you place that node's ``crate.yml``
file as well as the ``log4j2.properties`` logger configuration file (you will find
a copy of the latter inside the ``config`` folder of the CrateDB installation).

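That layout can be sketched as follows; the ``make_conf_tree`` helper is
hypothetical and only creates empty placeholder files where you would copy the
real ``crate.yml`` and ``log4j2.properties``:

```python
from pathlib import Path


def make_conf_tree(base):
    """Create conf/n1, conf/n2, conf/n3 under `base`, each with placeholder
    crate.yml and log4j2.properties files, and return the node directories."""
    created = []
    for node in ("n1", "n2", "n3"):
        node_dir = Path(base) / "conf" / node
        node_dir.mkdir(parents=True, exist_ok=True)
        (node_dir / "crate.yml").touch()
        (node_dir / "log4j2.properties").touch()
        created.append(node_dir)
    return created
```
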
To start each node ``[n1, n2, n3]`` you can run a script (once per node) similar to:

::

  node_name=$1

  path_conf="$(pwd)/conf/$node_name"
  path_home=$(pwd)/crate
  path_data=$(pwd)/data/$node_name
  path_snapshots=$(pwd)/data/$node_name/snapshots

  export CRATE_HOME=$path_home
  if [ -z "$CRATE_HEAP_SIZE" ]; then
      export CRATE_HEAP_SIZE="8G"
  fi

  # <PORT> is the node's transport port from the template (4301, 4302, or 4303)
  ./crate/bin/crate -Cpath.conf="$path_conf" \
                    -Cpath.data="$path_data" \
                    -Cpath.repo="$path_snapshots" \
                    -Cnode.name="$node_name" \
                    -Ctransport.tcp.port=<PORT>


which will form the ``vanilla`` cluster and elect a master.

You can interact with the cluster by opening a browser and pointing it to
*http://localhost:4200*, CrateDB's `Admin UI`_.


.. _scaling-down-adding-data:

Adding some data
================

If you would like to add some data, I recommend following the `generate time series data`_
tutorial, which will give you more tools and experience with CrateDB.

As an alternative, you can produce a CSV_ file **logs.csv** with a script such as:

::

  import random
  import string
  import ipaddress
  import time


  # to achieve log lines as in:
  # 2012-01-01T00:00:00Z,25.152.171.147,/crate/Five_Easy_Pieces.html,200,280278
  # -> timestamp
  # -> random ip address
  # -> random request (a path)
  # -> random status code
  # -> random object size


  def timestamp_range(start, end, format):
      st = int(time.mktime(time.strptime(start, format)))
      et = int(time.mktime(time.strptime(end, format)))
      dt = 1  # 1 sec
      fmt = lambda x: time.strftime(format, time.localtime(x))
      return (fmt(x) for x in range(st, et, dt))


  def rand_ip():
      return str(ipaddress.IPv4Address(random.getrandbits(32)))


  def rand_request():
      rand = lambda src: src[random.randint(0, len(src) - 1)]
      path = lambda: "/".join(rand(("usr", "bin", "workspace", "temp", "home", "crate")) for _ in range(4))
      name = lambda: ''.join(random.sample(string.ascii_lowercase, 7))
      ext = lambda: rand(("html", "pdf", "log", "gif", "jpeg", "js"))
      return "{}/{}.{}".format(path(), name(), ext())


  def rand_object_size():
      return str(random.randint(0, 1024))


  def rand_status_code():
      return str(random.randint(100, 500))


  if __name__ == "__main__":
      print("log_time,client_ip,request,status_code,object_size")
      for ts in timestamp_range("2019-01-01T00:00:00Z", "2019-01-01T01:00:00Z", '%Y-%m-%dT%H:%M:%SZ'):
          print(",".join([ts, rand_ip(), rand_request(), rand_status_code(), rand_object_size()]))

which requires the presence of a ``logs`` table. In the `Admin UI`_:

::

  CREATE TABLE logs (log_time timestamp NOT NULL,
                     client_ip ip NOT NULL,
                     request string NOT NULL,
                     status_code short NOT NULL,
                     object_size long NOT NULL);

  COPY logs FROM 'file:///logs.csv';
  REFRESH TABLE logs;
  SELECT * FROM logs ORDER BY log_time LIMIT 10800;

All three nodes perform the copy operation (remember, we are operating as a cluster),
so you should expect to see 3600 * 3 rows inserted, in what looks like "repeated" data.
Because no primary key was defined, CrateDB created the default *_id* primary key for
each row, and it did so on every node. As a result, each node inserted one row per line
of the CSV file, each with a cluster-wide unique default *_id*, which is perceived as a
triplication of the data. If you do not want this triplication, define a primary key.

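The mechanism can be illustrated with a small simulation; ``uuid4`` here is only a
stand-in for CrateDB's auto-generated *_id* values:

```python
import uuid

# Each node runs the COPY itself and mints a fresh _id per CSV line,
# so no two inserts collide and every line ends up in the table three times.
csv_lines = [f"line-{i}" for i in range(3600)]  # one line per second, for one hour

inserted = {}  # _id -> CSV line, standing in for the table's contents
for node in ("n1", "n2", "n3"):
    for line in csv_lines:
        inserted[uuid.uuid4().hex] = line

total_rows = len(inserted)  # 3600 lines * 3 nodes = 10800 rows
```
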
.. _scaling-down-exploring-the-data:

Exploring the data
==================

Using the `Admin UI`_, shards view on the left:

.. image:: shards-view.png

You can see the three nodes, each holding a number of shards, like so:

 +-------+---+---+---+---+---+---+
 | Shard | 0 | 1 | 2 | 3 | 4 | 5 |
 +=======+===+===+===+===+===+===+
 | n1    | . | . | . |   | . |   |
 +-------+---+---+---+---+---+---+
 | n2    | . | . |   | . |   | . |
 +-------+---+---+---+---+---+---+
 | n3    |   |   | . | . | . | . |
 +-------+---+---+---+---+---+---+

Thus in this cluster setup one node can crash, yet the data in the cluster will still
remain fully available, because any two nodes together have access to all the shards
when they cooperate to fulfill query requests. A SQL table is a composite of shards
(six in our case). When a query is executed, the planner defines steps for accessing
all the shards of the table. By adding nodes to the cluster, or shards to the table,
or both, the data is spread over more nodes, so that the computation is parallelized.

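The fault-tolerance claim can be checked against the shard table above. The
``placement`` mapping below merely transcribes that table; it is not CrateDB's
allocation logic:

```python
from itertools import combinations

# Shard copies per node, transcribed from the shard table above
# (two copies of each of the six shards, spread over three nodes).
placement = {
    "n1": {0, 1, 2, 4},
    "n2": {0, 1, 3, 5},
    "n3": {2, 3, 4, 5},
}


def surviving_shards(alive):
    """Union of the shards reachable from the surviving nodes."""
    return set().union(*(placement[n] for n in alive))


# Every two-node subset still covers all six shards, so any one node may crash.
assert all(surviving_shards(pair) == set(range(6))
           for pair in combinations(placement, 2))
```
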
Having a look at the setup for table *logs*:

::

  SHOW CREATE TABLE logs;

will return:

::

  CREATE TABLE IF NOT EXISTS "doc"."logs" (
     "log_time" TIMESTAMP WITH TIME ZONE NOT NULL,
     "client_ip" IP NOT NULL,
     "request" TEXT NOT NULL,
     "status_code" SMALLINT NOT NULL,
     "object_size" BIGINT NOT NULL
  )
  CLUSTERED INTO 6 SHARDS
  WITH (
     number_of_replicas = '0-1'
  )

By default, each of our six shards has a minimum of zero replicas and a maximum of
one. A replica is simply a copy of a shard.

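That ``'0-1'`` value is a *min-max* range. The sketch below is my own simplified
reading of such an expression, not CrateDB's implementation; it estimates how many
copies of each shard the cluster will try to hold:

```python
def copies_per_shard(replica_range, data_nodes):
    """Copies (primary + replicas) of each shard for a 'min-max' replica
    expression; 'all' as the max means a replica on every other node."""
    lo, _, hi = replica_range.partition("-")
    max_replicas = data_nodes - 1 if hi == "all" else int(hi or lo)
    # The cluster allocates up to the max, bounded by the available nodes.
    return 1 + min(max_replicas, data_nodes - 1)
```

With three nodes, ``'0-1'`` yields two copies of each shard, and ``'1-all'`` yields
three.
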
.. _scaling-down-downscaling:

Downscaling to a single-node cluster
====================================

Scaling down to a single-node cluster is the extreme case. In general, downscaling
is achieved by making sure the surviving nodes of the cluster have access to all
the shards, even when the other nodes are gone.

The first step is to ensure that the number of replicas matches the number of
nodes, so that every node has access to all the shards:

::

  ALTER TABLE logs SET (number_of_replicas = '1-all');

In the `Admin UI`_, you can follow the progress of the replication.

When replication has completed, take down all the nodes in the cluster, as you are
going to externally modify its state by means of the crate-node-tool_. You will
first detach nodes ``[n2, n3]``, and then bootstrap node ``[n1]``. For convenience,
here is how each operation is invoked from the command line:

::

  ./crate/bin/crate-node detach-cluster -Cpath.home=$(pwd)/crate \
                                        -Cpath.conf=$path_conf \
                                        -Cpath.data=$path_data \
                                        -Cpath.repo=$path_snapshots \
                                        -Cnode.name=<NODE_NAME> \
                                        -Ctransport.tcp.port=<PORT>


  ./crate/bin/crate-node unsafe-bootstrap -Cpath.home=$(pwd)/crate \
                                          -Cpath.conf=$path_conf \
                                          -Cpath.data=$path_data \
                                          -Cpath.repo=$path_snapshots \
                                          -Cnode.name=<NODE_NAME> \
                                          -Ctransport.tcp.port=<PORT>


The best practice is to select the node that was the master to be the single node
in the new cluster, as it is known to hold the latest version of the cluster state.
In this tutorial we are running on a single host, so the cluster state is more or
less guaranteed to be consistent across all nodes. In principle, however, the
cluster could be running across multiple hosts, and then we would want the master
node to become the new single-node cluster.

The new configuration for the single node:

::

  cluster.name: vanilla # no need to change this
  node.name: n1
  stats.service.interval: 0
  network.host: _local_
  node.max_local_storage_nodes: 1

  http.cors.enabled: true
  http.cors.allow-origin: "*"

  transport.tcp.port: 4301


Now you can start **n1**. The new single-node cluster forms and transitions to the
*YELLOW* state, because the replicas required by ``1-all`` can no longer be
allocated on a single node. We sort that out with:

::

  ALTER TABLE logs SET (number_of_replicas = '0-1');

.. _crate-howtos: https://github.com/crate/crate-howtos
.. _GitHub: https://github.com/crate/crate.git
.. _cluster-wide-settings: https://crate.io/docs/crate/reference/en/latest/config/cluster.html
.. _node-specific-settings: https://crate.io/docs/crate/reference/en/latest/config/node.html
.. _`Admin UI`: http://localhost:4200
.. _crate-node: https://crate.io/docs/crate/reference/en/latest/cli-tools.html#cli-crate-node
.. _CSV: https://en.wikipedia.org/wiki/Comma-separated_values
.. _crate-node-tool: https://crate.io/docs/crate/guide/en/latest/best-practices/crate-node.html
.. _`deployment guides`: https://crate.io/docs/crate/howtos/en/latest/deployment/index.html
.. _`generate time series data`: https://crate.io/docs/crate/tutorials/en/latest/generate-time-series/index.html