| 1 | +# Collecting log records in a Kubernetes cluster using FluentBit and YDB |
| 2 | + |
| 3 | +This section describes how to integrate FluentBit, a log shipping tool for Kubernetes clusters, with {{ ydb-short-name }}, where the collected logs are stored for later viewing or analysis. |
| 4 | + |
| 5 | +## Introduction |
| 6 | + |
| 7 | +FluentBit is a tool that collects text data, processes it (modifies, transforms, merges), and sends it to various storage systems for further processing. |
| 8 | + |
| 9 | +To deliver logs from applications running in Kubernetes to YDB using FluentBit, you need to: |
| 10 | + |
| 11 | +* Create a table in YDB |
| 12 | + |
| 13 | +* Configure [FluentBit](https://fluentbit.io) |
| 14 | + |
| 15 | +* Deploy [FluentBit](https://fluentbit.io) in a Kubernetes cluster using [Helm](https://helm.sh) |
| 16 | + |
| 17 | +The overall workflow looks like this: |
| 18 | + |
| 19 | + |
| 20 | +<small>Figure 1 — Interaction diagram between FluentBit and YDB in the Kubernetes cluster</small> |
| 21 | + |
| 22 | +In this diagram: |
| 23 | + |
| 24 | +* Application pods write logs to stdout/stderr |
| 25 | + |
| 26 | +* Text from stdout/stderr is saved as files on Kubernetes worker nodes |
| 27 | + |
| 28 | +* The FluentBit pod: |
| 29 | + |
| 30 | + * Mounts the folder with log files |
| 31 | + |
| 32 | + * Reads their contents |
| 33 | + |
| 34 | + * Enriches records with additional metadata |
| 35 | + |
| 36 | + * Saves records to the YDB cluster |
| 37 | + |
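To make the "reads their contents" step more concrete, here is a minimal Python sketch of how one line of a container log file splits into fields. It assumes the worker nodes write logs in the CRI format (`<time> <stream> <flag> <message>`), which is what the `cri` parser used in the FluentBit configuration in this guide expects. The function name and the sample line are illustrative and are not part of FluentBit.

```python
import re

# Assumed CRI log line layout: "<time> <stream> <flag> <message>".
# The flag is "F" for a full line and "P" for a partial one.
CRI_LINE = re.compile(
    r"^(?P<time>\S+) (?P<stream>stdout|stderr) (?P<_p>\S*) (?P<log>.*)$"
)


def parse_cri_line(line: str) -> dict:
    """Split one CRI-formatted log line into its fields (illustrative helper)."""
    match = CRI_LINE.match(line)
    if match is None:
        raise ValueError(f"not a CRI log line: {line!r}")
    return match.groupdict()


# Made-up sample line, used only to show the field layout.
sample = '2023-10-06T00:17:09.669794202Z stderr F {"level":"warn","msg":"disk almost full"}'
record = parse_cri_line(sample)
print(record["stream"], record["log"])
```

In this guide's pipeline, `stream` and `log` end up in the `pipe` and `message` columns, while `time` and `_p` are the keys that the `modify` filter removes.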
| 38 | +## Creating a table in YDB |
| 39 | + |
| 40 | +On the selected YDB cluster, you need to run the following query: |
| 41 | + |
| 42 | +```sql |
| 43 | +CREATE TABLE `fluent-bit/log` ( |
| 44 | + `timestamp` Timestamp NOT NULL, |
| 45 | + `file` Text NOT NULL, |
| 46 | + `pipe` Text NOT NULL, |
| 47 | + `message` Text NULL, |
| 48 | + `message_parsed` JSON NULL, |
| 49 | + `kubernetes` JSON NULL, |
| 50 | + |
| 51 | + PRIMARY KEY ( |
| 52 | + `timestamp`, `file` |
| 53 | + ) |
| 54 | +) |
| 55 | +``` |
| 56 | + |
| 57 | +Column descriptions: |
| 58 | + |
| 59 | +* `timestamp` – the log timestamp |
| 60 | + |
| 61 | +* `file` – the name of the source from which the log was read. In Kubernetes, this is the name of the file on the worker node to which the logs of a specific pod are written |
| 62 | + |
| 63 | +* `pipe` – the stream (stdout or stderr) to which the application wrote the message |
| 64 | + |
| 65 | +* `message` – the log message |
| 66 | + |
| 67 | +* `message_parsed` – the structured log message, if it could be parsed by the FluentBit parsers |
| 68 | + |
| 69 | +* `kubernetes` – information about the pod, for example: name, namespace, labels, and annotations |
| 70 | + |
| 71 | +Optionally, you can set a TTL on the `timestamp` column so that old rows are deleted automatically. |
| 72 | + |
| 73 | +## FluentBit configuration |
| 74 | + |
| 75 | +In the Helm chart's `values.yaml`, replace the image repository and tag: |
| 76 | + |
| 77 | +```yaml |
| 78 | +image: |
| 79 | + repository: ghcr.io/ydb-platform/fluent-bit-ydb |
| 80 | + tag: v1.0.0 |
| 81 | +``` |
| 82 | +
|
| 83 | +This image adds a plugin library that implements YDB support. The source code is available [here](https://github.com/ydb-platform/fluent-bit-ydb). |
| 84 | +
|
| 85 | +The following lines define the rules for mounting log folders in FluentBit pods: |
| 86 | +
|
| 87 | +```yaml |
| 88 | +volumeMounts: |
| 89 | + - name: config |
| 90 | + mountPath: /fluent-bit/etc/conf |
| 91 | + |
| 92 | +daemonSetVolumes: |
| 93 | + - name: varlog |
| 94 | + hostPath: |
| 95 | + path: /var/log |
| 96 | + - name: varlibcontainers |
| 97 | + hostPath: |
| 98 | + path: /var/lib/containerd/containers |
| 99 | + - name: etcmachineid |
| 100 | + hostPath: |
| 101 | + path: /etc/machine-id |
| 102 | + type: File |
| 103 | + |
| 104 | +daemonSetVolumeMounts: |
| 105 | + - name: varlog |
| 106 | + mountPath: /var/log |
| 107 | + - name: varlibcontainers |
| 108 | + mountPath: /var/lib/containerd/containers |
| 109 | + readOnly: true |
| 110 | + - name: etcmachineid |
| 111 | + mountPath: /etc/machine-id |
| 112 | + readOnly: true |
| 113 | +``` |
| 114 | +
|
| 115 | +You also need to override the command and its launch arguments: |
| 116 | +
|
| 117 | +```yaml |
| 118 | +command: |
| 119 | + - /fluent-bit/bin/fluent-bit |
| 120 | + |
| 121 | +args: |
| 122 | + - --workdir=/fluent-bit/etc |
| 123 | + - --plugin=/fluent-bit/lib/out_ydb.so |
| 124 | + - --config=/fluent-bit/etc/conf/fluent-bit.conf |
| 125 | +``` |
| 126 | +
|
| 127 | +Finally, define the pipeline that collects, transforms, and delivers logs: |
| 128 | +
|
| 129 | +```yaml |
| 130 | +config: |
| 131 | + inputs: | |
| 132 | + [INPUT] |
| 133 | + Name tail |
| 134 | + Path /var/log/containers/*.log |
| 135 | + multiline.parser cri |
| 136 | + Tag kube.* |
| 137 | + Mem_Buf_Limit 5MB |
| 138 | + Skip_Long_Lines On |
| 139 | +
|
| 140 | + filters: | |
| 141 | + [FILTER] |
| 142 | + Name kubernetes |
| 143 | + Match kube.* |
| 144 | + Keep_Log On |
| 145 | + Merge_Log On |
| 146 | + Merge_Log_Key log_parsed |
| 147 | + K8S-Logging.Parser On |
| 148 | + K8S-Logging.Exclude On |
| 149 | +
|
| 150 | + [FILTER] |
| 151 | + Name modify |
| 152 | + Match kube.* |
| 153 | + Remove time |
| 154 | + Remove _p |
| 155 | +
|
| 156 | + outputs: | |
| 157 | + [OUTPUT] |
| 158 | + Name ydb |
| 159 | + Match kube.* |
| 160 | + TablePath fluent-bit/log |
| 161 | + Columns {".timestamp":"timestamp",".input":"file","log":"message","log_parsed":"message_parsed","stream":"pipe","kubernetes":"kubernetes"} |
| 162 | + ConnectionURL ${OUTPUT_YDB_CONNECTION_URL} |
| 163 | + CredentialsToken ${OUTPUT_YDB_CREDENTIALS_TOKEN} |
| 164 | +``` |
| 165 | +
|
| 166 | +Block descriptions: |
| 167 | +
|
| 168 | +* Inputs. This block specifies where logs are read from and how they are parsed. Here, `*.log` files are read from the `/var/log/containers/` folder, which is mounted from the host |
| 169 | +
|
| 170 | +* Filters. This block specifies how logs are processed. Here, the `kubernetes` filter attaches the corresponding pod metadata to each record, and the `modify` filter removes the unused fields (`_p`, `time`) |
| 171 | +
|
| 172 | +* Outputs. This block specifies where logs are sent. Here, they go to the `fluent-bit/log` table in the {{ ydb-short-name }} cluster. The cluster connection parameters (`ConnectionURL`, `CredentialsToken`) are taken from the corresponding environment variables: `OUTPUT_YDB_CONNECTION_URL` and `OUTPUT_YDB_CREDENTIALS_TOKEN` |
| 173 | + |
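Because the `Columns` mapping has to name columns that actually exist in the target table, a quick consistency check can catch a mismatch before deployment. The sketch below is a standalone Python helper and not part of the plugin. The column set is copied from the `CREATE TABLE` statement in this guide, and the mapping shown is an example that targets exactly those columns.

```python
import json

# Columns of the `fluent-bit/log` table, copied from the CREATE TABLE statement.
TABLE_COLUMNS = {"timestamp", "file", "pipe", "message", "message_parsed", "kubernetes"}


def unknown_columns(columns_json: str, table_columns: set) -> list:
    """Return mapping targets that are not columns of the table, sorted by name."""
    mapping = json.loads(columns_json)  # record key -> table column
    return sorted(set(mapping.values()) - table_columns)


# Example mapping (an assumption for illustration): every target is a table column.
columns = (
    '{".timestamp":"timestamp",".input":"file","log":"message",'
    '"log_parsed":"message_parsed","stream":"pipe","kubernetes":"kubernetes"}'
)
print(unknown_columns(columns, TABLE_COLUMNS))
```

An empty result means every mapped field has a matching column. Any name in the result points to either a typo in the mapping or a column missing from the table.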
| 174 | +Environment variables are defined as follows: |
| 175 | + |
| 176 | +```yaml |
| 177 | +env: |
| 178 | + - name: OUTPUT_YDB_CONNECTION_URL |
| 179 | + value: grpc://ydb-endpoint:2135/path/to/database |
| 180 | + - name: OUTPUT_YDB_CREDENTIALS_TOKEN |
| 181 | + valueFrom: |
| 182 | + secretKeyRef: |
| 183 | + key: token |
| 184 | + name: fluent-bit-ydb-plugin-token |
| 185 | +``` |
| 186 | + |
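A malformed connection URL is an easy mistake to make in `values.yaml`. As a rough sketch, the Python helper below splits a URL of the form `scheme://host:port/database`, as in the example value above, into its parts. Restricting the scheme to `grpc` or `grpcs` and requiring an explicit port are assumptions made for this illustration; the plugin itself may accept other forms.

```python
from urllib.parse import urlsplit


def split_connection_url(url: str) -> dict:
    """Break a connection URL into scheme, host, port, and database path."""
    parts = urlsplit(url)
    # Assumption: only plain and TLS gRPC schemes are expected here.
    if parts.scheme not in ("grpc", "grpcs"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if parts.hostname is None or parts.port is None or parts.path in ("", "/"):
        raise ValueError(f"expected scheme://host:port/database, got {url!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path,
    }


print(split_connection_url("grpc://ydb-endpoint:2135/path/to/database"))
```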
| 187 | +The secret with the authorization token must be created in the cluster in advance, in the namespace where FluentBit will run. For example: |
| 188 | + |
| 189 | +```sh |
| 190 | +kubectl create secret -n ydb-fluent-bit-integration generic fluent-bit-ydb-plugin-token --from-literal=token=<YDB TOKEN> |
| 191 | +``` |
| 192 | + |
| 193 | +## FluentBit deployment |
| 194 | + |
| 195 | +Helm is a tool for packaging and installing applications in a Kubernetes cluster. To deploy FluentBit, first add the chart repository: |
| 196 | + |
| 197 | +```sh |
| 198 | +helm repo add fluent https://fluent.github.io/helm-charts |
| 199 | +``` |
| 200 | + |
| 201 | +Install FluentBit in the Kubernetes cluster with the following command, where `values.yaml` contains the settings described above: |
| 202 | + |
| 203 | +```sh |
| 204 | +helm upgrade --install fluent-bit fluent/fluent-bit \ |
| 205 | + --version 0.37.1 \ |
| 206 | + --namespace ydb-fluent-bit-integration \ |
| 207 | + --create-namespace \ |
| 208 | + --values values.yaml |
| 209 | +``` |
| 210 | + |
| 211 | +## Verify the installation |
| 212 | + |
| 213 | +Check that fluent-bit has started by reading its logs (there should be no [error] level entries): |
| 214 | + |
| 215 | +```sh |
| 216 | +kubectl logs -n ydb-fluent-bit-integration -l app.kubernetes.io/instance=fluent-bit |
| 217 | +``` |
| 218 | + |
| 219 | +Check that there are records in the YDB table (they should appear within a few minutes of launching FluentBit): |
| 220 | + |
| 221 | +```sql |
| 222 | +SELECT * FROM `fluent-bit/log` ORDER BY `timestamp` DESC LIMIT 10 |
| 223 | +``` |
| 224 | + |
| 225 | +## Resource cleanup |
| 226 | + |
| 227 | +To clean up, delete the FluentBit namespace: |
| 228 | + |
| 229 | +```sh |
| 230 | +kubectl delete namespace ydb-fluent-bit-integration |
| 231 | +``` |
| 232 | + |
| 233 | +Then drop the table with logs: |
| 234 | + |
| 235 | +```sql |
| 236 | +DROP TABLE `fluent-bit/log` |
| 237 | +``` |