storage,sql: fix naming nits in mz_source_statistics
To comport with the system catalog style guide:
* Change `rehydration_latency_ms uint8` to `rehydration_latency
interval`, since interval types are preferred to integers where the
units are indicated in the column name's suffix.
This rule was undocumented, so also add it to the style guide.
* Rename `envelope_state_count` to `envelope_state_records`. This one
is more subjective, but "records" aligns with names used in other
relations, like `mz_dataflow_arrangement_sizes.records` and
`mz_records_per_dataflow`.
doc/developer/style.md (+2 -1)
@@ -128,11 +128,12 @@ We adhere to standards for our system catalog relations (tables, views), which i
 Modeling standards:
 * Normalize the schema for tables. If you’re adding a table that adds detail to rows in an existing table, refer to those rows by ID, and don’t duplicate columns that already exist. E.g., the `mz_kafka_sources` table does not include the name of the source, since that information is available in the `mz_sources` table.
 * Remember, Materialize is good at joins! We can always add syntax sugar via a `SHOW` command or a view to spare users from typing out the joins for common queries.
+* Use the `interval` type to represent durations. Do not use integers where the unit is indicated as a suffix on the column name. E.g., use `startup_time interval`, not `startup_time_ms integer`. The only exception is durations with nanosecond precision. Since the `interval` type only has millisecond precision, it is acceptable to use `<$name>_ns integer` when necessary (e.g., `delay_ns`).

 Naming standards:
 * Catalog relation names should be consistent with the user-facing naming and messaging in our docs. The names should not reference internal-only concepts when possible.
 * Avoid all but the most common abbreviations. Say `position` instead of `pos`. Say `return` instead of `ret`. Say `definition` instead of `def`.
-* We allow three abbreviations at present: `id`, `oid`, and `ip`.
+* We allow four abbreviations at present: `id`, `oid`, `ip`, and `url`.
 * Use `kebab-case` for enum values. E.g., the `type` of a Confluent Schema Registry connection is `confluent-schema-registry` and the `type` of a materialized view is `materialized-view`. Only use hyphens to separate multiple words. Don’t introduce hyphens for CamelCased proper nouns. For example, the “AWS PrivateLink” connection is represented as `aws-privatelink`.
 * Name timestamp fields with an `_at` suffix, e.g., `occurred_at`.
 * Do not name boolean fields with an `is_` prefix. E.g., say `indexed`, not `is_indexed`.
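
As an illustration of the rules touched by this hunk, here is a minimal sketch of a column-name check. It is not part of the commit or of any Materialize tooling: the function `check_column`, the set of integer type names, and the list of unit suffixes are all assumptions chosen for the example.

```python
# Hypothetical checker for two rules from the style guide hunk above.
# None of these names come from the Materialize codebase.

# Assumed spellings of integer column types.
INTEGER_TYPES = {"integer", "bigint", "int4", "int8", "uint4", "uint8"}

# Assumed unit suffixes that signal a duration stored as an integer.
# `_ns` is deliberately absent: the guide permits `<$name>_ns integer`.
DURATION_SUFFIXES = ("_ms", "_us", "_s")


def check_column(name: str, type_: str) -> list[str]:
    """Return a list of style problems for one catalog column."""
    problems = []
    if type_ in INTEGER_TYPES and name.endswith(DURATION_SUFFIXES):
        problems.append(f"{name}: use `interval` instead of a unit suffix")
    if name.startswith("is_"):
        problems.append(f"{name}: drop the `is_` prefix on boolean fields")
    return problems


if __name__ == "__main__":
    for column in [
        ("rehydration_latency_ms", "uint8"),
        ("rehydration_latency", "interval"),
        ("delay_ns", "integer"),
        ("is_indexed", "boolean"),
    ]:
        print(column, check_column(*column))
```

Under these assumptions, the old column `rehydration_latency_ms uint8` is flagged, while the renamed `rehydration_latency interval` and the permitted `delay_ns integer` pass.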
-|`id`|[`text`]| The ID of the source. Corresponds to [`mz_catalog.mz_sources.id`](../mz_catalog#mz_sources). |
-|`worker_id`|[`uint8`]| The ID of the worker thread. |
-|`snapshot_committed`|[`boolean`]| Whether the worker has committed the initial snapshot for a source. |
-|`messages_received`|[`uint8`]| The number of messages the worker has received from the external system. Messages are counted in a source type-specific manner. Messages do not correspond directly to updates: some messages produce multiple updates, while other messages may be coalesced into a single update. |
-|`updates_staged`|[`uint8`]| The number of updates (insertions plus deletions) the worker has written but not yet committed to the storage layer. |
-|`updates_committed`|[`uint8`]| The number of updates (insertions plus deletions) the worker has committed to the storage layer. |
-|`bytes_received`|[`uint8`]| The number of bytes the worker has read from the external system. Bytes are counted in a source type-specific manner and may or may not include protocol overhead. |
-|`envelope_state_bytes`|[`uint8`]| The number of bytes stored in the source envelope state. |
-|`envelope_state_count`|[`uint8`]| The number of individual records stored in the source envelope state. |
-|`rehydration_latency_ms`|[`uint8`]| The amount of time in milliseconds it took for the worker to rehydrate the source envelope state. |
+|`id`|[`text`]| The ID of the source. Corresponds to [`mz_catalog.mz_sources.id`](../mz_catalog#mz_sources). |
+|`worker_id`|[`uint8`]| The ID of the worker thread. |
+|`snapshot_committed`|[`boolean`]| Whether the worker has committed the initial snapshot for a source. |
+|`messages_received`|[`uint8`]| The number of messages the worker has received from the external system. Messages are counted in a source type-specific manner. Messages do not correspond directly to updates: some messages produce multiple updates, while other messages may be coalesced into a single update. |
+|`updates_staged`|[`uint8`]| The number of updates (insertions plus deletions) the worker has written but not yet committed to the storage layer. |
+|`updates_committed`|[`uint8`]| The number of updates (insertions plus deletions) the worker has committed to the storage layer. |
+|`bytes_received`|[`uint8`]| The number of bytes the worker has read from the external system. Bytes are counted in a source type-specific manner and may or may not include protocol overhead. |
+|`envelope_state_bytes`|[`uint8`]| The number of bytes stored in the source envelope state. |
+|`envelope_state_records`|[`uint8`]| The number of individual records stored in the source envelope state. |
+|`rehydration_latency`|[`interval`]| The amount of time it took for the worker to rehydrate the source envelope state. |