cockroachdb · Apr 10, 2025
diff --git a/src/current/_includes/cockroachcloud/metrics-usage/sql.conn.latency.md b/src/current/_includes/cockroachcloud/metrics-usage/sql.conn.latency.md
+1 -3
diff --git a/src/current/_includes/cockroachcloud/metrics-usage/sql.conns.md b/src/current/_includes/cockroachcloud/metrics-usage/sql.conns.md
+1 -5
diff --git a/src/current/_includes/v23.1/essential-metrics.md b/src/current/_includes/v23.1/essential-metrics.md
+2 -1
diff --git a/src/current/_includes/v23.2/essential-metrics.md b/src/current/_includes/v23.2/essential-metrics.md
+2 -1
diff --git a/src/current/_includes/v24.1/essential-metrics.md b/src/current/_includes/v24.1/essential-metrics.md
+2 -1
@@ -1,3 +1 @@
-Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and establishes the connection to the client, including [authentication]({% link cockroachcloud/authentication.md %}). This graph shows the p90 and p99 latencies for [SQL connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) to the cluster.
-
-These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times.
+Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and establishes the connection to the client, including [authentication]({% link cockroachcloud/authentication.md %}). This graph shows the p90 and p99 latencies for [SQL connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) to the cluster.<br /><br />These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.
@@ -1,5 +1 @@
-This metric shows the total number of SQL [client connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) across the cluster.
-
-Refer to the [<b>Sessions</b> page]({% link cockroachcloud/sessions-page.md %}) for more details on the sessions.
-
-This metric also shows the distribution, or balancing, of connections across the cluster. Review [Connection Pooling]({% link {{ site.current_cloud_version }}/connection-pooling.md %}).
+This metric shows the total number of SQL [client connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) across the cluster.<br /><br />Refer to the [<b>Sessions</b> page]({% link cockroachcloud/sessions-page.md %}) for more details on the sessions.<br /><br />This metric also shows the distribution, or balancing, of connections across the cluster. Review [Connection Pooling]({% link {{ site.current_cloud_version }}/connection-pooling.md %}).
@@ -111,7 +111,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
 | sql.txn.latency-p90, sql.txn.latency-p99              | sql.txn.latency                                              | Latency of SQL transactions                                  | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
 | txnwaitqueue.deadlocks_total                          | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue   | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
 | sql.distsql.contended_queries.count                   | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention            | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
-| sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
+| <a id="sql-conn-failures"></a>sql.conn.failures            | sql.conn.failures.count                                             | Number of SQL connection failures       | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
+| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.|
 | txn.restarts.serializable                             | txn.restarts.serializable                                    | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetooold                              | txn.restarts.writetooold                                     | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetoooldmulti                         | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 
@@ -111,7 +111,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
 | sql.txn.latency-p90, sql.txn.latency-p99              | sql.txn.latency                                              | Latency of SQL transactions                                  | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
 | txnwaitqueue.deadlocks_total                          | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue   | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
 | sql.distsql.contended_queries.count                   | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention            | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
-| sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
+| <a id="sql-conn-failures"></a>sql.conn.failures            | sql.conn.failures.count                                             | Number of SQL connection failures       | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
+| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.|
 | txn.restarts.serializable                             | txn.restarts.serializable                                    | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetooold                              | txn.restarts.writetooold                                     | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetoooldmulti                         | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 
@@ -125,7 +125,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
 | sql.txn.latency-p90, sql.txn.latency-p99              | sql.txn.latency                                              | Latency of SQL transactions                                  | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
 | txnwaitqueue.deadlocks_total                          | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue   | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
 | sql.distsql.contended_queries.count                   | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention            | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
-| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
+| <a id="sql-conn-failures"></a>sql.conn.failures            | sql.conn.failures.count                                             | Number of SQL connection failures       | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
+| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99            | sql.conn.latency                                             | Latency to establish and authenticate a SQL connection       | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.|
 | txn.restarts.serializable                             | txn.restarts.serializable                                    | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetooold                              | txn.restarts.writetooold                                     | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
 | txn.restarts.writetoooldmulti                         | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
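
The rows added in the hunks above say that `sql.conn.failures` is incremented on every failed connection attempt and that failures are not recorded in `sql.conn.latency`. A minimal sketch of that relationship follows. It is an illustration only, not CockroachDB code: the `ConnMetrics` class, its method names, and the nearest-rank percentile are all assumptions made for the example.

```python
import math


class ConnMetrics:
    """Hypothetical stand-in for the two metrics; not CockroachDB's implementation."""

    def __init__(self):
        self.failures = 0          # plays the role of sql.conn.failures
        self.latencies_ns = []     # samples behind sql.conn.latency, in nanoseconds

    def record_attempt(self, latency_ns, ok):
        """Record one connection attempt.

        Successful attempts contribute a latency sample; failed attempts
        (including timeouts) only increment the failure counter.
        """
        if ok:
            self.latencies_ns.append(latency_ns)
        else:
            self.failures += 1

    def percentile_ms(self, p):
        """Nearest-rank percentile of successful connection latencies, in milliseconds."""
        if not self.latencies_ns:
            return None
        ordered = sorted(self.latencies_ns)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1] / 1e6


# Ten successful connections taking 1 ms to 10 ms, plus one 30 s timeout.
m = ConnMetrics()
for i in range(1, 11):
    m.record_attempt(i * 1_000_000, ok=True)
m.record_attempt(30_000_000_000, ok=False)

p90 = m.percentile_ms(90)
p99 = m.percentile_ms(99)
```

In this sketch the 30-second timeout raises `m.failures` but leaves `p90` and `p99` untouched, which is why the two metrics are worth reading together: a healthy-looking latency percentile does not rule out failed connection attempts.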