From f924b0d7a2bb4a8f98e1d28078c6cdafaa86f23d Mon Sep 17 00:00:00 2001 From: Andreja Tonev Date: Fri, 18 Apr 2025 17:52:58 +0200 Subject: [PATCH 1/3] Expanding docs regarding the storage access New file: storage-access.mdx Modified python docs and transactions docs --- pages/client-libraries/python.mdx | 16 +++++- pages/fundamentals/_meta.ts | 1 + pages/fundamentals/storage-access.mdx | 59 +++++++++++++++++++++++ pages/help-center/errors/transactions.mdx | 32 ++++++++++-- 4 files changed, 102 insertions(+), 6 deletions(-) create mode 100644 pages/fundamentals/storage-access.mdx diff --git a/pages/client-libraries/python.mdx b/pages/client-libraries/python.mdx index fbe40f50a..7db73b206 100644 --- a/pages/client-libraries/python.mdx +++ b/pages/client-libraries/python.mdx @@ -435,7 +435,7 @@ for record in records: print(path.end_node) ``` -Path will contain [Nodes](#process-the-node-result) and [Relationships[#process-the-relationship-result], that can be accessed in the same way as in the previous examples. +Path will contain [Nodes](#process-the-node-result) and [Relationships](#process-the-relationship-result), that can be accessed in the same way as in the previous examples. ### Transaction management @@ -452,6 +452,9 @@ In v2.10, Memgraph added the [multi-tenant support](/database-management/multi-t The `execute_query()` procedure automatically creates a transaction that can include multiple Cypher statements as a single query. If the transaction fails, the procedure will automatically rerun it. +As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. +Memgraph will automatically deduce the appropriate access type when executing single queries using `execute_query()`. + Bolt protocol specifies additional [metadata](/database-management/query-metadata) that can be sent along with the requested results. 
Metadata can be divided into two groups: query statistics and notifications. The query statistics metadata provides query counters that indicate the changes that the **write query** triggered on the server. @@ -545,6 +548,9 @@ With sessions, you can run: ##### Managed transactions To create a managed transaction, use `Session.execute_read()` procedure for read queries and `Session.execute_write()` procedure for write queries. +As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. +An exception will be thrown if the user tries to execute a write query inside a read transaction. See [transaction accessor misalignment](/fundamentals/transactions#transaction-accessor-misalignment) for more details. + ```python def match_user(tx, name): @@ -581,6 +587,10 @@ To maintain multiple concurrent transactions, use [multiple concurrent sessions] With explicit transactions, you can get **complete control over transactions**. To begin a transaction, run `Session.begin_transaction()` procedure and to run a transaction, use `Transaction.run()` procedure. Explicit transactions offer the possibility of explicitly controlling the end of a transaction with `Transaction.commit()`, `Transaction.rollback()` or `Transaction.close()` methods. +As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. +Explicit transactions can cover a number of individual queries, but storage access is given at the start. For best performance, the user needs to declare whether the transaction should use read or write access. +This can be done by setting the session's `default_access_mode` to `"r"` or `"w"`. This will in turn set the access mode of a transaction created via the `begin_transaction` function. 
Note that `execute_read` and `execute_write` will override the session's default access. + Use explicit transaction if you need to **distribute Cypher execution across multiple functions for the same transaction** or if you need to **run multiple queries within a single transactions without automatic retries**. The following example shows how to explicitly control the transaction of changing account balances based on a token transfer: @@ -610,7 +620,7 @@ def create_users(client, sender, receiver): def transfer_tokens(client, sender_id, receiver_id, num_of_tokens): - with client.session(database="memgraph") as session: + with client.session(database="memgraph", default_access_mode="w") as session: tx = session.begin_transaction() try: @@ -675,6 +685,8 @@ In the above example, if John's account balance is changed to a number less than Implicit or auto-commit transactions are the simplest way to run a Cypher query since they won't be automatically retried as with `execute_query()` procedure or managed transactions. With implicit transactions, you don't have the same control of transaction as with explicit transactions, so they are mostly used for quick prototyping. +As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. +Access mode is automatically determined when executing single queries through implicit transactions. 
To run an implicit transaction, use the `Session.run()` method: diff --git a/pages/fundamentals/_meta.ts b/pages/fundamentals/_meta.ts index a711ae12a..fa851fd40 100644 --- a/pages/fundamentals/_meta.ts +++ b/pages/fundamentals/_meta.ts @@ -3,6 +3,7 @@ export default { "data-types": "Data types", "data-durability": "Data durability", "indexes": "Indexes", + "storage-access": "Storage access", "storage-memory-usage": "Storage memory usage", "telemetry": "Telemetry", "transactions": "Transactions", diff --git a/pages/fundamentals/storage-access.mdx b/pages/fundamentals/storage-access.mdx new file mode 100644 index 000000000..5b4055c46 --- /dev/null +++ b/pages/fundamentals/storage-access.mdx @@ -0,0 +1,59 @@ +--- +title: Storage access +description: Understand how Memgraph accesses the storage layer. A detailed resource to optimize multi-client throughput. +--- + +import { Callout } from 'nextra/components' + +# Storage access + +The storage (or storage layer) refers to all data associated with the graph itself. +This means vertices, edges, their properties, label and other data. + +Queries that are reading or writing to the graph are said to be accessing the storage. +These accesses are mediated through the `storage accessors`. +The accessors guarantee a transactional view and concurrency safety. + +## Storage accessors + +There are 3 types of accessors: +- **Shared access**: Allows multiple queries to run in parallel, marked as either read or write. +- **Read-only access**: Permits multiple read queries to run in parallel but forbids any write operations or queries requiring unique access. +- **Unique access**: Grants exclusive access to a single query, preventing any other type of access during its execution. + +**Shared access** is the most common access granted. Any data oriented Cypher query will use it. + +**Read-only access** is currently used only by `CREATE SNAPSHOT` when in ANALYTICAL mode. 
Using the read-only access guarantees that the snapshot is consistent, while also allowing for other (shared access) read queries to run in parallel. + +**Unique access** queries is used by queries that require full control over the storage layer. These are: +- Index queries +- Constraint queries +- TTL setup queries +- Enum setup queries +- `DROP GRAPH` query +- `RECOVER SNAPSHOT` query + +### Deducing the accessor type needed + +The type is deduced automatically at parse time. +Read-only and unique accesses are given based on the query type (as described in the previous section). +The shared access needs to additionally mark a query as read or write. This is also done automatically at parse time. + +The only instance where the user needs to explicitly specify the desired shared access type is when creating a managed (explicit) transaction. +These transactions acquire and hold the storage accessor at the start of their execution. +By default, a write shared access is taken, but this can limit which queries can run in parallel. For the best performance, it is recommended to mark transactions with the desired access type. +For more details, refer to [Transactions](/fundamentals/transactions). + +## Queries that do not require storage access + +Queries that do not read or modify any graph data do not need storage access. +These queries are: +- Auth queries +- Multi-tenant queries +- Replication queries +- Show config queries +- Setting queries +- Version queries +- Transaction queue queries + +Please note that even if these queries do not access the storage, they still might be accessing a shared resource and could block or throw if called in parallel. 
diff --git a/pages/help-center/errors/transactions.mdx b/pages/help-center/errors/transactions.mdx index 6e2a68188..6e97bf6fc 100644 --- a/pages/help-center/errors/transactions.mdx +++ b/pages/help-center/errors/transactions.mdx @@ -99,6 +99,28 @@ While some client drivers may handle serialization errors by retrying transactio developers should not rely solely on this mechanism. Always include comprehensive error handling in your application to address cases where the error persists beyond the retry logic. +## Transaction accessor misalignment + +### Error message + +1. **Accessor type {} and query type {} are misaligned!** + +### Handling transaction timeout + +Transactions in Memgraph must acquire the appropriate type of storage access at the start of their execution. + +This access can be one of the following types: +- **Shared access**: Allows multiple queries to run in parallel, marked as either read or write. +- **Read-only access**: Permits multiple read queries to run in parallel but forbids any write operations or queries requiring unique access. +- **Unique access**: Grants exclusive access to a single query, preventing any other type of access during its execution. + +For more information regarding storage access, please refer to [Storage access](/fundamentals/storage-access). + +While single queries can be parsed and the correct type of storage access can be determined automatically by Memgraph, this is not the case for explicit (managed) transactions. +In managed transactions, the database cannot infer the required access type in advance because the transaction's operations are not known at the beginning. +This can lead to storage access misalignment if the requested access type does not match the operations being performed. + +See the appropriate driver's documentation for more information on how to define a transaction's type. 
## Transaction timeout @@ -119,15 +141,17 @@ Here are the [instructions](/configuration/configuration-settings#using-flags-an Here are the storage access error messages you might encounter: -1. **Cannot access storage, unique access query is running. Try again later.** +1. **Cannot get shared access storage. Try stopping other queries that are running in parallel.** 2. **Cannot get unique access to the storage. Try stopping other queries that are running in parallel.** +3. **Cannot get read only access to the storage. Try stopping other queries that are running in parallel.** ### Understanding storage access timeout -Storage access timeouts occur during query preparation when the query execution engine cannot get the required type of access to the storage. There are two types of storage access: +Storage access timeouts occur during query preparation when the query execution engine cannot get the required type of access to the storage. There are three types of storage access: -- **Shared access**: Multiple queries can have shared access at the same time, but shared access cannot be granted while a query with unique access is running. -- **Unique access**: Only one query can have unique access at a time, and no other query can have any type of access during that period. +- **Shared access**: Multiple queries can have shared access at the same time. These queries are marked with a read or write type, allowing Memgraph to efficiently execute multiple operations in parallel without conflicts. +- **Unique access**: Only one query can have unique access at a time, and no other access type can be granted during that period. +- **Read-only access**: Queries with read-only access allow other read queries to run in parallel but forbid any write operations or unique access queries. These timeouts prevent worker starvation and database blocking that could occur if queries were to wait indefinitely for storage access. 
From 1c02716482b017f7a38699465b07f7315016134e Mon Sep 17 00:00:00 2001 From: Andreja Tonev Date: Tue, 22 Apr 2025 14:10:03 +0200 Subject: [PATCH 2/3] PR comments --- pages/client-libraries/python.mdx | 20 ++++++++++++-------- pages/fundamentals/storage-access.mdx | 22 ++++++++++------------ pages/help-center/errors/transactions.mdx | 2 +- 3 files changed, 23 insertions(+), 21 deletions(-) diff --git a/pages/client-libraries/python.mdx b/pages/client-libraries/python.mdx index 7db73b206..eb73661c9 100644 --- a/pages/client-libraries/python.mdx +++ b/pages/client-libraries/python.mdx @@ -452,9 +452,6 @@ In v2.10, Memgraph added the [multi-tenant support](/database-management/multi-t The `execute_query()` procedure automatically creates a transaction that can include multiple Cypher statements as a single query. If the transaction fails, the procedure will automatically rerun it. -As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. -Memgraph will automatically deduce the appropriate access type when executing single queries using `execute_query()`. - Bolt protocol specifies additional [metadata](/database-management/query-metadata) that can be sent along with the requested results. Metadata can be divided into two groups: query statistics and notifications. The query statistics metadata provides query counters that indicate the changes that the **write query** triggered on the server. @@ -548,9 +545,11 @@ With sessions, you can run: ##### Managed transactions To create a managed transaction, use `Session.execute_read()` procedure for read queries and `Session.execute_write()` procedure for write queries. -As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. 
-An exception will be thrown if the user tries to execute a write query inside a read transaction. See [transaction accessor misalignment](/fundamentals/transactions#transaction-accessor-misalignment) for more details. + +As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. +An exception will be thrown if the user tries to execute a write query inside a read transaction. For more details, see [transaction accessor misalignment](/fundamentals/transactions#transaction-accessor-misalignment). + ```python def match_user(tx, name): @@ -587,9 +586,11 @@ To maintain multiple concurrent transactions, use [multiple concurrent sessions] With explicit transactions, you can get **complete control over transactions**. To begin a transaction, run `Session.begin_transaction()` procedure and to run a transaction, use `Transaction.run()` procedure. Explicit transactions offer the possibility of explicitly controlling the end of a transaction with `Transaction.commit()`, `Transaction.rollback()` or `Transaction.close()` methods. -As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. + +As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. Explicit transactions can cover a number of individual queries, but storage access is given at the start. For best performance, the user needs to declare whether the transaction should use read or write access. -This can be done by setting the session's `default_access_mode` to `"r"` or `"w"`. This will in turn set the access mode of a transaction created via the `begin_transaction` function. 
Note that `execute_read` and `execute_write` will override the session's default access. +This can be done by setting the session's `default_access_mode` to `"r"` or `"w"`. This will, in turn, set the access mode of a transaction created via the `begin_transaction` function. Note that `execute_read` and `execute_write` will override the session's default access. + Use explicit transaction if you need to **distribute Cypher execution across multiple functions for the same transaction** or if you need to **run multiple queries within a single transactions without automatic retries**. @@ -685,8 +686,11 @@ In the above example, if John's account balance is changed to a number less than Implicit or auto-commit transactions are the simplest way to run a Cypher query since they won't be automatically retried as with `execute_query()` procedure or managed transactions. With implicit transactions, you don't have the same control of transaction as with explicit transactions, so they are mostly used for quick prototyping. -As of Memgraph version 3.2, queries are categorized as read or write and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. + + +As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput. Access mode is automatically determined when executing single queries through implicit transactions. + To run an implicit transaction, use the `Session.run()` method: diff --git a/pages/fundamentals/storage-access.mdx b/pages/fundamentals/storage-access.mdx index 5b4055c46..9b5094fd3 100644 --- a/pages/fundamentals/storage-access.mdx +++ b/pages/fundamentals/storage-access.mdx @@ -8,11 +8,9 @@ import { Callout } from 'nextra/components' # Storage access The storage (or storage layer) refers to all data associated with the graph itself. 
-This means vertices, edges, their properties, label and other data. +This means vertices, edges, their properties, labels, and other data. -Queries that are reading or writing to the graph are said to be accessing the storage. -These accesses are mediated through the `storage accessors`. -The accessors guarantee a transactional view and concurrency safety. +Queries that read from or write to the graph are considered to be accessing the storage. These storage accesses are managed through `storage accessors`, which guarantee a transactional view and maintain concurrency safety. ## Storage accessors @@ -23,15 +21,15 @@ There are 3 types of accessors: **Shared access** is the most common access granted. Any data oriented Cypher query will use it. -**Read-only access** is currently used only by `CREATE SNAPSHOT` when in ANALYTICAL mode. Using the read-only access guarantees that the snapshot is consistent, while also allowing for other (shared access) read queries to run in parallel. +**Read-only access** is currently used only by `CREATE SNAPSHOT` when in `IN_MEMORY_ANALYTICAL` storage mode. Using the read-only access guarantees that the snapshot is consistent, while also allowing for other (shared access) read queries to run in parallel. -**Unique access** queries is used by queries that require full control over the storage layer. These are: -- Index queries -- Constraint queries -- TTL setup queries -- Enum setup queries -- `DROP GRAPH` query -- `RECOVER SNAPSHOT` query +**Unique access** queries are used by queries that require full control over the storage layer. 
These are: +- [Index queries](/fundamentals/indexes) +- [Constraint queries](/fundamentals/constraints) +- [TTL setup queries](/querying/time-to-live) +- [Enum setup queries](/fundamentals/data-types#enum) +- [`DROP GRAPH` query](/querying/clauses/drop-graph) +- [`RECOVER SNAPSHOT` query](/database-management/backup-and-restore#restore-data) ### Deducing the accessor type needed diff --git a/pages/help-center/errors/transactions.mdx b/pages/help-center/errors/transactions.mdx index 6e97bf6fc..5d0437c5e 100644 --- a/pages/help-center/errors/transactions.mdx +++ b/pages/help-center/errors/transactions.mdx @@ -103,7 +103,7 @@ in your application to address cases where the error persists beyond the retry l ### Error message -1. **Accessor type {} and query type {} are misaligned!** +1. **Accessor type `{{}}` and query type `{{}}` are misaligned!** ### Handling transaction timeout From 101f4613b6cedf437ecfe78b83f7fc869e05ac8f Mon Sep 17 00:00:00 2001 From: Matea Pesic <80577904+matea16@users.noreply.github.com> Date: Tue, 22 Apr 2025 14:27:52 +0200 Subject: [PATCH 3/3] Update pages/help-center/errors/transactions.mdx --- pages/help-center/errors/transactions.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pages/help-center/errors/transactions.mdx b/pages/help-center/errors/transactions.mdx index 5d0437c5e..0354485b9 100644 --- a/pages/help-center/errors/transactions.mdx +++ b/pages/help-center/errors/transactions.mdx @@ -103,7 +103,7 @@ in your application to address cases where the error persists beyond the retry l ### Error message -1. **Accessor type `{{}}` and query type `{{}}` are misaligned!** +1. **Accessor type `{}` and query type `{}` are misaligned!** ### Handling transaction timeout