Update storage access types #1237

Merged
merged 4 commits into from
Apr 22, 2025
20 changes: 18 additions & 2 deletions pages/client-libraries/python.mdx
@@ -435,7 +435,7 @@ for record in records:
print(path.end_node)
```

Path will contain [Nodes](#process-the-node-result) and [Relationships[#process-the-relationship-result], that can be accessed in the same way as in the previous examples.
A path contains [Nodes](#process-the-node-result) and [Relationships](#process-the-relationship-result), which can be accessed in the same way as in the previous examples.

### Transaction management

@@ -546,6 +546,11 @@ With sessions, you can run:

To create a managed transaction, use `Session.execute_read()` procedure for read queries and `Session.execute_write()` procedure for write queries.

<Callout type="info">
As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput.
An exception will be thrown if the user tries to execute a write query inside a read transaction. For more details, see [transaction accessor misalignment](/fundamentals/transactions#transaction-accessor-misalignment).
</Callout>

```python
def match_user(tx, name):
result = tx.run(
@@ -581,6 +586,12 @@ To maintain multiple concurrent transactions, use [multiple concurrent sessions]
With explicit transactions, you can get **complete control over transactions**. To begin a transaction, run `Session.begin_transaction()` procedure and to run a transaction, use `Transaction.run()` procedure.
Explicit transactions offer the possibility of explicitly controlling the end of a transaction with `Transaction.commit()`, `Transaction.rollback()` or `Transaction.close()` methods.

<Callout type="info">
As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput.
Explicit transactions can cover a number of individual queries, but storage access is given at the start. For best performance, the user needs to declare whether the transaction should use read or write access.
This can be done by setting the session's `default_access_mode` to `"r"` or `"w"`. This will, in turn, set the access mode of a transaction created via the `begin_transaction` function. Note that `execute_read` and `execute_write` will override the session's default access.
</Callout>
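
As an illustration of the callout above, a read-only explicit transaction could look like the following sketch. It assumes `client` is a driver instance created as in the other examples on this page; the function name `count_users`, the `User` label and the Cypher query are hypothetical and only serve the example.

```python
def count_users(client):
    """Count User nodes inside a read-only explicit transaction."""
    # "r" makes begin_transaction() ask for read access (per the callout above).
    with client.session(database="memgraph", default_access_mode="r") as session:
        tx = session.begin_transaction()
        try:
            record = tx.run("MATCH (u:User) RETURN count(u) AS n").single()
            tx.commit()
            return record["n"]
        finally:
            # close() rolls back if the transaction was not committed.
            tx.close()
```

Running a write query through `tx.run()` in this session would be expected to fail, because the transaction has already been given read access.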

Use explicit transactions if you need to **distribute Cypher execution across multiple functions for the same transaction** or if you need to **run multiple queries within a single transaction without automatic retries**.

The following example shows how to explicitly control the transaction of changing account balances based on a token transfer:
@@ -610,7 +621,7 @@ def create_users(client, sender, receiver):


def transfer_tokens(client, sender_id, receiver_id, num_of_tokens):
with client.session(database="memgraph") as session:
with client.session(database="memgraph", default_access_mode="w") as session:
tx = session.begin_transaction()

try:
@@ -676,6 +687,11 @@ In the above example, if John's account balance is changed to a number less than
Implicit or auto-commit transactions are the simplest way to run a Cypher query, since they are not automatically retried the way `execute_query()` calls or managed transactions are.
With implicit transactions, you don't have the same control over the transaction as with explicit transactions, so they are mostly used for quick prototyping.

<Callout type="info">
As of Memgraph version 3.2, queries are categorized as **read** or **write** and the corresponding storage access is taken. This allows for better query parallelization and higher throughput.
Access mode is automatically determined when executing single queries through implicit transactions.
</Callout>

To run an implicit transaction, use the `Session.run()` method:

```python
1 change: 1 addition & 0 deletions pages/fundamentals/_meta.ts
@@ -3,6 +3,7 @@ export default {
"data-types": "Data types",
"data-durability": "Data durability",
"indexes": "Indexes",
"storage-access" : "Storage access",
"storage-memory-usage": "Storage memory usage",
"telemetry": "Telemetry",
"transactions": "Transactions",
57 changes: 57 additions & 0 deletions pages/fundamentals/storage-access.mdx
@@ -0,0 +1,57 @@
---
title: Storage access
description: Understand how Memgraph accesses the storage layer. A detailed resource to optimize multi-client throughput.
---

import { Callout } from 'nextra/components'

# Storage access

The storage (or storage layer) refers to all data associated with the graph itself.
This means vertices, edges, their properties, labels, and other data.

Queries that read from or write to the graph are considered to be accessing the storage. These storage accesses are managed through `storage accessors`, which guarantee a transactional view and maintain concurrency safety.

## Storage accessors

There are 3 types of accessors:
- **Shared access**: Allows multiple queries to run in parallel; each query is marked as either read or write.
- **Read-only access**: Permits multiple read queries to run in parallel but forbids any write operations or queries requiring unique access.
- **Unique access**: Grants exclusive access to a single query, preventing any other type of access during its execution.

**Shared access** is the most common access granted. Any data-oriented Cypher query will use it.

**Read-only access** is currently used only by `CREATE SNAPSHOT` when in `IN_MEMORY_ANALYTICAL` storage mode. Using the read-only access guarantees that the snapshot is consistent, while also allowing for other (shared access) read queries to run in parallel.

**Unique access** is used by queries that require full control over the storage layer. These are:
- [Index queries](/fundamentals/indexes)
- [Constraint queries](/fundamentals/constraints)
- [TTL setup queries](/querying/time-to-live)
- [Enum setup queries](/fundamentals/data-types#enum)
- [`DROP GRAPH` query](/querying/clauses/drop-graph)
- [`RECOVER SNAPSHOT` query](/database-management/backup-and-restore#restore-data)
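
The three accessor types described above can be read as a small compatibility table. The sketch below is a toy model of that reading, not Memgraph's implementation: the `Access` enum and `can_run_together` are hypothetical names, and treating two read-only accesses as compatible is an assumption the text does not state.

```python
from enum import Enum


class Access(Enum):
    SHARED_READ = "shared (read)"
    SHARED_WRITE = "shared (write)"
    READ_ONLY = "read-only"
    UNIQUE = "unique"


def can_run_together(a: Access, b: Access) -> bool:
    """Toy compatibility check derived from the accessor descriptions."""
    # Unique access is exclusive: nothing else may run alongside it.
    if Access.UNIQUE in (a, b):
        return False
    # Read-only access forbids write operations while it is held.
    if {a, b} == {Access.READ_ONLY, Access.SHARED_WRITE}:
        return False
    # Remaining combinations are treated as compatible (an assumption for
    # READ_ONLY + READ_ONLY, which the text does not spell out).
    return True


if __name__ == "__main__":
    for a in Access:
        row = ", ".join(f"{b.name}={can_run_together(a, b)}" for b in Access)
        print(f"{a.name}: {row}")
```

Under this model, a snapshot holding read-only access can coexist with shared read queries but not with shared write queries, which matches the `CREATE SNAPSHOT` description.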

### Deducing the accessor type needed

The accessor type is deduced automatically at parse time.
Read-only and unique accesses are given based on the query type (as described in the previous section).
For shared access, the query must additionally be marked as read or write. This is also done automatically at parse time.

The only instance where the user needs to explicitly specify the desired shared access type is when creating managed (explicit) transactions.
These transactions acquire and hold the storage accessor at the start of their execution.
By default, shared write access is taken, but this can limit which queries can run in parallel. For the best performance, it is recommended to mark transactions with the desired access type.
For more details, refer to [Transactions](/fundamentals/transactions).

## Queries that do not require storage access

Queries that do not read or modify any graph data do not need storage access.
These queries are:
- Auth queries
- Multi-tenant queries
- Replication queries
- Show config queries
- Setting queries
- Version queries
- Transaction queue queries

Note that although these queries do not access the storage, they may still access a shared resource and could block or throw an error if called in parallel.
32 changes: 28 additions & 4 deletions pages/help-center/errors/transactions.mdx
@@ -99,6 +99,28 @@ While some client drivers may handle serialization errors by retrying transactio
developers should not rely solely on this mechanism. Always include comprehensive error handling
in your application to address cases where the error persists beyond the retry logic.

## Transaction accessor misalignment

### Error message

1. **Accessor type `{}` and query type `{}` are misaligned!**

### Handling transaction accessor misalignment

Transactions in Memgraph must acquire the appropriate type of storage access at the start of their execution.

This access can be one of the following types:
- **Shared access**: Allows multiple queries to run in parallel; each query is marked as either read or write.
- **Read-only access**: Permits multiple read queries to run in parallel but forbids any write operations or queries requiring unique access.
- **Unique access**: Grants exclusive access to a single query, preventing any other type of access during its execution.

For more information regarding storage access, please refer to [Storage access](/fundamentals/storage-access).

While single queries can be parsed and the correct type of storage access can be determined automatically by Memgraph, this is not the case for explicit (managed) transactions.
In managed transactions, the database cannot infer the required access type in advance because the transaction's operations are not known at the beginning.
This can lead to storage access misalignment if the requested access type does not match the operations being performed.

See the appropriate driver's documentation for more information on how to define the transaction's type.
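
The misalignment rule can be pictured with a small stand-alone check. This is only a sketch of the idea, not Memgraph's code: `AccessorMisalignmentError` and `check_alignment` are hypothetical names, the `"read"`/`"write"` values are illustrative placeholders for the `{}` fields in the error message, and treating a read query under write access as acceptable is an assumption.

```python
class AccessorMisalignmentError(Exception):
    """Hypothetical stand-in for the server-side misalignment error."""


def check_alignment(accessor_type: str, query_type: str) -> None:
    """Raise if a write query is run under an accessor taken for reading."""
    if accessor_type == "read" and query_type == "write":
        raise AccessorMisalignmentError(
            f"Accessor type {accessor_type} and query type {query_type} "
            "are misaligned!"
        )


if __name__ == "__main__":
    check_alignment("write", "read")  # assumed to be fine
    try:
        check_alignment("read", "write")
    except AccessorMisalignmentError as error:
        print(error)
```

In practice, the fix is to request write access when the transaction is created rather than to catch the error afterwards.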

## Transaction timeout

@@ -119,15 +141,17 @@ Here are the [instructions](/configuration/configuration-settings#using-flags-an

Here are the storage access error messages you might encounter:

1. **Cannot access storage, unique access query is running. Try again later.**
1. **Cannot get shared access storage. Try stopping other queries that are running in parallel.**
2. **Cannot get unique access to the storage. Try stopping other queries that are running in parallel.**
3. **Cannot get read only access to the storage. Try stopping other queries that are running in parallel.**

### Understanding storage access timeout

Storage access timeouts occur during query preparation when the query execution engine cannot get the required type of access to the storage. There are two types of storage access:
Storage access timeouts occur during query preparation when the query execution engine cannot get the required type of access to the storage. There are three types of storage access:

- **Shared access**: Multiple queries can have shared access at the same time, but shared access cannot be granted while a query with unique access is running.
- **Unique access**: Only one query can have unique access at a time, and no other query can have any type of access during that period.
- **Shared access**: Multiple queries can have shared access at the same time. These queries are marked with a read or write type, allowing Memgraph to efficiently execute multiple operations in parallel without conflicts.
- **Unique access**: Only one query can have unique access at a time, and no other access type can be granted during that period.
- **Read-only access**: Queries with read-only access allow other read queries to run in parallel but forbid any write operations or unique access queries.

These timeouts prevent worker starvation and database blocking that could occur if queries were to wait indefinitely for storage access.
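
Because a storage access timeout is a transient condition, a client may choose to retry the query after a short pause. The helper below is a minimal sketch of that pattern under stated assumptions: `StorageAccessTimeout` is a hypothetical exception (how the error actually surfaces depends on the driver), and `run_with_retry`, its parameters and the linear back-off are illustrative choices.

```python
import time


class StorageAccessTimeout(Exception):
    """Hypothetical exception standing in for a storage access timeout."""


def run_with_retry(operation, attempts=3, delay_s=0.1):
    """Call operation(); retry on StorageAccessTimeout up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except StorageAccessTimeout:
            if attempt == attempts:
                raise  # give up and surface the last timeout
            time.sleep(delay_s * attempt)  # linear back-off between attempts
```

A caller would wrap the function that runs the query, for example `run_with_retry(lambda: run_index_query(client))`, where `run_index_query` is whatever application function issues the unique access query.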
