feat: Implementation of run partition query #1080

ankiaga · 2024-01-18T09:02:45Z

No description provided.

google/cloud/spanner_dbapi/client_side_statement_parser.py

google/cloud/spanner_dbapi/connection.py

google/cloud/spanner_dbapi/cursor.py

google/cloud/spanner_v1/merge_result_set.py

tests/system/test_dbapi.py

tests/system/test_session_api.py

google/cloud/spanner_v1/database.py

google/cloud/spanner_v1/merge_result_set.py

olavloite · 2024-01-19T11:22:05Z

google/cloud/spanner_v1/merge_result_set.py

+
+QUEUE_SIZE_PER_WORKER = 32
+MAX_PARALLELISM = 100
+METADATA_LOCK = Lock()


This is now a global lock, right? Would it be possible to make it an instance variable for the MergedResultSet class?

Is there a reason to make it an instance variable of MergedResultSet? The lock is not used by the MergedResultSet class itself, only by the static _set_metadata() method.

This means that if you have multiple MergedResultSets open, they will block each other, which is unnecessary. A more correct design would be to make the lock an instance variable and _set_metadata an instance method. The goal of this lock is to prevent multiple threads from setting or reading the metadata field of a specific MergedResultSet at the same time, not to prevent different MergedResultSets from setting their respective metadata fields.

I agree that it is a bit theoretical, as it is unlikely that a user will have a large number of MergedResultSets open at the same time, but putting the lock where it is actually needed will make the code easier to read and understand. As written, the code suggests that it would be a problem if two different MergedResultSets tried to set their metadata at the same time.

SG, Changed
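The per-instance lock suggested in this thread could look roughly like the sketch below. It is not the code that was merged: MergedResultSetSketch and the _metadata_lock attribute are hypothetical names used only for illustration, and the metadata values are plain strings.

```python
from threading import Lock, Thread


class MergedResultSetSketch:
    """Hypothetical stand-in for MergedResultSet, not the library class."""

    def __init__(self):
        self._metadata = None
        # One lock per result set, so separate result sets never contend.
        self._metadata_lock = Lock()

    def _set_metadata(self, metadata):
        # Instance method: only threads working on this result set
        # compete for this lock. The first writer wins.
        with self._metadata_lock:
            if self._metadata is None:
                self._metadata = metadata


first = MergedResultSetSketch()
second = MergedResultSetSketch()

# Several workers race to set metadata on `first`; one worker sets `second`.
workers = [
    Thread(target=first._set_metadata, args=(f"first-{i}",)) for i in range(4)
]
workers.append(Thread(target=second._set_metadata, args=("second",)))
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

With this layout the lock lives next to the field it guards, so a reader can see that it protects one result set's metadata and nothing else.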

olavloite

Looks generally good to me, with a couple of nits on the setting and locking for metadata.

olavloite · 2024-01-24T07:25:45Z

google/cloud/spanner_v1/merge_result_set.py

+            if merged_result_set._metadata is None:
+                _set_metadata(merged_result_set, results)
+        except Exception as ex:
+            self._queue.put(PartitionExecutorResult(exception=ex))


We should also call _set_metadata here (if it has not already been set) to prevent the metadata property from blocking indefinitely if someone calls it after an error has occurred. If, for example, the query fails for all partitions, the user will get a MergedResultSet that returns an error whenever they try to iterate over the rows, but that hangs forever if they try to read metadata. The latter is always very hard to debug, so we should, whenever possible, return an error instead of blocking when something goes wrong.
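A minimal sketch of the behaviour requested here, assuming the metadata property waits on a threading.Event until a worker has reported. The class, the _error field, and the run_partition helper are hypothetical and only model the idea. How the real MergedResultSet records a failure is not shown in this thread.

```python
from threading import Event, Lock, Thread


class MergedResultSetSketch:
    """Hypothetical model of a result set whose metadata arrives later."""

    def __init__(self):
        self._metadata = None
        self._error = None
        self._lock = Lock()
        self._ready = Event()

    def _set_metadata(self, metadata=None, error=None):
        # The first caller wins, whether it reports metadata or an error.
        with self._lock:
            if not self._ready.is_set():
                self._metadata = metadata
                self._error = error
                self._ready.set()

    @property
    def metadata(self):
        # Blocks only until some worker has reported success or failure.
        self._ready.wait()
        if self._error is not None:
            raise self._error
        return self._metadata


def run_partition(result_set, execute):
    """Hypothetical worker body: always unblocks metadata readers."""
    try:
        results = execute()
        result_set._set_metadata(metadata=results["metadata"])
    except Exception as ex:
        # Record the failure so metadata raises instead of hanging.
        result_set._set_metadata(error=ex)


def failing_partition():
    raise RuntimeError("partition failed")


failed = MergedResultSetSketch()
worker = Thread(target=run_partition, args=(failed, failing_partition))
worker.start()
worker.join()

try:
    failed.metadata
    outcome = "returned"
except RuntimeError as ex:
    outcome = f"raised: {ex}"
```

Because the except branch also signals the event, a caller reading metadata after every partition has failed gets the original exception rather than waiting forever.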

feat: Implementation of run partition query

7902089

ankiaga requested review from a team as code owners January 18, 2024 09:02

product-auto-label bot added the "api: spanner" and "size: l" labels Jan 18, 2024

ankiaga requested review from aseering, olavloite and manu2 January 18, 2024 09:03

olavloite reviewed Jan 18, 2024

Comments incorporated

344e496

olavloite reviewed Jan 19, 2024

Comments incorporated

d9c4ac0

olavloite approved these changes Jan 24, 2024

ankiaga and others added 2 commits January 24, 2024 14:42

Comments incorporated

fe674e8

Merge branch 'main' into partition_run

2cace17

ankiaga merged commit f3b23b2 into googleapis:main Jan 24, 2024

release-please bot mentioned this pull request Jan 24, 2024

chore(main): release 3.42.0 #1079

Merged
