-
Notifications
You must be signed in to change notification settings - Fork 98
fix: improve client-side regex statement parser #1328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The client-side regex-based statement parser contained multiple minor errors, like: - BEGIN would match any string as BEGIN TRANSACTION (including stuff like `BEGIN foo`) - COMMIT and ROLLBACK had the same problem as BEGIN. - Mismatches were reported as UPDATE. They are now returned as UNKNOWN. - DLL missed the ANALYZE keyword
if ( | ||
parsed_statement.statement_type != StatementType.UPDATE | ||
and parsed_statement.statement_type != StatementType.INSERT | ||
and parsed_statement.statement_type != StatementType.UNKNOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This provides a fallback if the statement is not recognized by the client-side parser. Instead of failing directly, we allow it to be sent to Spanner and let the Spanner parser determine whether it is valid or not. This prevents potential new DML syntax in the future from being rejected here (e.g. if we were to support merge
statements).
RE_BEGIN = re.compile(r"^\s*(BEGIN|START)(TRANSACTION)?", re.IGNORECASE) | ||
RE_COMMIT = re.compile(r"^\s*(COMMIT)(TRANSACTION)?", re.IGNORECASE) | ||
RE_ROLLBACK = re.compile(r"^\s*(ROLLBACK)(TRANSACTION)?", re.IGNORECASE) | ||
RE_BEGIN = re.compile(r"^\s*(BEGIN|START)(\s+TRANSACTION)?\s*$", re.IGNORECASE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lack of $
at the end of these regular expressions meant that statements like begin whatever string follows does not matter
to be accepted as valid statements.
@@ -702,10 +698,6 @@ def set_autocommit_dml_mode( | |||
self._autocommit_dml_mode = autocommit_dml_mode | |||
|
|||
def _partitioned_query_validation(self, partitioned_query, statement): | |||
if _get_statement_type(Statement(partitioned_query)) is not StatementType.QUERY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skip validation here and let Spanner return an error if the statement is not valid for PartitionQuery
.
RE_NON_UPDATE = re.compile(r"^\W*(SELECT|GRAPH|FROM)", re.IGNORECASE) | ||
|
||
RE_WITH = re.compile(r"^\s*(WITH)", re.IGNORECASE) | ||
|
||
# DDL statements follow | ||
# https://cloud.google.com/spanner/docs/data-definition-language | ||
RE_DDL = re.compile( | ||
r"^\s*(CREATE|ALTER|DROP|GRANT|REVOKE|RENAME)", re.IGNORECASE | re.DOTALL | ||
r"^\s*(CREATE|ALTER|DROP|GRANT|REVOKE|RENAME|ANALYZE)", re.IGNORECASE | re.DOTALL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ANALYZE
is a valid DDL statement in Spanner.
re.IGNORECASE | re.DOTALL, | ||
) | ||
"""Deprecated: Use the RE_IS_INSERT, RE_IS_UPDATE, and RE_IS_DELETE regexes""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a new deprecation, these regular expressions were already 'deprecated' in the sense that:
- They were no longer in use by the library
- They are not intended for public use
The client-side regex-based statement parser contained multiple minor errors, like:
BEGIN foo
)