Skip to content

Support quoted column identifiers for scan row_filter string argument #1863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 4, 2025

Conversation

norton120
Copy link
Contributor

Rationale for this change

Our data lake uses old-school Kimball style quoted column names ("User ID", "Customer Name" etc). The string parser for row_filter was unable to parse this. Now it is.

example:

# before 
>> parser.parse(' "User Name" = 'ted')
ParseException: Expected '"', found ' '

# after
>> parser.parse(' "User Name" = 'ted')
EqualTo("User Name", "ted")

# Are these changes tested?
Yes a new test was added. 

Note

The quoted_column_with_dots previously errored with "Expected '"', found '.'" when using double quotes only. It now raises error text expecting an 'or' value; I didn't toil over finding where the exception is clobbered, because the error message between single and double quote exceptions is inconsistent and I didn't really consider this a polished/first-class error message. If this change is an issue, I can dig further to try and revert the wording change; IMO raising the same exception type is more than reasonable to consider the change non-breaking.

Are there any user-facing changes?

Yes quoted identifiers are now supported

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @norton120 for adding this. This PR makes sense to me. Sorry for the late reply, I wanted to double check if double quotes are the standard, and it looks like that's the case.

@Fokko Fokko merged commit 8adf246 into apache:main Apr 4, 2025
7 checks passed
@norton120 norton120 deleted the allow-quoted-identifiers branch April 15, 2025 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants