Limiting SQL query to defined fields/columns #134
I think right now you have to inspect the query AST (or something similar) and build SQLAlchemy loader options from it. See graphql-python/graphene#348 for ideas about how to implement this. Of course this is a simplification, since it only works for the first level of the query. You'd actually have to also look at relationships, recursively, and apply the appropriate loader options, but even that still isn't right: it only takes care of the first level of relationships. See http://docs.sqlalchemy.org/en/latest/orm/loading_relationships.html
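On the SQLAlchemy side, the mechanism being referred to is query options such as `load_only`. A minimal, self-contained sketch of what restricting the fetched columns looks like (the `User` model and its columns here are illustrative stand-ins, not part of this project):

```python
# Sketch: restricting the columns SQLAlchemy fetches with load_only.
# The User model below is a stand-in, not from graphene-sqlalchemy.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, load_only, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name_first = Column(String)
    name_last = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(User(name_first="Ada", name_last="Lovelace"))
session.commit()

# Only the primary key and name_first appear in the emitted SELECT;
# name_last is deferred instead of being fetched with the row.
query = session.query(User).options(load_only(User.name_first))
print(str(query))
```

The printed SQL contains `users.name_first` but no `users.name_last`, which is exactly the behavior the rest of this thread tries to drive from the GraphQL AST.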
Thanks a lot @DrPyser, that was a very helpful response! I will give it a go now and update the issue.
Hey @DrPyser. I tried the implementation you described. For example, the return of the aforementioned implementation against this query:
will return the field names as they appear in the GraphQL query. Is there a more concrete way of performing this mapping? Graphene/GrapheneSQLAlchemy does it somehow when mapping ORM fields to GraphQL fields, so I was hoping to leverage that somehow.
Yup, look in
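The converter being pointed at boils down to a camelCase-to-snake_case mapping for auto-generated field names. A minimal stdlib-only re-implementation of that helper, just for illustration (graphene ships its own version in `graphene.utils.str_converters`):

```python
import re

def to_snake_case(name: str) -> str:
    # Mirrors the common camelCase -> snake_case conversion graphene applies
    # when turning model columns like name_first into fields like nameFirst.
    s1 = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s1).lower()

print(to_snake_case("nameFirst"))     # back to the underlying column name
print(to_snake_case("titleAuthors"))  # relationship names map the same way
```

Note this only covers auto-generated names; fields renamed explicitly on the `SQLAlchemyObjectType` would still need a lookup against the type's own field mapping.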
That's beautiful, thanks a lot @DrPyser.
Should anyone land here wondering about implementation: I got around to implementing my own version of the extraction function, which parses the query AST into a nested dictionary of the requested fields. Running it against a simple query:
produces
while a more complex query like this:
produces:
Now these dictionaries can be used to define what is a field on the primary table and what is a relationship. A simplistic approach to auto-applying those fields can take the form of the following function:
Is this still the only way to do it, even with the new version out?
@aonamrata I haven't seen any new approach being available. Which new version are you referring to?
Never mind, it does not have it. I saw a couple of PRs to add it and was assuming it would be in the next version, after 2.0.
I know this is closed, but should the code snippet @somada141 provided be merged into this library? I believe this is the missing piece that would make using SQLAlchemy + graphene + graphene-sqlalchemy a truly beautiful thing.
Yeah, this is a fairly complex problem and solution; you don't want everyone to implement their own version. This should also be a basic concern of a library like this one. It might help if the graphene or graphql-core libraries provided more utilities to manipulate the AST.
Yeah, I agree that something like that should be merged into this library at some point. Ideally we should write some sort of design document before we delve into the actual implementation, because there are many considerations and edge cases to take into account. For example, I think the N+1 issue is somewhat orthogonal to this problem and could be solved with different approaches. I'm currently working on improving the out-of-the-box performance of the library.
Hi, just a heads up that I sent a PR to address the N+1 problem. I ended up following a different approach than @somada141 because one of my goals was that this optimization should work well with schemas that include non-
Hi, I didn't find a full answer, so I combined the answers above into a solution for the "N+1" problem, but now a "cartesian product" appears:
from typing import Dict, List

import graphene
import graphene_sqlalchemy
import graphql
from graphene import relay
from graphene.utils.str_converters import to_snake_case
from graphql.language.ast import Field, FragmentSpread
from sqlalchemy.orm import joinedload, load_only


def extract_requested_fields(
info: graphql.execution.base.ResolveInfo,
fields: List[Field],
do_convert_to_snake_case: bool = True,
) -> Dict:
result = {}
for field in fields:
key = field.name.value
if do_convert_to_snake_case:
key = to_snake_case(name=key)
if key == "id":
continue
val = None
if isinstance(field, Field):
if (
hasattr(field, "selection_set") and
field.selection_set is not None
):
val = extract_requested_fields(
info=info,
fields=field.selection_set.selections,
do_convert_to_snake_case=do_convert_to_snake_case
)
if val and 'edges' in val:
val = val['edges']
if val and 'node' in val:
val = val['node']
result[key] = val
elif isinstance(field, FragmentSpread):
fragment = info.fragments[field.name.value]
val = extract_requested_fields(
info=info,
fields=fragment.selection_set.selections,
do_convert_to_snake_case=do_convert_to_snake_case
)
result = val
    return result


def makeLoadOnlyOptions(fieldsDict, model):
if not fieldsDict:
return []
options = []
modelFields = []
relationshipFields = []
for field in list(fieldsDict):
if fieldsDict[field] is None:
modelFields.append(field)
else:
relationshipFields.append(field)
if modelFields:
options.append(load_only(*modelFields))
for relationshipField in relationshipFields:
relationshipOptions = makeLoadOnlyOptions(fieldsDict[relationshipField],
getattr(model, relationshipField).property.mapper.class_)
options.append(joinedload(getattr(model, relationshipField)).options(*relationshipOptions))
    return options


class LoadOnlyConnectionField(graphene_sqlalchemy.SQLAlchemyConnectionField):
def __init__(self, connection, *args, **kwargs):
super().__init__(connection, *args, **kwargs)
@classmethod
def get_query(cls, model, info: 'ResolveInfo', sort=None, **args):
query = super().get_query(model, info, sort, **args)
queryName = info.field_name
fieldsDict = extract_requested_fields(info, info.field_asts, False)
fieldsDict = fieldsDict[queryName]
options = makeLoadOnlyOptions(fieldsDict, model)
query = query.options(*options)
# print(query)
        return query


class Query(graphene.ObjectType):
node = relay.Node.Field()
    searchTitlesRelay = LoadOnlyConnectionField(connection=Title, sort=Title.sort_argument())

{
searchTitlesRelay{
edges {
cursor
node {
uuid
code
titleAuthors {
edges {
node {
uuid
user {
uuid
name
}
}
}
}
chapters {
edges {
node {
uuid
}
}
}
}
}
}
}

SELECT anon_1.title_id AS anon_1_title_id, anon_1.title_code AS anon_1_title_code,
title_author_1.id AS title_author_1_id,
users_1.id AS users_1_id, users_1.name AS users_1_name,
chapter_1.id AS chapter_1_id
FROM (
SELECT title.id AS title_id, title.code AS title_code
FROM title ORDER BY title.id ASC
LIMIT 2
) AS anon_1
LEFT OUTER JOIN title_author AS title_author_1 ON anon_1.title_id = title_author_1.title_id
LEFT OUTER JOIN users AS users_1 ON users_1.id = title_author_1.user_id
LEFT OUTER JOIN chapter AS chapter_1 ON anon_1.title_id = chapter_1.title_id
ORDER BY anon_1.title_id ASC
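The cartesian product in the SQL above comes from eager-loading two collections (titleAuthors and chapters) with `joinedload` in a single SELECT: the joined rows multiply (authors × chapters per title). A hedged sketch of one way around it, using `selectinload`, which issues one extra SELECT per collection instead of joining; the `Title`/`TitleAuthor`/`Chapter` models here are simplified stand-ins for the thread's schema:

```python
# Sketch: avoiding the joined-eager-load cartesian product with selectinload.
# These models are simplified stand-ins, not the thread's actual schema.
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (declarative_base, relationship, selectinload,
                            sessionmaker)

Base = declarative_base()

class Title(Base):
    __tablename__ = "title"
    id = Column(Integer, primary_key=True)
    code = Column(String)
    authors = relationship("TitleAuthor")
    chapters = relationship("Chapter")

class TitleAuthor(Base):
    __tablename__ = "title_author"
    id = Column(Integer, primary_key=True)
    title_id = Column(Integer, ForeignKey("title.id"))

class Chapter(Base):
    __tablename__ = "chapter"
    id = Column(Integer, primary_key=True)
    title_id = Column(Integer, ForeignKey("title.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
title = Title(code="T1",
              authors=[TitleAuthor(), TitleAuthor()],
              chapters=[Chapter(), Chapter(), Chapter()])
session.add(title)
session.commit()
session.expunge_all()

# Each collection is loaded by its own SELECT ... WHERE title_id IN (...),
# so the main query's rows are not multiplied 2 x 3 as with two joinedloads.
titles = session.query(Title).options(
    selectinload(Title.authors), selectinload(Title.chapters)
).all()
```

`joinedload` remains a reasonable choice for many-to-one relationships; the row multiplication only bites when several collections are joined eagerly in the same statement.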
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related topics referencing this issue.
Consider the following SQLAlchemy ORM class:
Simply wrapped in an SQLAlchemyObjectType and exposed through a query field.
A GraphQL query such as:
will cause the following raw SQL to be emitted (taken from the echo logs of the SQLA engine):
As one can see, we may only want the nameFirst field, i.e., the name_first column, but the entire row is fetched. Of course the GraphQL response only contains the requested fields, but we have still fetched the entire row, which becomes a major issue when dealing with wide tables.
Is there a way to automagically communicate which columns are needed to SQLAlchemy, so as to preclude this form of over-fetching?