Adding select_related and prefetch_related optimizations #220
Conversation
Force-pushed from 0defa3a to 629abd0
I added a few high-level tests to assert query counts. All other tests are passing. If you like this improvement in general, I will whip up some docs. Also let me know if there is anything else that you would like tests written for.
Hi! This looks like a really good start; I personally would be keen to see something like this merged. A couple of notes:

- Personally, I tend to prefer the prefetch in most cases. It should certainly be possible to override this behaviour on certain FK links, as the joins can become quite expensive. My version of this problem ignored select related completely.
- I've found it necessary to be able to apply more complex optimisations than just a simple … My project is starting to grow some microservices, and I recently had cause to add a …
- I'm concerned about your field extraction: I have a suspicion that it won't handle fragments correctly. For users of Relay, almost every request will include an extensive number of fragments, so extracting the actual requested fields from these is important. I know the documentation now recommends a …
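For readers less familiar with the trade-off being discussed, here is a minimal Django sketch of the two strategies; the `Reporter`/`Article` models are hypothetical and not part of this PR:

```python
from django.db import models


class Reporter(models.Model):
    name = models.CharField(max_length=100)


class Article(models.Model):
    reporter = models.ForeignKey(Reporter, on_delete=models.CASCADE,
                                 related_name='articles')
    headline = models.CharField(max_length=200)


# select_related: a single query with a SQL JOIN. Only possible for FK /
# one-to-one relations, and the join can become expensive, which is why
# being able to override the behaviour per FK link matters.
articles = Article.objects.select_related('reporter')

# prefetch_related: a second query using IN (...). Works for *-to-many
# relations too, and avoids the join entirely.
reporters = Reporter.objects.prefetch_related('articles')
```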
Thanks for the input! I will definitely take a look at how fragments show up in the AST. That is certainly something I did not account for.

It may not be immediately apparent at first glance, but the optimization does follow child dependencies and select/prefetch as much as can be inferred from the Model metas. So a query like:

```graphql
query {
  topObject {
    firstRelations {
      edges {
        node {
          id
          nestedOneToOne {
            id
            attribute
          }
        }
      }
    }
    firstOneToOne {
      id
      otherAttr
      anotherOneToOne {
        id
      }
    }
  }
}
```

will result in …

I will admit to not knowing much about the underlying … What is your reason for avoiding …?

I played with the data-loader a little. It could potentially be useful for some things, I'm sure. My goal with this PR is to make simple, inferable optimizations based on matches between the graphql AST and Model meta information.
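The generated calls are cut off above, but under the inference rules described in this comment a query shaped like that would presumably translate into something along these lines. This is only an illustration: `TopObject` and the field names are taken from the GraphQL query, assuming graphene's usual camelCase-to-snake_case mapping back to the model fields.

```python
# Illustrative only -- not the PR's actual output.
qs = TopObject.objects.all()

# The *-to-many relation and its nested one-to-one become a prefetch that
# follows the child selection:
qs = qs.prefetch_related('first_relations__nested_one_to_one')

# The directly selected one-to-one chain becomes a single JOIN:
qs = qs.select_related('first_one_to_one__another_one_to_one')
```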
My main concern is that overactive inference here, without the ability to customise it further, could clobber existing performance optimisations being applied by users.
Awesome. Those are all fantastic points. I combed over your gist and I'm pretty sure I understand how it's working now. It took me a moment to figure out how you were handling nested prefetches, but I see now that … Would it be okay if I incorporate parts of that gist into this PR to achieve the desired API? I'm currently aware of the following changes that need to be made.
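For context on the nested-prefetch handling referred to here, Django's `Prefetch` object is the usual building block. A minimal sketch, reusing the hypothetical `Reporter`/`Article` models from the earlier example:

```python
from django.db.models import Prefetch

# A nested prefetch: one extra query for the reporters' articles, with that
# inner queryset carrying its own select_related to go one relation deeper.
# (Selecting 'reporter' back is only to show the shape of a nested prefetch.)
reporters = Reporter.objects.prefetch_related(
    Prefetch('articles', queryset=Article.objects.select_related('reporter'))
)
```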
Go for it, that code is absolutely fine to use. (Do note I'm not a contributor to this repo, so we'll have to see whether @syrusakbary likes the ideas.)
Force-pushed from bdc7189 to f93251b
This is awesome! We have a home-grown graphene-like piece of code that I was looking to replace with Graphene, but I was concerned about N+1 queries. Would this be compatible with 1.7?
@MichaelrMentele I'm guessing you mean Django 1.7? I'm not 100% positive, as the Travis build for graphene-django now only runs for 1.8+.
What's the status on this? Fixing the N+1 query problem would be a huge improvement.
@spockNinja, @mjtamlyn take a look at this approach. And here is an example of a schema: https://github.com/cinemanio/backend/blob/master/cinemanio/api/schema/movie.py#L33-L39.
Is there an example somewhere of how to implement these optimisations manually?
@ramusus Love your implementation. Would be great if it also allowed for prefetch filters using Django's …
I have some hacks that I've done to my schema to do preloading using a …
@spockNinja Are you still working on this PR, or would you prefer if somebody else took over?
```python
model_fields = model_fields_as_dict(model)
selections = find_model_selections(graphql_ast)

graphene_obj_type = REGISTRY.get_type_for_model(model)
```
What's the behaviour of this when there are multiple types mapped to the same model? As far as I remember that case isn't prohibited.
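To make the concern concrete, nothing prevents defining two `DjangoObjectType`s over the same model, in which case a registry lookup keyed on the model alone has to pick one of them. A hypothetical sketch (the model import and type names are not from this PR):

```python
from graphene_django import DjangoObjectType

from myapp.models import Article  # hypothetical model


class PublicArticleType(DjangoObjectType):
    class Meta:
        model = Article


class StaffArticleType(DjangoObjectType):
    class Meta:
        model = Article


# Both types register against the same model, so REGISTRY.get_type_for_model(Article)
# can only return one of them -- presumably whichever registered last.
```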
```python
REGISTRY = get_global_registry()
SELECT = 'select'
PREFETCH = 'prefetch'
```
Could make this an Enum instead.
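A minimal sketch of what that suggestion might look like; the enum name and the `apply` helper are illustrative, only the `'select'`/`'prefetch'` values come from the diff above:

```python
from enum import Enum


class RelatedFetch(Enum):
    # Enum replacement for the SELECT / PREFETCH string constants.
    SELECT = 'select'
    PREFETCH = 'prefetch'


# Hypothetical usage -- comparisons read the same, but typos fail loudly:
def apply(qs, fetch, field_name):
    if fetch is RelatedFetch.SELECT:
        return qs.select_related(field_name)
    return qs.prefetch_related(field_name)
```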
```diff
@@ -75,6 +76,7 @@ def connection_resolver(cls, resolver, connection, default_manager, max_limit,
         data=filter_kwargs,
         queryset=default_manager.get_queryset()
     ).qs
+    qs = optimize_queryset(qs, info)
```
Note that currently these optimisations will be lost when line 66 in this file runs. That intersects the filtered queryset here with the resolved one, and loses the optimisations in the process.

We've "solved" this by instead resolving in this method, then passing the resolved queryset to the filterset (defaulting to `default_manager.get_queryset()` if the resolver doesn't give us a value), and then dropping the merge behaviour above. This may or may not work in the general case, I really don't know, but it's working for us (we've implemented a similar function to your `optimize_queryset`).
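A rough sketch of the alternative flow described above. This is not the actual graphene-django code: the signature, `filterset_class`, and `filter_kwargs` are simplified stand-ins, and `optimize_queryset` is the entry point this PR introduces.

```python
def connection_resolver(cls, resolver, default_manager, filterset_class,
                        filter_kwargs, root, info, **args):
    # Resolve first; fall back to the default manager if the resolver
    # doesn't give us a queryset.
    resolved = resolver(root, info, **args)
    qs = resolved if resolved is not None else default_manager.get_queryset()

    # Filter the *resolved* queryset instead of merging two querysets later,
    # so any select_related/prefetch_related hints it carries survive.
    qs = filterset_class(data=filter_kwargs, queryset=qs).qs

    # Finally apply the inferred optimizations.
    return optimize_queryset(qs, info)
```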
```python
if iterable is None:
    iterable = default_manager
iterable = maybe_queryset(iterable)
if isinstance(iterable, QuerySet):
    if iterable is not default_manager:
        default_queryset = maybe_queryset(default_manager)
        iterable = cls.merge_querysets(default_queryset, iterable)
    iterable = optimize_queryset(iterable, info)
```
Interestingly, this optimisation does work because it's come after the call to `merge_querysets`!
@spockNinja Still interested in this pull request?
I'd like to tidy up some of the PRs, so if we still want to work on this, let me know or I will close this in 1 week.
@spockNinja @phalt I'd like to raise this PR from the dead. Did you end up introducing a similar mechanism eventually? We are looking for the most elegant way to reduce the number of SQL queries our Graphene-Django instance is sending to the DB at the moment.
Using a combination of Model meta and the graphql AST, we are able to use django's `queryset.select_related()` and `queryset.prefetch_related()` to reduce the number of queries run during a graphql query resolution. These optimizations help a lot with the N+1 problem faced when using an ORM.

I have written the primary entry point (`optimizations.optimize_queryset`) such that it can be imported and used in any graphql resolution function (see `resolve_article` in the tests). I then used it in several core places, including `DjangoConnectionField`, `DjangoListField`, and `DjangoObjectType`.

There will always be cases when the graphql representation does not match the model meta, either an alias or perhaps a complicated model `@property`. In those cases, you can specify manual optimizations in the `DjangoObjectType` Meta. Those optimizations look like: …

Closes #57
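The Meta example referred to above is not reproduced here, but the "use it in any graphql resolution function" part might look roughly like the following. The import path, model, and resolver names are illustrative assumptions; only `optimizations.optimize_queryset(queryset, info)` is the entry point named in the description, and the resolver signature follows modern graphene style rather than the PR-era API.

```python
import graphene
from graphene_django import DjangoObjectType

# Import path assumed from the description's "optimizations.optimize_queryset":
from graphene_django.optimizations import optimize_queryset

from myapp.models import Article  # hypothetical model


class ArticleType(DjangoObjectType):
    class Meta:
        model = Article


class Query(graphene.ObjectType):
    articles = graphene.List(ArticleType)

    def resolve_articles(root, info):
        # optimize_queryset inspects the GraphQL AST in `info` and attaches the
        # select_related/prefetch_related calls it can infer from the model meta.
        return optimize_queryset(Article.objects.all(), info)
```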