Skip to content

Introduce planning phase to query execution #304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 97 commits into from

Conversation

JeffRMoore
Copy link
Contributor

This PR is intended to address issue #26 and improve developer experience around writing resolving functions.

When writing resolving functions, it can be useful to have foreknowledge of how the execution
engine will evaluate the query. This might allow the resolve function to request only the necessary
fields or to optimistically apply joins.

Currently, applying these optimizations means processing the query AST in a way parallel to
how the execution engine might do so. One must understand aliasing, fragments and directives
and the structure of the AST.

This PR exposes query evaluation foreknowledge by splitting execution into two phases,
a planning phase and an evaluation phase.

In the planning phase the AST in analyzed and a heirarchical plan structure is created indicating
how the executor will evaluate the query. The structure of this plan mirrors the structure of the schema,
not the structure of the query.

This planning information takes the place of GraphQLResolvingInfo in the calls
to resolving functions on the schema.

Pre-calculating this information serves two purposes:

  1. Provides a schema oriented interface to resolving functions predicting what will occur later.
  2. Avoids some re-calculation during query execution when evaluating list results

Examples

A set of examples are available here for various resolver function cases:

https://github.com/JeffRMoore/graphql-optimization-examples

Simple Example

from https://github.com/JeffRMoore/graphql-optimization-examples/blob/master/ex1.js

For this query on a user object

{
  hombre:user(id: "1") {
  id
    ...NameFrag
  }
}
fragment NameFrag on User {
  nombre:name
}

Here is part of a resolver function on a user field of the root query

function (source, args, info) {
  var fieldNames = Object.keys(info.returned.fields);

  console.log('WILL RESOLVE', info.fieldName, 'on', info.parentType.name);
  console.log( '    with fields', fieldNames);

  ...
}

Which produces the following output when executed

WILL RESOLVE user on Query
    with fields [ 'id', 'name' ]
RESULT:
{ hombre: { id: '1', nombre: 'Dan' } }

This shows how a resolve function can use the plan structure to perform a selection
of only the fields that will be accessed later.

Complex Example

from https://github.com/JeffRMoore/graphql-optimization-examples/blob/master/ex6.js

For this query on a user object with a nested location object

  {
    user(id: "1") {
      where: location {
        city
      }
    }
  }

Here is part of a resolver function on a user field of the root query

resolve: (source, args, info) => {
  const userFields = info.returned.fieldPlans;
  const userFieldNames = Object.keys(userFields);

  console.log('WILL RESOLVE',
    info.fieldName, 'on', info.parentType.name);
  console.log( '    with fields', userFieldNames);

  if (userFields.location) {
    userFields.location.forEach(fieldPlan => {
      const locationFields = fieldPlan.returned.fieldPlans;
      const locationFieldNames = Object.keys(locationFields);

      console.log('WILL RESOLVE',
        fieldPlan.fieldName, 'on', fieldPlan.parentType.name);
      console.log( '    with fields', locationFieldNames);
    });
  }

  ...
}

Which produces the following output when executed

WILL RESOLVE user on Query
    with fields [ 'location' ]
WILL RESOLVE location on User
    with fields [ 'city' ]
RESULT:
{ user: { where: { city: 'London' } } }

Here, we are looking ahead into the location field of user to see what fields
of location will be queried.

We have to use forEach to do this because any given field could be resolved
multiple times by the execution engine. A field can be resolved multiple times
with different arguments or multiple times with the same arguments. The execution
engine cannot make presumptions about the return value being the same or different.
(The resolver could literally be return random numbers.)

This resolver could collect the nested field set and pass a query to a backend
document store. Or it could use the location information to add a join clause to
master query for user.

The Planning Data Structures

GraphQLResolvingPlan

The GraphQLResolvingPlan structure describes the process of calling the resolve function to
resolve the value of a field on a GraphQLObjectType parent.

The GraphQLResolvingPlan data structure will be passed to the resolve function as the
third parameter, info.

type GraphQLFieldResolveFn = (
  source: mixed,
  args: {[argName: string]: mixed},
  info: GraphQLResolvingPlan
) => mixed

The first accessible plan data structure during query execution will be a GraphQLResolvingPlan
when resolve is called on the fields of the root GraphQLObjectType.

(There is a GraphQLOperationPlan that describes the operation, but it is not accessible
to functions attached to the schema.)

GraphQLResolvingPlan contains the following fields as well as the
"All Fields" and "GraphQLResolveInfo" fields described later.

Field Type Description
kind string always 'resolve' for a GraphQLResolvingPlan
fieldDefinition GraphQLFieldDefinition The definition of the field to be resolved
args { [key: string]: mixed } The arguments that will be passed to resolve
returned GraphQLCompletionPlan The plan which will be evaluated over the return value from resolve

the returned field describes the next step the execution engine will take depending
on the type of the field on GraphQLObjectType.

type GraphQLCompletionPlan =
  GraphQLSerializationPlan |
  GraphQLMappingPlan |
  GraphQLSelectionPlan |
  GraphQLCoercionPlan;
Plan Type Description
GraphQLSerializationPlan for GraphQLScalarType and GraphQLEnumType fields the return value will be serialized.
GraphQLMappingPlan for GraphQLListType the return value will be mapped, elements of the list will be further processed.
GraphQLSelectionPlan A GraphQLObjectType fields will have its fields selected.
GraphQLCoercionPlan for GraphQLUnionType or GraphQLInterfaceType fields, the return value will be coerced to the proper run time type

GraphQLSerializationPlan

GraphQLSerializationPlan describes the process of calling the serialize function
on a FieldDefinition. This is the leaf node of the planning tree.

GraphQLSerializationPlan contains the following fields as well as the
"All Fields" fields described later.

Field Type Description
kind string always 'serialize' for a GraphQLSerializationPlan

GraphQLMappingPlan

GraphQLMappingPlan describes the processing of iterating over the elements of a
GraphQLListType and completing each element value. This is an internal node in the
planning tree and is not passed directly to functions registered with the schema, but
instead is a child plan depending on the structure of the schema.

GraphQLMappingPlan contains the following fields as well as the
"All Fields" fields described later.

Field Type Description
kind string always 'map' for a GraphQLMappingPlan
listElement GraphQLCompletionPlan The plan which will be evaluated for each element in the list

GraphQLSelectionPlan

A GraphQLSelectionPlan indicates which fields will be selected from a GraphQLObjectType
as part of the return value completion process.

If the GraphQLObjectType has defined an isTypeOf function, this function will be called
before the selection operation is applied to verify that the type of the runtime value
matches the expected type. If no isTypeOf is defined, the value is presumed to be of
that type and evaluation proceeds. isTypeOf may also be called during the coercion process.
(see Below)

The GraphQLSelectionPlan data structure will be passed to the isTypeOf function as the
second parameter, info.

export type GraphQLIsTypeOfFn = (
  value: mixed,
  info?: GraphQLSelectionPlan
) => boolean

GraphQLSelectionPlan contains the following fields as well as the
"All Fields" fields described later.

Field Type Description
kind string always 'select' for a GraphQLSelectionPlan
fields {[fieldName: string]: [ GraphQLResolvingPlan ]} Maps fields that will be resolved on the to a list of GraphQLResolvingPlan.
fieldPlansByAlias {[alias: string]: GraphQLResolvingPlan} Maps object keys that will appear in the query results to a list of GraphQLResolvingPlan.

The keys of fields match the names of fields on the GraphQLObjectType so this will be the most
common way of access. fieldPlansByAlias contains the exact same plans, just with a different
organization.

Each value in fields is a list because the execution engine may attempt to resolve any given
field multiple times, if for example it had parameters or was aliased.

GraphQLCoercionPlan

A GraphQLCoercionPlan describes how to process a value of an abstract type
(GraphQLInterfaceType or GraphQLUnionType) based on the runtime type value.

A plan for each possible GraphQLObjectType type is constructed and placed in
typeChoices by type name.

A value is resolved in one of two ways:

  1. If the abstract type declares a resolveType function, that function is called and
    the name of the type is used to determine which plan to proceed with.
  2. Otherwise, a selection operation is started for each possible type, querying that
    type's isTypeOf function to determine if the value to coerce is an instance of that
    type. If no isTypeOf is defined, that plan will not be evaluated.

If the type cannot be resolved it is an error. This is usually due to an error in
constructing the schema.

The GraphQLCoercionPlan data structure will be passed to the resolve function as the
second parameter, info.

export type GraphQLTypeResolveFn = (
  value: mixed,
  info?: GraphQLCoercionPlan
) => ?GraphQLObjectType

GraphQLCoercionPlan contains the following fields as well as the
"All Fields" fields described later.

Field Type Description
kind string always 'coerce' for a GraphQLCoercionPlan
typeChoices {[typeName: string]: GraphQLSelectionPlan} A map of possible plans organized by type name.

Fields in All Plans

All plans (GraphQLResolvingPlan, GraphQLSerializationPlan, GraphQLMappingPlan,
GraphQLSelectionPlan, and GraphQLCoercionPlan) contain the following fields.

Field Type Description
kind string The kind of plan
fieldName string The name of the current field in the parent context
fieldASTs Array<Field> Portion of the AST that generated this plan
returnType GraphQLObjectType The type of the value returned after evaluating the planned operation
parentType GraphQLCompositeType The parent context on which a selection was last performed

Fields in GraphQLResolveInfo Plans

The plans that can be passed to schema functions (GraphQLResolvingPlan,
GraphQLSelectionPlan, and GraphQLCoercionPlan) contain the following additional fields.

Field Type Description
schema GraphQLSchema The schema instance that the query is being executed on
fragments { [fragmentName: string]: FragmentDefinition } Fragment definitions included in the query
rootValue mixed The root value passed to execute
operation OperationDefinition Description of operation being executed
variableValues { [variableName: string]: mixed } post processed variable values passed to execute

Open Issues

Big Diff

Sorry.

I know its hard to review something like this. See Changes section, hope that helps.

invariants vs Errors

This is the first time I've used typeflow so I'm not quire sure the invariant statements
I added are totally correct. Also, I was very confused about when to use invariant vs
throwing an error.

To me invariant is used when you expect the check to be removed in production. I'm not
sure that's the general usage in this code.

Please review this carefully, my concern is that I've hidden some error in my ignorance.

Backward compatibility

getObjectType has been moved from definitions.js to execute.js to be more like
default field resolver handling. This was necessary to avoid having isTypeOf accept
a union of types.

An invariant test is triggered if type resolution fails where before it would silently return null.

An invariant test is triggered if two types with the same name implement the same
GraphQLInterfaceType, or two types of the same name are assigned to the same GraphQLUnionType.

I left GraphQLResolvingInfo in for BC even though it isn't used any more in case someone
has referenced it.

Resolving functions

I experimented with adding a resolve function to each type of plan. This would allow
more conditional logic to be moved from the evaluation phase to the planning phase.

It also creates an interesting possibility of being able to pass the top level plan to
a transformation function which might analyze the query plan and return a DIFFERENT plan
with resolver functions replaced or wrapped to implement new behaviors.

For another time.

Changes

definition.js:

  • Introduced plan data structures: GraphQLResolvingPlan, GraphQLCompletionPlan, GraphQLSerializationPlan, GraphQLMappingPlan, GraphQLSelectionPlan, GraphQLCoercionPlan
  • Introduced GraphQLIsTypeOfFn ala GraphQLResolvingFn
  • Introduced GraphQLTypeResolveFn ala GraphQLResolvingFn
  • Changed the signature of info in GraphQLResolvingFn from GraphQLResolveInfo to GraphQLResolvingPlan
  • Changed the signature of info in GraphQLIsTypeOfFn from GraphQLResolveInfo to GraphQLSelectionPlan
  • Changed the signature of info in GraphQLTypeResolveFn from GraphQLResolveInfo to GraphQLCoercionPlan
  • Changed GraphQLResolveInfo to be the union of GraphQLResolvingFn, GraphQLIsTypeOfFn, and GraphQLTypeResolveFn
  • moved default type resolution functions getTypeOf and getObjectType to execute.js to live alongside the default field resolution function

execute.js

General

  • introduce GraphQLOperationPlan type
  • change execute to call planOperation function
  • change the signature of defaultResolveFn to accept a plan
  • Add findTypeWithResolveType method to support Coercion operation
  • Add findTypeWithIsTypeOf method to support Coercion operation using logic from getTypeOf from definition.js

executeOperation

  • introduce planOperation analog to executeOperation that produces a GraphQLOperationPlan
  • change executeOperation signature accept a GraphQLOperationPlan instead of an OperationDefinition
  • move call to collectFields in executeOperation to planOperation

executeFields and executeFieldsSerially

  • introduce planFields analog to executeFields and executeFieldsSerially that produces a fieldPlans and fieldsList results
  • removed parentType from signature of executeFields and executeFieldsSerially since this is now part of the plan
  • changed the fields parameter to executeFields and executeFieldsSerially to be an map of GraphQLResolvingPlan instead of Array

resolveField

  • introduce planResolveField analog to resolveField that returns a GraphQLResolvingPlan
  • added a plan parameter of type GraphQLResolvingPlan to resolveField
  • remove parentType and fieldASTs parameters from resolveField since these are part of the plan
  • Move FieldDefinition lookup from resolveField to planResolveField
  • Move call to getArgumentValues from resolveField to planResolveField
  • Remove construction of GraphQLResolveInfo from resolveField

resolveOrError

  • Change the info parameter on resolveOrError to accept a GraphQLResolvingPlan instead of GraphQLResolveInfo

completeValueCatchingError

  • Change the signature of completeValueCatchingError to accept a GraphQLCompletionPlan instead of GraphQLResolveInfo
  • Remove fieldASTs field from completeValueCatchingError, this information is now in the plan

completeValue

  • introduce planCompleteValue analog to completeValue that returns a GraphQLCompletionPlan
  • Change the signature of completeValue to accept a GraphQLCompletionPlan instead of GraphQLResolveInfo
  • move call to collectFields from completeValue to planSelection
  • Refactor coercion logic to be based on pre-calculated plans for each type

…utionContext type can be exported, enabling plan building to be a separate module.
…ve of what it is. Frees up ExecutionPlan for more generic uses
@JeffRMoore
Copy link
Contributor Author

Off the top of my head, maybe union and interface should be treated differently? In an interface, the selection plan isn't really different, except possibly for returnType, which I'm confused about anyway in this case. Maybe the typeChoices for a union type can be restricted based on what is discovered during the collectFields phase?

@denvned
Copy link
Contributor

denvned commented Mar 21, 2016

There would be a new worst case where the number of plan nodes can grow out of proportion, if one adds lists of Node, where the members of the list can vary in type.

A list of Nodes shouldn't be more problematic than a simple Node field, because you still have only one coercion plan node per whole list.

maybe union and interface should be treated differently?

Actually, unions are not easier than interfaces. Consider a schema:

union Node = T1 | T2 | <...> | T10

type T1 { id: ID!, child: Node }
type T2 { id: ID!, child: Node }
<...>
type T10 { id: ID!, child: Node }

type Query { root: Node }

And a query:

query {
  root { ...level1 }
}

fragment level1 on Node {
  ... on T1 { child { ...level2 } }
  ... on T2 { child { ...level2 } }
  <...>
  ... on T10 { child { ...level2 } }
}

fragment level2 on Node {
  ... on T1 { child { ...level3 } }
  ... on T2 { child { ...level3 } }
  <...>
  ... on T10 { child { ...level3 } }
}

<...>

fragment level10 on Node {
  ... on T1 { id }
  ... on T2 { id }
  <...>
  ... on T10 { id }
}

Here again, without memoization and laziness, we have to calculate and store in RAM more than 1010=10,000,000,000 plan nodes...

Maybe the typeChoices for a union type can be restricted based on what is discovered during the collectFields phase?

This won't solve the problem, because selection plans for skipped types would be empty anyway. Besides, it might be useful to be able to get a selection plan for a possible type even if it is empty.

@leebyron
Copy link
Contributor

There is definitely extra overhead if plans are created for typeChoices and then not used, especially if there are many typeChoices.

Unfortunately this is a common case. For example, Relay apps require many types to implement an interface Node { id: ID } and it's not uncommon to see fields which return type Node in such schema.

@JeffRMoore
Copy link
Contributor Author

I'm convinced, the plan size needs to be proportional to query complexity. I think this is doable but will be conceptually more complex.

@helfer
Copy link
Contributor

helfer commented Apr 9, 2016

@JeffRMoore I'm curious about the performance metrics you quoted, and would love to see what the resolve functions look like that you wrote for it. Do you have a repo for that where I could reproduce them? I'm guessing these were for some flavor of SQL?
Also, I'm trying to understand where and why the current executor is slower than your solution. What parts of the query are executed differently with your query planner?

And finally, sorry for the barrage of questions, but this looks really interesting and I want to make sure I understand it.

@JeffRMoore
Copy link
Contributor Author

@helfer At this point you should disregard the actual metrics reported. I've abandoned my original methodology. My tests were not against any particular backend. Those times were only for the time consumed by the execute process itself. I used a memory generated data structure.

You can see my current work here: https://github.com/JeffRMoore/graphql-js/tree/benchmarks

I'm guessing it will be ready to submit by end of next weekend.

I'm standing in front of a herd of naked yaks on this one. Ended up writing a performance regression test library, which is why I've been quiet here.

https://github.com/JeffRMoore/async-benchmark-runner

The main difference in performance stems largely from the current implementation having to re-calculate collectFields on every iteration through the list.

@ruslantalpa
Copy link

@JeffRMoore looked a bit at this PR. Am i right in saying that the core idea here is to "inline" fragments, variables and interfaces into the AST so that resolvers don't have to deal with that (and maybe provide the fieldAST in a slightly more digestible form)?

@JeffRMoore
Copy link
Contributor Author

@ruslantalpa Instead of inlining into the AST, I would say that the idea is to introduce a separate intermediate representation that is tailored to the process of resolving (an execution plan). The idea is absolutely that resolve functions should not have to digest fieldASTs and fragments, and such.

@Globegitter
Copy link

@JeffRMoore Found this proposal not too long ago and this seems an addition really useful to some of our use-cases. What is the current state of the PR? Not too easily to figure out through this discussion.

@syrusakbary
Copy link

Would love to see this in the main codebase! Any updates?

@JeffRMoore
Copy link
Contributor Author

Sorry, got distracted, will continue to be for at least two more weeks. Still planning to return to this.

@mvgijssel
Copy link

Thanks for all the great work! Would also love to see this shipped, as it would definitely help us!

@JeffRMoore
Copy link
Contributor Author

I'm going to go ahead and close this out for the following reasons:

  • The implementation here is fatally flawed with respect to abstract types as noted above.
  • The fix for this flaw would involve a BC break in the resolver interface that would be traumatic and I think represents acceptable tradeoffs.
  • I've come to believe that my premise about mixing query and schema abstractions, while true, is not appropriate to address in the reference implementation because I've discovered a worthwhile counter-case: proxying a graphql api.
  • I want to see what the Relay 2 guys produce with their new @ directives before expending more effort in a direction that may not be productive (wish there was more visibility there).
  • This branch is hopelessly out of date with respect to the current version of GraphQL, I'd have to start from scratch anyway.
  • I want to work on other GraphQL contributions and I kinda messed up the mechanics of how I submitted the PR, so closing it makes my workflow easier.

Thanks for your attention. Regards.

@JeffRMoore JeffRMoore closed this Feb 4, 2017
@leebyron
Copy link
Contributor

leebyron commented Feb 4, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.