Load types in schema on demand #69

Closed
vladar opened this issue Nov 5, 2016 · 10 comments

Comments

@vladar
Member

vladar commented Nov 5, 2016

Currently all types (including all fields) are loaded on schema instance creation.

It is suboptimal for PHP, but it works this way because in some cases we need to know all implementors of an interface (for validation purposes, both in Validator and in Executor). The only way to find them is to loop through all types and check whether each type implements the given interface. Hence all types must be loaded for interface-related checks.
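
To make the cost concrete, here is a minimal sketch of the implementor lookup described above. It does not use graphql-php classes: the `$allTypes` array shape and the `findImplementors` helper are hypothetical stand-ins for whatever the schema does internally.

```php
<?php
// Hypothetical, simplified type map: type name => names of interfaces it implements.
// In a real schema each entry would be a fully constructed type object,
// which is why all of them must be loaded before this scan can run.
$allTypes = [
    'Query' => [],
    'Dog'   => ['Pet'],
    'Cat'   => ['Pet'],
    'Robot' => ['Machine'],
];

// Returns the names of all types that declare the given interface.
// Every entry has to be visited, because any type might be an implementor.
function findImplementors(array $allTypes, string $interfaceName): array
{
    $implementors = [];
    foreach ($allTypes as $typeName => $interfaces) {
        if (in_array($interfaceName, $interfaces, true)) {
            $implementors[] = $typeName;
        }
    }
    return $implementors;
}

$petImplementors = findImplementors($allTypes, 'Pet');
```

The scan itself is cheap; the expensive part in practice is that every type it visits has to exist first.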

The goal of this task is to find a way for lazy-loading types in schema. This is possible, yet requires additional hints from user-land code to work efficiently (e.g. #40).

@vladar
Member Author

vladar commented Jan 19, 2017

This was added as experimental feature in v0.9.0, but unfortunately it doesn't work out of the box: eager type loading is still the default.

You can override it by passing the typeResolution option to Schema; its value must be an instance of a class implementing the GraphQL\Type\Resolution interface.

Release v0.9.0 contains experimental GraphQL\Type\LazyResolution as sample implementation, but it requires some workflow in order to actually load types on demand:

  1. Dump the schema descriptor produced by the EagerResolution strategy (probably during a build step in some CLI tool):
$descriptor = $schema->getDescriptor();
// The trailing ";" is required: without it the generated file does not parse.
file_put_contents($baseDir . '/my-schema-descriptor.php', "<?php\n return " . var_export($descriptor, true) . ";\n");
  2. Set lazy loading for the production environment:
$descriptor = include $baseDir . '/my-schema-descriptor.php';
$typeLoader = function($typeName) {
    $className = 'MyNamespace\\' . $typeName;
    return new $className();
};
$lazyTypeResolutionStrategy = new GraphQL\Type\LazyResolution($descriptor, $typeLoader);
$schema = new GraphQL\Schema([
    // ...
    'typeResolution' => $lazyTypeResolutionStrategy
]);
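
The dump/load round trip in the two steps above relies only on `var_export` and `include`, so it can be tried without the library. The sketch below uses a made-up descriptor array (the real structure returned by `getDescriptor()` is not shown in this thread, so the keys are an assumption) and a temporary file instead of `$baseDir`.

```php
<?php
// Hypothetical descriptor: the real structure produced by getDescriptor()
// is not shown in this thread, so this shape is only an illustration.
$descriptor = [
    'typeMap'       => ['Query' => 1, 'Pet' => 1, 'Dog' => 1],
    'possibleTypes' => ['Pet' => ['Dog' => 1]],
];

// Build step: write the descriptor as a PHP file that returns the array.
// The file must end the return statement with ";" to be valid PHP.
$file = tempnam(sys_get_temp_dir(), 'descriptor');
file_put_contents($file, "<?php\nreturn " . var_export($descriptor, true) . ";\n");

// Production step: load the descriptor back without building the schema.
$loaded = include $file;
unlink($file);
```

Because the descriptor is plain PHP, an opcode cache can keep it in memory, which is presumably part of the appeal of this format over JSON or `serialize`.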

This is an experimental feature; there are still no benchmarks to show whether it gives major benefits. Until we have some evidence that it helps, it may be subject to change.

@vladar vladar closed this as completed Jan 19, 2017
@vladar
Member Author

vladar commented Mar 3, 2017

Forgot to post the results of my research on why truly lazy type loading is problematic (without external hints about schema structure) and what we will have to sacrifice to implement it:

The most important problem: we must either sacrifice validation of fragment spreads in some cases or introduce type loader (similar to classloader in PHP). Consider this schema:

interface Pet {}
type Dog implements Pet {}
type Query {
  pets: [Pet]
}

And query:

{
  pets {
    ... on Dog
    ... on NonExistingType
  }
}

We don't know actual return type of pets field during validation step (as validation is static analysis). And type Dog is not loaded at this moment (as it wasn't mentioned anywhere before in the query) + we have no knowledge on how to load it.

We solved similar problem previously with types option to Schema. But that problem only affected a small portion of types, while this new situation may affect almost any object type.

Options are:

  1. Blindly allow both the on Dog and on NonExistingType fragments in such a case: execution will just return nothing for most invalid fragments.
  2. Introduce a type loader and ask it to load type Dog for us, then continue validation. But this is a major breaking change.
  3. Combine the two options above: stricter validation when a type loader is defined and relaxed validation without one.
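
The combined behaviour in option 3 could look roughly like the sketch below. The `isFragmentTypeAllowed` helper and its parameters are hypothetical and not part of graphql-php; the loader here only reports whether a type can be loaded, which is a simplification of what a real type loader would return.

```php
<?php
// Hypothetical validator helper, not part of graphql-php.
// $loadedTypes: names of types already known to the schema.
// $typeLoader:  optional callable(string): bool that reports whether the
//               named type can be loaded (stands in for a real type loader).
function isFragmentTypeAllowed(string $typeName, array $loadedTypes, ?callable $typeLoader = null): bool
{
    if (in_array($typeName, $loadedTypes, true)) {
        return true; // already loaded: nothing to decide
    }
    if ($typeLoader === null) {
        return true; // relaxed mode (option 1): unknown types pass validation
    }
    return $typeLoader($typeName); // strict mode (option 2): ask the loader
}

$loaded = ['Query', 'Pet'];
$loader = function (string $name): bool {
    return in_array($name, ['Dog'], true); // pretend only Dog can be loaded
};

$relaxedDog     = isFragmentTypeAllowed('Dog', $loaded);
$relaxedMissing = isFragmentTypeAllowed('NonExistingType', $loaded);
$strictDog      = isFragmentTypeAllowed('Dog', $loaded, $loader);
$strictMissing  = isFragmentTypeAllowed('NonExistingType', $loaded, $loader);
```

Without a loader both fragments pass and the invalid one simply yields no data at execution time; with a loader only `NonExistingType` is rejected.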

But even with type loader we will have to give up validating fragment spreads when fragment is of Interface type and it is spread on field or other fragment of interface type:

interface Pet {
  name
}
interface Being {}

fragment on Pet {
   ... on Being
}

Right now, if two interfaces have intersecting implementations, this query will validate; if not, it will fail validation. With the lazy approach it will always pass validation, but then return nothing. I guess this case is extremely rare.

One more problem I've discovered is that we will have to force resolveType in interfaces and give up on trying to guess actual object type with isTypeOf. Such guessing requires prior knowledge about all possible implementations of interface which is only available if you scan the whole schema upfront.
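
The difference between the two strategies can be sketched as follows. The function names and callback signatures are hypothetical; they mimic the roles of isTypeOf and resolveType but are not graphql-php's actual signatures.

```php
<?php
// Hypothetical sketch; the callbacks mimic the roles of isTypeOf and
// resolveType but are not graphql-php's actual signatures.

// isTypeOf-style guessing: every candidate type is asked in turn, so the
// full list of implementations must be known (and loaded) beforehand.
function guessTypeWithIsTypeOf(array $value, array $possibleTypes): ?string
{
    foreach ($possibleTypes as $typeName => $isTypeOf) {
        if ($isTypeOf($value)) {
            return $typeName;
        }
    }
    return null;
}

// resolveType-style lookup: the interface itself names the concrete type,
// so only that one type needs to be loaded afterwards.
function resolvePetType(array $value): string
{
    return $value['kind'] === 'dog' ? 'Dog' : 'Cat';
}

$possibleTypes = [
    'Dog' => function (array $v): bool { return $v['kind'] === 'dog'; },
    'Cat' => function (array $v): bool { return $v['kind'] === 'cat'; },
];

$pet = ['kind' => 'cat', 'name' => 'Tom'];

$guessed  = guessTypeWithIsTypeOf($pet, $possibleTypes);
$resolved = resolvePetType($pet);
```

Both calls arrive at the same type name, but only the first one needs `$possibleTypes`, which is exactly the knowledge a lazy schema does not have.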

In general these trade-offs seem a reasonable price, but they require a major version bump. Also, the type loading solution requires some further thinking. Re-opening this just to keep it visible for the next version.

@vladar vladar reopened this Mar 3, 2017
@andheiberg
Contributor

andheiberg commented Jun 19, 2017

One more problem I've discovered is that we will have to force resolveType in interfaces and give up on trying to guess actual object type with isTypeOf. [emphasis added]

Meaning in its current implementation this has not been done and something is broken?

Could you elaborate on what is broken?

I understand the example with Pet and Being; is that the only thing?

This is an experimental feature; there are still no benchmarks to show whether it gives major benefits. Until we have some evidence that it helps, it may be subject to change.

Would you be able to post the benchmarks you've found?

@vladar
Member Author

vladar commented Jun 19, 2017

@AndreasHeiberg Current implementation works, but it requires separate build step. Basically you must analyze whole schema and save somewhere all types existing in schema and (most important) all interface implementations. Current solution requires these hints for lazy type loading.

In theory, it is possible to prepare another solution which will load types on demand without a separate build step. But such a solution will have one restriction: you won't know all interface implementations.

So any code that relies on this knowledge (like isTypeOf or some validation rules) will not work and requires different approaches.

@andheiberg
Contributor

Right now, if two interfaces have intersecting implementations, this query will validate; if not, it will fail validation. With the lazy approach it will always pass validation, but then return nothing. I guess this case is extremely rare.

I just added LazyLoading in a project to gather some benchmarks. I did see a substantial 20% improvement for a small query touching only a tiny fraction of our schema. I was about to test more complicated queries that touch more of our schema, only to discover what I think is this case?

I must have been optimistic when reading your initial comment. Is what you mean that the following should fail accordingly:

interface Base {
   fieldA
   fieldB
}

type One {
   fieldA
   fieldB
   fieldC
}

type Two {
   fieldA
   fieldB
   fieldD
}

Running a query against this to the effect of:

{
   base {
        __typename
        ... on One {
           fieldC
        }
        ... on Two {
           fieldD
        }
   }
}

Returns:

{
    "data": {
        "base": {
             "__typename": "One"
        }
    }
}

This is not a small edge case, nor extremely rare: our schema has this all over :( Any suggestions for how this could be fixed?

@vladar
Member Author

vladar commented Jun 19, 2017

As for the performance improvement: it should be fairly constant in ms across queries. Technically it saves time on schema building, so query size is not that important (but the impact will be smaller on bigger queries, simply because they initialize more types during execution).

As for the interface issue. This is not the same case as I described. You spread object types on interface field, but the case I described relates to spreading interface type on field of other interface type.

So your example should always work, in both the old and the new anticipated solution. Given your result, it is either a bug in the current solution or some error in your custom code. If you're sure this is a bug, can you create a reproducible test case and open a new issue? (I need to see data and resolvers to figure out what's going on.)

@andheiberg
Contributor

Right, I opened PR #138, as I thought it would be clearer with the reduced test case. But yeah, seemingly it is a problem with the current implementation and not my custom setup.

Unless you see an issue with my reduced test case?

@olragon

olragon commented Jul 7, 2017

In my case, I can see a 309% performance improvement from using LazyResolution.

In the screenshots below, times are in ms:

  • schemaBuildTime: the time it takes to complete new GraphQLSchema([])

Before LazyResolution: [screenshot: before_lazy]

After LazyResolution: [screenshot: after_lazy]

@vladar
Member Author

vladar commented Jul 10, 2017

@olragon Thanks for sharing this! Performance improvement is proportional to schema size (number of types and fields).

I am not sure, but having fields in all types defined as closures may also affect performance in positive way (but I guess you already do so).
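
A rough illustration of why closures for fields might help: the field list is only built when something actually asks for it. `LazyFieldsType` below is a hypothetical class written for this sketch, not graphql-php's ObjectType, and the field values are plain strings rather than real type objects.

```php
<?php
// Hypothetical type class, not graphql-php's ObjectType: it only shows
// why a closure defers the cost of building field definitions.
class LazyFieldsType
{
    private $fields; // either an array or a Closure returning one

    public function __construct($fields)
    {
        $this->fields = $fields;
    }

    public function getFields(): array
    {
        if ($this->fields instanceof \Closure) {
            // First access: run the closure once and cache its result.
            $this->fields = call_user_func($this->fields);
        }
        return $this->fields;
    }
}

$buildCalls = 0;
$dogType = new LazyFieldsType(function () use (&$buildCalls) {
    $buildCalls++; // counts how often the field list is actually built
    return ['name' => 'String', 'owner' => 'Person'];
});

$callsBeforeAccess = $buildCalls; // nothing has been built yet
$fields = $dogType->getFields();
$dogType->getFields();            // second call reuses the cached array
```

A type that is constructed but never queried pays nothing for its fields, and types referenced from those fields (here `Person`) do not need to exist until then.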

@vladar
Member Author

vladar commented Aug 16, 2017

Finally, the new implementation doesn't require a separate build step, only a type loader passed directly to the schema. It will be released soon in v0.10.0.
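
The thread does not show what such a type loader looks like. As a sketch only, a loader can be thought of as a function from type name to type, backed by a cache so each type is built at most once. The `TypeRegistry` class and its methods below are hypothetical and are not the v0.10.0 API; the "types" are plain objects for illustration.

```php
<?php
// Hypothetical registry; the actual v0.10.0 API is not shown in this thread.
class TypeRegistry
{
    private $factories;   // type name => callable that builds the type
    private $loaded = []; // cache: each type is built at most once

    public function __construct(array $factories)
    {
        $this->factories = $factories;
    }

    // The "type loader": given a name, return the type or null if unknown.
    public function load(string $name)
    {
        if (!isset($this->loaded[$name])) {
            if (!isset($this->factories[$name])) {
                return null;
            }
            $this->loaded[$name] = call_user_func($this->factories[$name]);
        }
        return $this->loaded[$name];
    }

    // Names of the types that have actually been built so far.
    public function loadedNames(): array
    {
        return array_keys($this->loaded);
    }
}

$registry = new TypeRegistry([
    'Dog' => function () { return (object) ['name' => 'Dog']; },
    'Cat' => function () { return (object) ['name' => 'Cat']; },
]);

$dog     = $registry->load('Dog');
$sameDog = $registry->load('Dog');
$missing = $registry->load('NonExistingType');
```

Returning the same instance on every call matters if the schema compares types by identity, and returning null for unknown names gives validation a way to reject something like `NonExistingType`.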

@vladar vladar closed this as completed Aug 16, 2017

3 participants