Skip to content

Lazy loading of types... #425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
shmax opened this issue Dec 29, 2018 · 9 comments
Closed

Lazy loading of types... #425

shmax opened this issue Dec 29, 2018 · 9 comments

Comments

@shmax
Copy link
Contributor

shmax commented Dec 29, 2018

I have a large number of types (130+) in my code base, and I'm worried about the overhead of building them all for every little request. I followed the advice in your documentation, and made sure that my objects return callbacks for their field properties, and I provided a typeLoader callback when instantiating my Schema. I understand the general concept, but I must be missing a piece, because at least some types are still being constructed even though they are not referenced in any way by the query being made.

After analyzing the sample here it's not clear to me where the savings are. Imagine if MyTypeA had a list of 100 fields on it instead of 1; wouldn't 100 calls to $this->get(<fieldName>) trigger as soon as its fields callback resolves? Or is there some magic I'm missing?

I guess I was expecting something more like this:

private function MyTypeA()
    {
        return new ObjectType([
            'name' => 'MyTypeA',
            'fields' => function() {
                return [
                    'b' => ['type' => 'MyTypeB'] // use a string (or possibly a callable) here, rather than resolve the type on the spot
                ];
            }
        ]);
    }

...with the idea being that I use the name of the type in my field definition rather than a reference to the field itself, and then at runtime it would use the name to look up the type when needed.

I'm pretty sure this is just a matter of me being dumb and not getting something fundamental, so I guess my real question here is whether you think the lazy loading documentation could be improved a bit, or perhaps incorporated into the samples. If you can help me get my questions sorted out I'd be happy to help with either.

Thanks for the wonderful library!

@vladar
Copy link
Member

vladar commented Dec 31, 2018

You are right here. Types are being initiated on fields closure call. So even if you request single field of a type, all types for those fields are being instantiated.

But instance creation is not expensive. It's fields, arguments, and interfaces which add significant overhead. If you don't create them on type instantiation you should get a significant performance win (which we benchmarked).

We actually considered loading types by name, but it didn't pass a cost/benefit analysis at the moment (wins were little, costs to implement were relatively significant).

Anyways, if you do some benchmarks, illustrating the opposite we can get back to this idea. But we really need some numbers to back it up.

@shmax
Copy link
Contributor Author

shmax commented Dec 31, 2018

But instance creation is not expensive. It's fields, arguments, and interfaces which add significant overhead.

Well, I make heavy use of inline schema definitions. All of my mutations, for example, are dynamically generated with custom types (with custom validation types, error codes, and so on), and many of my queries are also dynamically created. It's just a lot of busy work that I would rather not have happen if it doesn't need to. But as you say, I should do some profiling before pressing the matter. I'll see what I can come up with tomorrow.

@shmax
Copy link
Contributor Author

shmax commented Dec 31, 2018

I have some initial data.

First, I profiled my existing code with the following simple query (Faction type has no dependencies on other custom types):

{
	faction (id: 1) {
		id
  }
}

Blackfire provides the following profile for this request:
image

PHP spends about 124 milliseconds in Types::get. (563 milliseconds for the entire query; I have xdebug enabled for all these queries, so things are pretty slow in general overall, but we're doing apples to apples, here, so I didn't bother turning it off).

Next, I doctored my Types::get method to always return the type for faction (I verified that the query still functions as expected after doing this), and get the following profile:

image

The time spent in Types::get drops away to 11 milliseconds (390 milliseconds for the entire query).

I'll post my Types::get implementation here:

	public static function get($name)
	{
		$cacheName = strtolower($name);

//		$name = "faction"; // I uncomment this to simulate a perfect lazy loading scenario


		if (!isset(self::$types[strtolower($cacheName)])) {
			switch($name) {
				default:
					$method = lcfirst($name);
					if(method_exists(get_called_class(), $method)) {
						$type = self::{$method}();
					}
					else
					{
						$typeName = "\\shmax\\api\\types\\{$name}Type";
						$type = new $typeName();
					}

			}

			self::$types[$cacheName] = $type;

		}
		return self::$types[$cacheName];
	}

It's not practical for me to post my entire schema, as it's enormous, but if it would help I could prepare a sample repo that has a schema roughly equivalent in complexity. Or, I could just dive in and start trying to fix the issue, with your guidance.

Please advise, and thanks again!

@justineverson
Copy link

In order for the typeLoader to be called with the type name, isn't the type object already loaded? I'm just trying to wrap my head around why the typeLoader needs to return a type that it should already have.

@vladar
Copy link
Member

vladar commented Apr 21, 2019

Before type loading, we had to scan the whole schema top-to-bottom to discover all possible types. This was required mostly because of the abstract types (interfaces, unions). E.g. resolveType can return type name, not the type instance itself.

Another scenario when type loading is useful is when the schema is defined via SDL. Then we have a type name from the SDL before we actually create type instance.

Technically if there were no SDL and abstract types, we could have avoided type loading at all, but then we had to couple either schema and execution (executor should've somehow registered all types loaded during execution in the schema) or couple schema and each type definition (definition should've notified schema when it has been loaded). Both approaches were adding unnecessary complexity and added restrictions to independent schema usage.

So at the moment, all that type-loading does is helping us to avoid full schema scanning. Plus, it opens ways for other optimizations (one day we may get back to the idea of defining type names as strings #62 for a fully lazy-loaded solution)

@at758
Copy link

at758 commented Jul 26, 2019

There is one more issue with the custom typeLoader. For example - we have a lot of inline type instantiations within our fields. The validation fails because of that cause for inline types you cannot get a single instance from Types.

For eg.

'name' => 'banner',
'type' => new EnumType([
...
])

The problem
When the AST tree is walked the above type creates a new instance and there is no reference for this EnumType.

@duxthefux
Copy link

I'm currently experience either some understanding issue about lazy loading or some implementation issue.

Currently, all types of all my queries and mutations on the schema are resolved, which keeps me awake since days.
From what I understand is, that basically fields' types are always fully resolved, which is already known and might be subject of future optimizations.
But I don't get why all my queries are resolve.

A Query Class looks like

<?php

declare(strict_types=1);

namespace R2\Api\GraphQL\Base\Schema\Object;

use Closure;

class QueryCashbookItems extends \R2\Api\GraphQL\Type\AbstractQueryObjectType
{
    public $attributes = [
        'name' => 'CashbookItems',
        'deprecationReason' => null,
        'description' => null,
    ];

    public function type(): \GraphQL\Type\Definition\Type
    {
        return $this->getPaginationType('CashbookItem'); // => this calls basically triggers resolution of Pagination objects as well, although the QueryCashbookItems-query is never executed
    }

    public function args(): array
    {
        return ['trainingMode' => ['type' => \GraphQL\Type\Definition\Type::nonNull(\GraphQL\Type\Definition\Type::boolean())],
            'page' => ['type' => \GraphQL\Type\Definition\Type::int()],
            'pageSize' => ['type' => \GraphQL\Type\Definition\Type::int()],
            'description' => ['type' => \GraphQL\Type\Definition\Type::string()],
            'externalDocumentId' => ['type' => \GraphQL\Type\Definition\Type::string()],
            'search' => ['type' => \GraphQL\Type\Definition\Type::string()],
        ];
    }

    public function resolve(
        $root,
        array $arguments,
        $context,
        \GraphQL\Type\Definition\ResolveInfo $resolveInfo,
        Closure $getSelectedFields
    ) {
      .... 
    }
}

If a run a query, which DOES NOT involve this class, it is still resolved with all it's dependent types.

Thanks for clarification.

@spawnia
Copy link
Collaborator

spawnia commented Aug 11, 2020

@shmax I am guessing this is resolved as of #557, feel free to reopen if not.

@shmax
Copy link
Contributor Author

shmax commented Aug 11, 2020

@spawnia I was going to close this once all the little fix-up patches and optimizations were sorted out and released, but I suppose there's no harm in doing it now. Thanks much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants