Skip to content

Discussion: how to integrate graphql and sql effectively? #700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
itaied246 opened this issue Jan 28, 2017 · 7 comments
Closed

Discussion: how to integrate graphql and sql effectively? #700

itaied246 opened this issue Jan 28, 2017 · 7 comments

Comments

@itaied246
Copy link

itaied246 commented Jan 28, 2017

I am writing a web app as a hobby to learn graphql.
I am using a sql database for this, and my orm is sequelize.

graphql solves the problem of multiple client-server queries, and lets us fetch the data in 1 query.

While writing the graphql schema and query objects (using sequelize), I found myself facing the problem of sending too many requests from the server to the db (example).
Searching the web, I found dataloader and join-monster.

  • dataloader was batching some of the requests and managing cache. I found this solution not good enough, since using cache require more ram, and it minimize the number of requests to more than one.
  • join-monster looks like a great solution at first, as it build the db request with only the tables and fields required, and executing only a single request. The problem is that it builds the sql query and execute it as a raw query (sql injection danger) and I couldn't find a way to build deep where condition (issue).

I was searching the web for examples, and all I could find are examples executing inefficient queries, for examples, fetching a user's comments is executed with 2 db round trips (you can think how bad it can get with friends of friends of friends etc...).

I was thinking about implementing custom library to handle it (didn't think it through yet), but I wanted to ask here first, before I go deep into trouble, isn't there something that handle it already?
How production apps solved this problem?
Am I missing something?

@mattecapu
Copy link

I created dataloader-sequelize-wrapper to mitigate the impact on the db. It exploits the information from your Sequelize schema to guarantee no resource is fetched more than once.
Yet it does not optimize as much as something like join-monster.
It would be interesting to port join-monster to use sequelize.

@itaied246
Copy link
Author

Yes indeed, but it surprises me that nobody did it already.
That sounds like a must-have optimization.
How does large scale apps using graphql then?
I'm sure Facebook are optimizing their queries in the most efficient way.

@mattecapu
Copy link

Well, I guess this kind of stuff is highly coupled to one's own db architecture. So you should probably just look for what best fits your needs and build from there.

@wincent
Copy link
Contributor

wincent commented Jan 31, 2017

Thanks for the discussion!

I'm sure Facebook are optimizing their queries in the most efficient way.

Disclaimer: I am a front-end engineer, so my knowledge of how things work on the back-end is vague and imprecise.

My understanding is that Facebook makes this work in two primary ways:

  1. Use of a dataloader-like abstraction. The key aspects here are that it enables fetching in parallel due to the batching behavior, and caching, ensuring no record need be fetched twice.
  2. Use of parallelism, query analysis, caching, and assorted "clever tricks" at the service layer below GraphQL — call this the "ORM" if you will, although Facebook has multiple abstractions at or near this level and it is a big stretch to call it an ORM although it is in some ways analogous to one. The GraphQL query is just a description of what you want to fetch, and the fields get resolved in a relatively thin layer that implements what is described in the GraphQL spec. The real data-fetching engines underlying this are free to implement whatever strategies they want to make this work at scale (and note, these obviously existed before GraphQL did, so we had already invested much effort into make them fast).

Note that multiple services are involved in data-fetching, operating at multiple levels of abstraction. There is no single "database" to be queried, but rather, many machines with different roles are involved in each request. As such, at the level of the GraphQL schema itself, a concept like a "join" or attempting to reduce a GraphQL operation to a single "query" to run on a database, don't really apply in Facebook's infrastructure.

Outside Facebook however, people use all manner of storage engines, so it makes perfect sense for the community to work on projects like join-monster, dataloader-sequelize-wrapper etc, which are designed to get the most out of specific storage architectures in the context of a GraphQL server.

I'm going to close this for now because I don't think there's anything actionable for us to do here, but you are welcome to continue the discussion. Thanks for your contributions so far!

@wincent wincent closed this as completed Jan 31, 2017
@avrono
Copy link

avrono commented Jan 27, 2018

If I may follow on from this, I have inherited a Graphql based API. Whilst it is very neat looking code, the database Postgres and the SQL queries that are generated are some of the most convoluted I have seen. I can't do much caching, as the data is real-time. I am finding it difficult to find ways of breaking these complex SQL queries (which paralyse the DB) into smaller manageable ones.

With a Rest architecture, whilst the code was not nearly as neat, I could design small simple queries called many times, which is much more scalable from an Infrastructure standpoint. Have others experienced the same issues, and how have you solved this ?

@mattecapu
Copy link

@avrono yes, unfortunately it is pretty much the standard situation I think, But the good thing is GraphQL is really easily cacheable, so...

@avrono
Copy link

avrono commented Jan 30, 2018

@mattecapu that's great, however for more real-time applications that becomes a problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants