GraphQL is for silos

GraphQL is booming.

After Github released its GraphQL API it’s clear that many other services and developers are going to try to adopt the technology. It might have been a turning point.

There are many things to like in GraphQL. For example, the typing system, the idea of having, at last, an effective schema mechanism connecting client and server, or the view of a unified data graph you can query. Many of those ideas are not new at all, but if GraphQL is able to finally make them popular and accepted by development teams, I would be very happy.

However, there’s a main design problem with GraphQL that needs to be addressed: GraphQL is for building silos.

This open issue in GraphQL Github’s repository shows the problem clearly.

GraphQL exposes a single data tree, through a single end-point. All your data is captured in that single data space and cannot reference or be referenced from other GraphQL  end-points in a standard way.

Open de-centralised systems don’t exist in the GraphQL world in its current form (which is not a surprise taking into consideration the original authors of the technology).

Of course, most organisations are just building their own public facing silo. Most of them have just a few clients they control directly: a JS web app, mobile apps, etc.
In this context GraphQL might be an attractive solution, specially because the integration between your client and the data layer is more sophisticated than what you can get with the mainstream interpretation of REST as some kind of HTTP+JSON combo.

But even if this is the case in your external facing API, probably in your back-end the landscape looks a lot more like a loosely coupled federation of services trying to work together. In this context, HTTP is still the best glue to tie them together in a unified data layer.

It should be possible to modify GraphQL to solve this issue and make the technology open:

  • Replace (or map in a standard way) IDs by URIs: If I’m going to reference some object in your data graph I need to be able to refer to it in an unique and unambiguous way. Also your identifiers and my identifiers need to coexist in the same identifier space. Relay global object IDs are half-way there.
  • Add namespaces for the types: If you are not alone in the data universe, you might not be the only one to define the ‘User’ type. Even better, we might want to re-use the same ‘User’ type. Extra points if the final identifier for the type is a URI and I can de-reference it to obtain the introspection query result for the type.
  • Add hyperlinks/pointers to the language: I want to hold references to objects in this or other graphs using their IDs/URIs.

With these three changes, and introducing a shared authentication scheme,  a single GraphQL end-point could be broken into many smaller federated micro-GraphQL end-points conforming a single (real) data graph. This graph could also span multiple data sources in an organisation or across organisations. In a sentence, it could be a real alternative for HTTP and REST.

The flip side of all this is that the pieces and technologies to provide the same level of experience GraphQL offers to developers have been available for HTTP as W3C standards for more than a decade. From the foundational components to the latest bits and ideas to bind them together.

It’s sad, but the surge in popularity of GraphQL makes it more clear our failure in the Linked Data and SemWeb communities to offer value and fix real problems for developers.