When creating web experiences, an inevitable question is, “how do I get my data from point A (the source) to point B (the component)?” This can be a deceptively complex question.
Gatsby’s rich data plugin ecosystem lets you build sites with the data you want — from one or many sources. You can pull data from headless CMSs, SaaS services, APIs, databases, your file system & more directly into your components.
Most examples in the Gatsby docs and on the web at large focus on leveraging source plugins to manage your data in Gatsby sites. And rightly so! Gatsby’s GraphQL data layer is powerful and extremely effective; it solves the “integration problem” of decoupled CMSs — it’s the glue between presentation layer and wherever your data is sourced from.
Source plugins “source” data from remote or local locations into Gatsby nodes, which are then queryable within your Gatsby site using GraphQL. Gatsby nodes are the center of Gatsby’s data handling layer.
We’re calling this the “content mesh” — the infrastructure layer for a decoupled website. (Sam Bhagwat introduced and explored this concept in his recent five-part series, The Journey to a Content Mesh).
However, you don’t need to use source plugins (or create Gatsby nodes) to pull data into a Gatsby site! In this post we’ll explore how to use Gatsby without GraphQL (using “unstructured data”), and some of the pros and cons of doing so.
Note: For our purposes here, “unstructured data” means data “handled outside of Gatsby’s data layer” i.e. using the data directly, and not transforming the data into Gatsby nodes.
An example of creating pages using unstructured data from a remote API
We’ll take a look at a (very serious) example of how this works. In the example, we’ll:
- Load data from the PokéAPI’s REST endpoints
- Create pages (and nested pages) from this data
That’s it!
Breaking down the example
Note: This walkthrough assumes you have working knowledge of Gatsby fundamentals. If you’re not (yet!) familiar with Gatsby, you may want to take a look at our Quick Start doc first.
1. Use Gatsby’s createPages API.
createPages is a Gatsby Node API. It hooks into a certain point in Gatsby’s bootstrap sequence.
By exporting createPages from our example Gatsby site’s gatsby-node.js file, we’re saying, “at this point in the bootstrapping sequence, run this code”.
2. Fetch the data from the PokéAPI.
Note: getPokemonData is an async function that fetches the data we need for each of our Pokémon.
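The real getPokemonData lives in the example repo. The version below is only a rough sketch of what such a function might look like. It assumes Node 18+ (where fetch is a global) and the PokéAPI pokemon endpoint; the fetchImpl parameter and the simplified return shape are our own assumptions, not part of the original example.

```javascript
// A sketch of getPokemonData, not the example repo's implementation.
// Assumes Node 18+ (global `fetch`); `fetchImpl` is a hypothetical
// parameter that makes the HTTP client swappable.
const getPokemonData = (names, fetchImpl = fetch) =>
  Promise.all(
    names.map(async (name) => {
      const response = await fetchImpl(
        `https://pokeapi.co/api/v2/pokemon/${name}`
      );
      if (!response.ok) {
        throw new Error(`Request for "${name}" failed: ${response.status}`);
      }
      const data = await response.json();

      // Keep only the fields the pages need (an assumed, simplified shape).
      return {
        name: data.name,
        abilities: data.abilities.map(({ ability }) => ({
          name: ability.name,
        })),
      };
    })
  );
```

Inside createPages, this could be awaited with a list of names, for example getPokemonData(["pikachu", "charizard", "squirtle"]), where the names are chosen purely for illustration.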
3. Grab the createPage action
When you hook into a Gatsby API (like createPages from step one), you are passed a collection of actions. In this example, we’re extracting the createPage action using ES6 object destructuring.
4. Create a page that lists all Pokémon.
The createPage action is passed an object containing:
- path: This is the relative URL at which your new page will be available.
- component: This is the absolute path to the React component you’ve defined for this page.
- context: Context data for this page. Available either as props to the component (this.props.pageContext) or as graphql arguments.
In our example, we’re accessing the context as props to the component. This allows us to completely circumvent Gatsby’s data layer; it’s just props.
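Here is a sketch of the listing page. The createAllPokemonPage helper and the template location are hypothetical names used for illustration; in practice this call would sit inside createPages, after the data has been fetched.

```javascript
const path = require("path");

// Hypothetical helper, meant to be called from inside `createPages`
// once the data has been fetched.
const createAllPokemonPage = (createPage, allPokemon) => {
  createPage({
    // Relative URL of the new page.
    path: "/",
    // Absolute path to the template component (an assumed location).
    component: path.resolve("./src/templates/all-pokemon.js"),
    // Handed to the component as `pageContext`, no GraphQL involved.
    context: { allPokemon },
  });
};
```

In the template component, the list would then be read from props.pageContext.allPokemon.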
5. Create a page for each Pokémon.
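One way this step could look, again as a sketch: the createPokemonPages helper, the URL scheme, and the template location are assumptions for illustration, and each Pokémon is assumed to be an object with a name property.

```javascript
const path = require("path");

// Hypothetical helper: one detail page per Pokémon.
const createPokemonPages = (createPage, allPokemon) => {
  allPokemon.forEach((pokemon) => {
    createPage({
      // Assumed URL scheme, e.g. /pokemon/pikachu/
      path: `/pokemon/${pokemon.name}/`,
      // Assumed template location.
      component: path.resolve("./src/templates/pokemon.js"),
      // Each page receives only its own Pokémon as context.
      context: { pokemon },
    });
  });
};
```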
6. Create a page for each ability of each Pokémon.
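The nested pages can be sketched with a nested loop. As before, the createAbilityPages helper, the URL scheme, and the template location are hypothetical, and each Pokémon is assumed to carry an abilities array of objects with a name property.

```javascript
const path = require("path");

// Hypothetical helper: one page per ability, nested under its Pokémon.
const createAbilityPages = (createPage, allPokemon) => {
  allPokemon.forEach((pokemon) => {
    pokemon.abilities.forEach((ability) => {
      createPage({
        // Assumed URL scheme, e.g. /pokemon/pikachu/ability/static/
        path: `/pokemon/${pokemon.name}/ability/${ability.name}/`,
        // Assumed template location.
        component: path.resolve("./src/templates/ability.js"),
        // Pass both objects so the page can link back to its Pokémon.
        context: { pokemon, ability },
      });
    });
  });
};
```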
For each type of page, we are invoking the createPage action, and supplying it with our desired path, React component, and data (as context).
View the full source code of this example at Jason Lengstorf’s “gatsby-with-unstructured-data” repo. Also check out the “using-gatsby-data-layer” branch of that repo, to compare a refactor that uses Gatsby’s data layer in the same example.
The pros of using unstructured data
- When prototyping, or when new to Gatsby, this approach may feel more familiar, comfortable, and faster
- There’s no intermediate step: you fetch some data, then build pages with it
The tradeoffs of forgoing Gatsby’s data layer
Using Gatsby’s data layer provides the following benefits:
- Enables you to declaratively specify what data a page component needs, alongside the page component
- Eliminates frontend data boilerplate — no need to worry about requesting & waiting for data. Just ask for the data you need with a GraphQL query and it’ll show up when you need it
- Pushes frontend complexity into queries — many data transformations can be done at build-time within your GraphQL queries (e.g. Markdown -> HTML, images -> responsive images, etc)
- It’s the perfect data querying language for the often complex/nested data dependencies of modern applications
- Improves performance by removing data bloat — GraphQL enables you to select only the data you need, not whatever an API returns
- Enables you to take advantage of hot reloading when developing. For example, in this post’s example “Pokémon” site, if you wanted to add a “see other Pokémon” section to the Pokémon detail view, you would need to change your gatsby-node.js to pass all Pokémon to the page, and restart the dev server. In contrast, when using queries, you can add a query and it will hot reload.
Learn more about GraphQL in Gatsby.
Working outside of the data layer also means forgoing the optimizations provided by transformer plugins, like:
- gatsby-image (speedy optimized images),
- gatsby-transformer-sharp (provides queryable fields for processing your images in a variety of ways including resizing, cropping, and creating responsive images),
- … the whole Gatsby ecosystem of official and community-created transformer plugins.
Another difficulty added when working with unstructured data is that your data fetching code becomes increasingly hairy when you source directly from multiple locations.
Links potentially of interest
- GitHub issue: “Choosing not to use the GraphQL feature of Gatsby – a bad idea?”
- Kyle Mathews’ reasoning for going with GraphQL.
- The issue introducing 1.0 GraphQL data layer.
- Gatsby docs on using Gatsby without GraphQL
Thanks
- Thank you to Tanner Linsley of react-static, who helped us realize that directly querying APIs and passing them into pages is a great way to build smaller sites, and came up with the term “unstructured data”.