🚀

[Elixir] Solving the GraphQL N+1 problem with Dataloader

Published 2022/04/27

Elixir

GraphQL

phoenix

tech

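When implementing a GraphQL server with Elixir and Phoenix, Absinthe seems to be the standard(?) choice, but as with implementations in other languages, you quickly run into the N+1 problem if you take no countermeasures.

This article walks through introducing dataloader to resolve the N+1 problem.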

Sample project

The code shown in this article is also available here.

The sample was built with the following generator commands.

$ mix phx.new phx-dataloader-sample --app dataloader_sample --no-html --database sqlite3
$ mix phx.gen.context Blog Post posts title:string body:text
$ mix phx.gen.context Blog Comment comments body:text post_id:references:posts

Table definitions

We'll work with a simple structure in which each post has multiple comments.

Use seeds to quickly create posts that each have several comments attached. faker is handy for this.

alias DataloaderSample.Blog.Comment
alias DataloaderSample.Blog.Post

Enum.each(1..5, fn _ ->
  post =
    %Post{title: Faker.Lorem.sentence(), body: Faker.Lorem.paragraph()}
    |> DataloaderSample.Repo.insert!()

  Enum.each(1..5, fn _ ->
    %Comment{body: Faker.Lorem.paragraph(), post_id: post.id} |> DataloaderSample.Repo.insert!()
  end)
end)

Reproducing the N+1 problem

First, let's look at an implementation that triggers N+1. The schema is implemented as follows. The resolver simply calls the function generated by phx.gen.context.

schema.ex

defmodule DataloaderSampleWeb.Schema do
  use Absinthe.Schema

  alias DataloaderSampleWeb.BlogResolver

  object :post do
    field :id, non_null(:id)
    field :title, non_null(:string)
    field :body, non_null(:string)

    field :comments, non_null(list_of(:comment)) do
      resolve(fn post, _, _ ->
        comments = Ecto.assoc(post, :comments) |> DataloaderSample.Repo.all()
        {:ok, comments}
      end)
    end
  end

  object :comment do
    field :id, non_null(:id)
    field :body, non_null(:string)
  end

  query do
    field :posts, non_null(list_of(non_null(:post))) do
      resolve(&BlogResolver.list_posts/3)
    end
  end
end

blog_resolver.ex

defmodule DataloaderSampleWeb.BlogResolver do
  alias DataloaderSample.Blog

  def list_posts(_, _, _) do
    {:ok, Blog.list_posts()}
  end
end

Issue a query that fetches all posts together with their comments.

{
  posts {
    title
    body
    comments {
      id
      body
    }
  }
}
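If no GraphQL client is at hand, the same query can also be issued from an IEx session. The following is a minimal sketch, assuming it is run inside `iex -S mix` in the sample project with the schema above compiled; it is not a standalone script. `Absinthe.run/3` executes a document against a schema.

```elixir
# Run inside `iex -S mix` in the sample project (not a standalone script).
query = """
{
  posts {
    title
    body
    comments {
      id
      body
    }
  }
}
"""

# Absinthe.run/3 returns {:ok, %{data: ...}} when the document executes.
{:ok, %{data: %{"posts" => posts}}} = Absinthe.run(query, DataloaderSampleWeb.Schema)

# The SQL issued while resolving the query shows up in the debug log.
IO.inspect(length(posts), label: "posts returned")
```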

The log shows that the comments table is queried once for every row in posts: one query for posts, followed by five separate queries for comments.

[debug] QUERY OK source="posts" db=0.0ms idle=673.2ms
SELECT p0."id", p0."body", p0."title", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 []
↳ DataloaderSampleWeb.BlogResolver.list_posts/3, at: lib/dataloader_sample_web/resolvers/blog_resolver.ex#5
[debug] QUERY OK source="comments" db=0.1ms queue=0.1ms idle=707.0ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at" FROM "comments" AS c0 WHERE (c0."post_id" = ?) [1]
↳ anonymous fn/3 in DataloaderSampleWeb.Schema.__absinthe_function__/2, at: lib/dataloader_sample_web/schema.ex#13
[debug] QUERY OK source="comments" db=0.1ms idle=709.2ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at" FROM "comments" AS c0 WHERE (c0."post_id" = ?) [2]
↳ anonymous fn/3 in DataloaderSampleWeb.Schema.__absinthe_function__/2, at: lib/dataloader_sample_web/schema.ex#13
[debug] QUERY OK source="comments" db=0.1ms idle=299.9ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at" FROM "comments" AS c0 WHERE (c0."post_id" = ?) [3]
↳ anonymous fn/3 in DataloaderSampleWeb.Schema.__absinthe_function__/2, at: lib/dataloader_sample_web/schema.ex#13
[debug] QUERY OK source="comments" db=0.1ms idle=280.4ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at" FROM "comments" AS c0 WHERE (c0."post_id" = ?) [4]
↳ anonymous fn/3 in DataloaderSampleWeb.Schema.__absinthe_function__/2, at: lib/dataloader_sample_web/schema.ex#13
[debug] QUERY OK source="comments" db=0.1ms idle=37.5ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at" FROM "comments" AS c0 WHERE (c0."post_id" = ?) [5]
↳ anonymous fn/3 in DataloaderSampleWeb.Schema.__absinthe_function__/2, at: lib/dataloader_sample_web/schema.ex#13

In this state, performance degrades as the number of posts and comments grows, because one extra query is issued for every post.
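To make the difference concrete, here is a small sketch in plain Elixir that counts "queries" for the two strategies. The module name, the in-memory rows, and the counting scheme are all hypothetical stand-ins for the database; nothing here touches Ecto or Absinthe.

```elixir
defmodule NPlusOneSketch do
  # Hypothetical in-memory stand-in for the comments table:
  # five posts with two comments each.
  defp comments do
    for post_id <- 1..5, n <- 1..2 do
      %{post_id: post_id, body: "comment #{n} on post #{post_id}"}
    end
  end

  # One lookup per post, like the resolver in schema.ex above.
  # Returns {comments_by_post_id, number_of_lookups}.
  def per_post(post_ids) do
    Enum.reduce(post_ids, {%{}, 0}, fn post_id, {acc, queries} ->
      rows = Enum.filter(comments(), &(&1.post_id == post_id))
      {Map.put(acc, post_id, rows), queries + 1}
    end)
  end

  # A single lookup for all posts (the WHERE post_id IN (...) idea),
  # grouped by post_id in memory afterwards.
  def batched(post_ids) do
    rows = Enum.filter(comments(), &(&1.post_id in post_ids))
    {Enum.group_by(rows, & &1.post_id), 1}
  end
end

post_ids = Enum.to_list(1..5)
{per_post, per_post_queries} = NPlusOneSketch.per_post(post_ids)
{batched, batched_queries} = NPlusOneSketch.batched(post_ids)

IO.inspect(per_post_queries, label: "comment lookups (one per post)")
IO.inspect(batched_queries, label: "comment lookups (batched)")
IO.inspect(per_post == batched, label: "same result")
```

Both strategies return the same data; only the number of round trips differs, and the per-post count grows with the number of posts.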

Introducing Dataloader

Now that we have deliberately caused N+1, let's add the dataloader library to fix it.

mix.exs

  defp deps do
    [
      # ... omitted
      {:dataloader, "~> 1.0.0"}
    ]
  end

$ mix deps.get

Once the dependency is added, modify the schema file. There are three main things to do.

Add a Dataloader struct to the context under the key :loader
Add Absinthe.Middleware.Dataloader to plugins
- The middleware uses context.loader
Rewrite the resolver where N+1 occurred with Absinthe.Resolution.Helpers.dataloader/1

schema.ex

defmodule DataloaderSampleWeb.Schema do
  use Absinthe.Schema

  # Import Helpers so that dataloader/1 can be called
  import Absinthe.Resolution.Helpers

  alias DataloaderSample.Blog
  alias DataloaderSampleWeb.BlogResolver

  object :post do
    field :id, non_null(:id)
    field :title, non_null(:string)
    field :body, non_null(:string)

    field :comments, non_null(list_of(:comment)) do
      # Call dataloader/1 with the same source name that is passed as the second argument to Dataloader.add_source/3
      resolve(dataloader(Blog))
    end
  end

  # ... omitted

  # Add context/1 (marked defoverridable by the Absinthe.Schema macro)
  def context(ctx) do
    loader =
      Dataloader.new()
      |> Dataloader.add_source(Blog, Blog.data())

    Map.put(ctx, :loader, loader)
  end

  # Add plugins/0 (marked defoverridable by the Absinthe.Schema macro)
  def plugins() do
    [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
  end
end

blog.ex

defmodule DataloaderSample.Blog do
  @moduledoc """
  The Blog context.
  """

  import Ecto.Query, warn: false
  alias DataloaderSample.Repo

  alias DataloaderSample.Blog.Post

  # Implement data/0, whose result is passed to Dataloader.add_source/3
  def data() do
    Dataloader.Ecto.new(DataloaderSample.Repo, query: &query/2)
  end

  def query(queryable, _params) do
    queryable
  end

  # ... omitted
end

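Where the loader is created with Dataloader.new/0, the functions are split up as follows, following the conventions in the Absinthe hexdocs.

Specify the Blog context module as the data source
- In a Phoenix application, creating one source per context is recommended (reference)
- "In a Phoenix application you'll generally have one source per context, so that each context can control how its data is loaded."
Add data/0 and return the result of Dataloader.Ecto.new/2, which specifies the query that Dataloader uses

Splitting data sources by context makes it easy to extend the queries executed at load time, which is presumably why this split is recommended. For example, query/2 can be given clauses like the following.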

  def query(Post, %{has_admin_rights: true}), do: Post

  def query(Post, _), do: from p in Post, where: is_nil(p.deleted_at)

  def query(queryable, _), do: queryable
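Clause order matters in a definition like this, because Elixir tries function clauses from top to bottom. The sketch below uses plain atoms as hypothetical stand-ins for the Post schema and for the Ecto queries, so that it runs without Ecto; only the dispatch behaviour is being illustrated.

```elixir
defmodule QueryDispatchSketch do
  # :post stands in for the Post schema; the returned atoms stand in for Ecto queries.
  def query(:post, %{has_admin_rights: true}), do: :all_posts
  def query(:post, _params), do: :posts_not_deleted
  # Catch-all: every other queryable is returned unchanged.
  def query(queryable, _params), do: queryable
end

IO.inspect(QueryDispatchSketch.query(:post, %{has_admin_rights: true}))
IO.inspect(QueryDispatchSketch.query(:post, %{}))
IO.inspect(QueryDispatchSketch.query(:comment, %{}))
```

If the catch-all clause were listed first, it would match every call and the more specific clauses would never run.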

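In addition, writing resolve(dataloader(Blog)) is all that is needed: from the field name :comments, it finds the has_many :comments, Comment association defined on the Post schema and resolves through it. That is a thoughtful touch.

From https://hexdocs.pm/absinthe/dataloader.html#usage: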

In this example, the query returned by query/2 is used as a starting point by Dataloader to build the final query, which it does by traversing schema associations. In other words, Dataloader can determine that an author has many posts, and that to retrieve posts it needs to get those with the relevant author_id. If that's sufficient for your needs, query/2 need not modify the query it's given. But if you only want to load published posts, query/2 can narrow the query accordingly.

  schema "posts" do
    # ... omitted
    has_many :comments, Comment
  end
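For comparison, the Absinthe Dataloader guide also shows a longhand form built on on_load/2, which the dataloader helper saves you from writing. The following is a sketch of roughly what the :comments field would look like written that way. It is a fragment of schema.ex, not a standalone script: it relies on `import Absinthe.Resolution.Helpers` for on_load/2, on the `Blog` alias, and on the :loader key set in context/1.

```elixir
field :comments, non_null(list_of(:comment)) do
  resolve(fn post, _args, %{context: %{loader: loader}} ->
    loader
    # Queue the :comments association of this post; no query runs yet.
    |> Dataloader.load(Blog, :comments, post)
    # Once the queued loads have been run as a batch, read this post's comments back.
    |> on_load(fn loader ->
      {:ok, Dataloader.get(loader, Blog, :comments, post)}
    end)
  end)
end
```

Because every post queues its load before the batch runs, the comments for all posts can be fetched together.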

Let's check the actual behavior. Issuing the same query again, you can see that post_id values 1 through 5 are combined into a single SELECT. The N+1 problem is resolved!

[debug] QUERY OK source="posts" db=0.1ms queue=0.1ms idle=874.9ms
SELECT p0."id", p0."body", p0."title", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 []
↳ DataloaderSampleWeb.BlogResolver.list_posts/3, at: lib/dataloader_sample_web/resolvers/blog_resolver.ex#5
[debug] QUERY OK source="comments" db=0.3ms idle=923.1ms
SELECT c0."id", c0."body", c0."post_id", c0."inserted_at", c0."updated_at", c0."post_id" FROM "comments" AS c0 WHERE (c0."post_id" IN (?,?,?,?,?)) ORDER BY c0."post_id" [5, 4, 3, 2, 1]
↳ Dataloader.Source.Dataloader.Ecto.run_batch/2, at: lib/dataloader/ecto.ex#680

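Summary

This article presented a simple sample of introducing dataloader into an Absinthe-based GraphQL server to eliminate N+1. If you write resolvers without thinking, you tend to hit the N+1 wall quickly, so it is worth keeping the problem in mind while implementing to avoid degrading performance.

The following book is in English, but its chapter 9 covers dataloader. If you are interested, it is well worth a look.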