
Trying out MeiliSearch

Published on 2021/10/04

I only recently learned about MeiliSearch, so these are my notes from testing how it feels to use.

Environment

  • OS: Ubuntu 20.04
  • MeiliSearch: 0.22.0

What is MeiliSearch?

meilisearch/MeiliSearch: Powerful, fast, and an easy to use search engine

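As I understand it, this is a full-text search engine implemented in Rust.
I guess it's pronounced "meili-search" (メイリサーチ)?

Official client libraries are provided for a range of programming languages, which is impressive.

Trying to start it with Docker Compose

First things first, I want to get MeiliSearch up and running.
If I used it in a real product, I would want to run it under Docker Compose in the development environment, so I'll proceed with that in mind.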

Installation | MeiliSearch Documentation v0.22

Checking the official image

docker run -it --rm \
    -p 7700:7700 \
    -v $(pwd)/data.ms:/data.ms \
    getmeili/meilisearch

I see, so getmeili/meilisearch seems to be the official image.

getmeili/meilisearch - Docker Image | Docker Hub

No overview available

I see...
That doesn't tell me what options are available.
Guess I'll read the source, then.

https://github.com/meilisearch/MeiliSearch/blob/81993b6a153aa43a149b76a171a828033b7a5fb6/meilisearch-http/src/option.rs#L17-L129
This is the relevant part.
The basic options can apparently be supplied either through environment variables or as command-line options.
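
As a rough illustration of the naming pattern in those option definitions: each environment variable looks like the flag name upper-cased, with dashes turned into underscores and a MEILI_ prefix. The helper name meili_env_name below is made up for this note, and not every flag necessarily has an environment counterpart.

```ruby
# Hypothetical helper: derive the env var name that appears to pair with a
# MeiliSearch CLI flag (e.g. --db-path -> MEILI_DB_PATH).
# Assumption: the pattern only applies to flags that document an [env: ...] entry.
def meili_env_name(flag)
  "MEILI_" + flag.sub(/\A--/, "").tr("-", "_").upcase
end

puts meili_env_name("--db-path")     # MEILI_DB_PATH
puts meili_env_name("--master-key")  # MEILI_MASTER_KEY
```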

docker run --rm getmeili/meilisearch ./meilisearch -h
meilisearch-http 0.22.0

USAGE:
    meilisearch [FLAGS] [OPTIONS]

FLAGS:
    -h, --help                            Prints help information
        --ignore-missing-snapshot         The engine will ignore a missing snapshot and not return an error in such case
        --ignore-snapshot-if-db-exists    The engine will skip snapshot importation and not return an error in such case
    -V, --version                         Prints version information

OPTIONS:
        --db-path <db-path>
            The destination where the database must be created [env: MEILI_DB_PATH=]  [default: ./data.ms]

        --dumps-dir <dumps-dir>
            Folder where dumps are created when the dump route is called [env: MEILI_DUMPS_DIR=]  [default: dumps/]

        --env <env>
            This environment variable must be set to `production` if you are running in production. If the server is
            running in development mode more logs will be displayed, and the master key can be avoided which implies
            that there is no security on the updates routes. This is useful to debug when integrating the engine with
            another service [env: MEILI_ENV=]  [default: development]  [possible values: development, production]
        --http-addr <http-addr>
            The address on which the http server will listen [env: MEILI_HTTP_ADDR=0.0.0.0:7700]  [default:
            127.0.0.1:7700]
        --http-payload-size-limit <http-payload-size-limit>
            The maximum size, in bytes, of accepted JSON payloads [env: MEILI_HTTP_PAYLOAD_SIZE_LIMIT=]  [default: 100
            MB]
        --import-dump <import-dump>
            Import a dump from the specified path, must be a `.dump` file

        --import-snapshot <import-snapshot>
            Defines the path of the snapshot file to import. This option will, by default, stop the process if a
            database already exist or if no snapshot exists at the given path. If this option is not specified no
            snapshot is imported
        --log-level <log-level>
            Set the log level [env: MEILI_LOG_LEVEL=]  [default: info]

        --master-key <master-key>
            The master key allowing you to do everything on the server [env: MEILI_MASTER_KEY=]

        --max-index-size <max-index-size>
            The maximum size, in bytes, of the main lmdb database directory [env: MEILI_MAX_INDEX_SIZE=]  [default: 100
            GiB]
        --max-udb-size <max-udb-size>
            The maximum size, in bytes, of the update lmdb database directory [env: MEILI_MAX_UDB_SIZE=]  [default: 100
            GiB]
        --no-analytics <no-analytics>                          Do not send analytics to Meili [env: MEILI_NO_ANALYTICS=]
        --schedule-snapshot <schedule-snapshot>
            Activate snapshot scheduling [env: MEILI_SCHEDULE_SNAPSHOT=]

        --snapshot-dir <snapshot-dir>
            Defines the directory path where meilisearch will create snapshot each snapshot_time_gap [env:
            MEILI_SNAPSHOT_DIR=]  [default: snapshots/]
        --snapshot-interval-sec <snapshot-interval-sec>
            Defines time interval, in seconds, between each snapshot creation [env: MEILI_SNAPSHOT_INTERVAL_SEC=]
            [default: 86400]
        --ssl-auth-path <ssl-auth-path>
            Enable client authentication, and accept certificates signed by those roots provided in CERTFILE [env:
            MEILI_SSL_AUTH_PATH=]
        --ssl-cert-path <ssl-cert-path>
            Read server certificates from CERTFILE. This should contain PEM-format certificates in the right order (the
            first certificate should certify KEYFILE, the last should be a root CA) [env: MEILI_SSL_CERT_PATH=]
        --ssl-key-path <ssl-key-path>
            Read private key from KEYFILE.  This should be a RSA private key or PKCS8-encoded private key, in PEM format
            [env: MEILI_SSL_KEY_PATH=]
        --ssl-ocsp-path <ssl-ocsp-path>
            Read DER-encoded OCSP response from OCSPFILE and staple to certificate. Optional [env: MEILI_SSL_OCSP_PATH=]

        --ssl-require-auth <ssl-require-auth>
            Send a fatal alert if the client does not complete client authentication [env: MEILI_SSL_REQUIRE_AUTH=]

        --ssl-resumption <ssl-resumption>
            SSL support session resumption [env: MEILI_SSL_RESUMPTION=]

        --ssl-tickets <ssl-tickets>                            SSL support tickets [env: MEILI_SSL_TICKETS=]
  • MEILI_MASTER_KEY
    • Probably something like an API key?
  • MEILI_DB_PATH
    • Looks like the place where data is persisted
    • Mounting a volume here should make the data persistent
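
If MEILI_MASTER_KEY is set, requests presumably have to carry that key. The following is only a sketch of what such a request might look like from Ruby. It assumes this version reads the key from the X-Meili-API-Key header (worth checking against the docs for the version in use), and both the helper name keyed_request and the key value "xxx" are placeholders.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a request carrying the master key.
# Assumptions: the server was started with MEILI_MASTER_KEY set, and this
# version reads the key from the X-Meili-API-Key header. "xxx" is a placeholder.
def keyed_request(url, master_key)
  request = Net::HTTP::Get.new(URI(url))
  request["X-Meili-API-Key"] = master_key
  request
end

req = keyed_request("http://localhost:7700/indexes", "xxx")
puts req.path
```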

Starting it with Docker Compose

docker-compose.yml
version: "3.7"
services:
  meilisearch:
    container_name: meilisearch
    image: getmeili/meilisearch:v0.22.0
    volumes:
      - meili-data:/data.ms
    environment: []
    ports:
      - 7700:7700
volumes:
  meili-data:
    driver: local
docker-compose up
WARNING: Native build is an experimental feature and could change at any time
Creating meilisearch ... done
Attaching to meilisearch
meilisearch    |
meilisearch    | 888b     d888          d8b 888 d8b  .d8888b.                                    888
meilisearch    | 8888b   d8888          Y8P 888 Y8P d88P  Y88b                                   888
meilisearch    | 88888b.d88888              888     Y88b.                                        888
meilisearch    | 888Y88888P888  .d88b.  888 888 888  "Y888b.    .d88b.   8888b.  888d888 .d8888b 88888b.
meilisearch    | 888 Y888P 888 d8P  Y8b 888 888 888     "Y88b. d8P  Y8b     "88b 888P"  d88P"    888 "88b
meilisearch    | 888  Y8P  888 88888888 888 888 888       "888 88888888 .d888888 888    888      888  888
meilisearch    | 888   "   888 Y8b.     888 888 888 Y88b  d88P Y8b.     888  888 888    Y88b.    888  888
meilisearch    | 888       888  "Y8888  888 888 888  "Y8888P"   "Y8888  "Y888888 888     "Y8888P 888  888
meilisearch    |
meilisearch    | Database path:         "./data.ms"
meilisearch    | Server listening on:   "http://0.0.0.0:7700"
meilisearch    | Environment:           "development"
meilisearch    | Commit SHA:            "928930ddd552596f51280a17c143cf38079cdea9"
meilisearch    | Commit date:           "2021-09-13T10:17:10+00:00"
meilisearch    | Package version:       "0.22.0"
meilisearch    |
meilisearch    | Thank you for using MeiliSearch!
meilisearch    |
meilisearch    | We collect anonymized analytics to improve our product and your experience. To learn more, including how to turn off analytics, visit our dedicated documentation page: https://docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html
meilisearch    |
meilisearch    | Anonymous telemetry:   "Enabled"
meilisearch    |
meilisearch    | No master key found; The server will accept unidentified requests. If you need some protection in development mode, please export a key: export MEILI_MASTER_KEY=xxx
meilisearch    |
meilisearch    | Documentation:         https://docs.meilisearch.com
meilisearch    | Source code:           https://github.com/meilisearch/meilisearch
meilisearch    | Contact:               https://docs.meilisearch.com/resources/contact.html or bonjour@meilisearch.com
meilisearch    |
meilisearch    | [2021-10-02T13:41:14Z INFO  actix_server::builder] Starting 24 workers
meilisearch    | [2021-10-02T13:41:14Z INFO  actix_server::builder] Starting "actix-web-service-0.0.0.0:7700" service on 0.0.0.0:7700
curl -v http://localhost:7700
*   Trying 127.0.0.1:7700...
* Connected to localhost (127.0.0.1) port 7700 (#0)
> GET / HTTP/1.1
> Host: localhost:7700
> User-Agent: curl/7.73.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 2191
< content-type: text/html
< date: Sat, 02 Oct 2021 13:43:52 GMT
<
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="/favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="Dashboard to test MeiliSearch's search engine"/><link rel="apple-touch-icon" href="/logo.png"/><link rel="manifest" href="/manifest.json"/><title>Mini-dashboard | MeiliSearch</title></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div><script>!function(e){function r(r){for(var n,i,a=r[0],l=r[1],f=r[2],c=0,s=[];c<a.length;c++)i=a[c],Object.prototype.hasOwnProperty.call(o,i)&&o[i]&&s.push(o[i][0]),o[i]=0;for(n in l)Object.prototype.hasOwnProperty.call(l,n)&&(e[n]=l[n]);for(p&&p(r);s.length;)s.shift()();return u.push.apply(u,f||[]),t()}function t(){for(var e,r=0;r<u.length;r++){for(var t=u[r],n=!0,a=1;a<t.length;a++){var l=t[a];0!==o[l]&&(n=!1)}n&&(u.splice(r--,1),e=i(i.s=t[0]))}return e}var n={},o={1:0},u=[];function i(r){if(n[r])return n[r].exports;var t=n[r]={i:r,l:!1,exports:{}};return e[r].call(t.exports,t,t.exports,i),t.l=!0,t.exports}i.m=e,i.c=n,i.d=function(e,r,t){i.o(e,r)||Object.defineProperty(e,r,{enumerable:!0,get:t})},i.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},i.t=function(e,r){if(1&r&&(e=i(e)),8&r)return e;if(4&r&&"object"==typeof e&&e&&e.__esModule)return e;var t=Object.create(null);if(i.r(t),Object.defineProperty(t,"default",{enumerable:!0,value:e}),2&r&&"string"!=typeof e)for(var n in e)i.d(t,n,function(r){return e[r]}.bind(null,n));return t},i.n=function(e){var r=e&&e.__esModule?function(){return e.default}:function(){return e};return i.d(r,"a",r),r},i.o=function(e,r){return Object.prototype.hasOwnProperty.call(e,r)},i.p="/";var 
a=this["webpackJsonpmini-dashboard"]=this["webpackJsonpmini-dashboard"]||[],l=a.push.bind(a);a.push=r,a=a.slice();for(var f=0;f<a.length;f++)r(a[f]);var p=l* Connection #0 to host localhost left intact
;t()}([])</script><script src="/static/js/2.b22bb78b.chunk.js"></script><script src="/static/js/main.81816167.chunk.js"></script></body></html>%

Some HTML came back.

Anyway, it started up.
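
Rather than eyeballing the HTML, a script could wait for the server to report itself as ready. This is a minimal sketch assuming a /health endpoint that answers with {"status":"available"}; the function names and the idea of passing the HTTP call in as a lambda are my own choices for illustration.

```ruby
require "json"

# Returns true when a /health response body reports the server as available.
# Assumption: the body looks like {"status":"available"}.
def available?(body)
  JSON.parse(body)["status"] == "available"
rescue JSON::ParserError
  false
end

# Poll until the fetcher reports availability or the attempts run out.
# `fetch` is any callable returning the response body as a String; in real use
# it could be -> { Net::HTTP.get(URI("http://localhost:7700/health")) }.
def wait_until_available(fetch, attempts: 10, interval: 0.5)
  attempts.times do
    return true if available?(fetch.call)
    sleep interval
  end
  false
end
```

Connection errors while the container is still booting are not handled here; a real script would probably rescue them inside the lambda.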

Adding data and trying a search

I'll add data from the Ruby client and try searching it.

Adding documents

Gemfile
# frozen_string_literal: true

source "https://rubygems.org"

git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }

gem "meilisearch", "~> 0.16.1"
add_documents.rb
require "meilisearch"

MEILI_URL = "http://localhost:7700"

documents = [
  {
    id: 1,
    title: "Superman",
  },
  {
    id: 2,
    title: "Bat Man",
  },
  {
    id: 3,
    title: "日本の映画",
  },
  {
    id: 4,
    title: "日本映画",
  },
]

res = MeiliSearch::Client.new(MEILI_URL)
  .index("movies")
  .add_documents(documents)

pp res
❯ bundle exec ruby add_documents.rb
{"updateId"=>0}

❯ bundle exec ruby add_documents.rb
{"updateId"=>1}

❯ bundle exec ruby add_documents.rb
{"updateId"=>2}

It worked.
Each call returns an updateId, and it is incremented every time.
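
Since the call only returns an updateId, the documents are presumably not searchable until that update has been processed. Below is a sketch of polling for completion. It assumes the update status takes the values "enqueued", "processing", "processed" and "failed", and wait_for_update is a name I made up.

```ruby
# Poll an update until it leaves the queue.
# `fetch_status` is any callable taking an update id and returning a Hash with
# a "status" key. Assumption: statuses are "enqueued", "processing",
# "processed" and "failed".
def wait_for_update(fetch_status, update_id, attempts: 100, interval: 0.05)
  attempts.times do
    status = fetch_status.call(update_id)["status"]
    return status if %w[processed failed].include?(status)
    sleep interval
  end
  raise "update #{update_id} still pending after #{attempts} attempts"
end
```

With the Ruby client, something like ->(id) { index.get_update_status(id) } might serve as the fetcher, and I believe the gem also ships a wait_for_pending_update helper, but I have not verified either against 0.16.1.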

Search

require "meilisearch"

MEILI_URL = "http://localhost:7700"

index = MeiliSearch::Client.new(MEILI_URL)
  .index("movies")

puts "-" * 100
pp index.search("man")

puts "-" * 100
pp index.search("映画")
❯ bundle exec ruby search.rb
----------------------------------------------------------------------------------------------------
{"hits"=>[{"id"=>2, "title"=>"Bat Man"}],
 "nbHits"=>1,
 "exhaustiveNbHits"=>false,
 "query"=>"man",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
{"hits"=>[{"id"=>4, "title"=>"日本映画"}, {"id"=>3, "title"=>"日本の映画"}],
 "nbHits"=>2,
 "exhaustiveNbHits"=>false,
 "query"=>"映画",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}

  • Searching for man does not hit Superman
    • Changing the title to Super man makes man hit
    • Super does hit Superman
  • Searching for 映画 hits both 日本映画 and 日本の映画

I see.
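
The English results above are consistent with prefix matching on words. Here is a simplified mental model of that, written as plain Ruby. It is not MeiliSearch's actual algorithm: typo tolerance is left out, and the Japanese case (映画 hitting 日本映画) is not covered because it depends on how the engine tokenizes Japanese.

```ruby
# Simplified mental model of the behaviour observed above, NOT MeiliSearch's
# actual algorithm: a query hits when some whitespace-separated word in the
# text starts with it, ignoring case. Typo tolerance and Japanese
# tokenization are deliberately left out.
def prefix_hit?(query, text)
  text.downcase.split.any? { |word| word.start_with?(query.downcase) }
end

puts prefix_hit?("man", "Superman")    # false
puts prefix_hit?("man", "Super man")   # true
puts prefix_hit?("Super", "Superman")  # true
```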

Adding more fields to the data

In a real product there would be multiple fields, so I'll add a few.
Since these are movies, maybe the cast? The director?

require "meilisearch"

MEILI_URL = "http://localhost:7700"

documents = [
  {
    id: 1,
    title: "Super man",
    actors: [
      "山田太郎",   # no separator
      "山田 花子", # separated by a full-width space
    ],
    director: "田中太郎",
  },
  {
    id: 2,
    title: "Bat Man",
    actors: [
      "山田 太郎",  # separated by a half-width space
    ],
    director: "鈴木太郎",
  },
  {
    id: 3,
    title: "日本の映画",
    actors: [
      "Johnny Depp",
      "Keanu Charles Reeves",
    ],
    director: "鈴木次郎",
  },
  {
    id: 4,
    title: "日本映画",
    actors: [
      "Brad Pitt",
      "Thomas Cruise Mapother IV"
    ],
    director: "鈴木三郎",
  },
]

res = MeiliSearch::Client.new(MEILI_URL)
  .index("movies")
  .add_documents(documents)

pp res

By making actors an array, I checked whether array values can be stored.

❯ bundle exec ruby add_documents.rb
{"updateId"=>4}

Looks fine.

Let's search.

require "meilisearch"

MEILI_URL = "http://localhost:7700"

@index = MeiliSearch::Client.new(MEILI_URL)
  .index("movies")

def search(query)
  puts "-" * 100
  puts "QUERY: #{query}"
  pp @index.search(query)
end

search("山田")
search("太郎")
search("dep")
❯ bundle exec ruby search.rb
----------------------------------------------------------------------------------------------------
QUERY: 山田
{"hits"=>
  [{"id"=>1,
    "title"=>"Super man",
    "actors"=>["山田太郎", "山田 花子"],
    "director"=>"田中太郎"},
   {"id"=>2, "title"=>"Bat Man", "actors"=>["山田 太郎"], "director"=>"鈴木太郎"}],
 "nbHits"=>2,
 "exhaustiveNbHits"=>false,
 "query"=>"山田",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: 太郎
{"hits"=>
  [{"id"=>1,
    "title"=>"Super man",
    "actors"=>["山田太郎", "山田 花子"],
    "director"=>"田中太郎"},
   {"id"=>2, "title"=>"Bat Man", "actors"=>["山田 太郎"], "director"=>"鈴木太郎"},
   {"id"=>3,
    "title"=>"日本の映画",
    "actors"=>["Johnny Depp", "Keanu Charles Reeves"],
    "director"=>"鈴木次郎"},
   {"id"=>4,
    "title"=>"日本映画",
    "actors"=>["Brad Pitt", "Thomas Cruise Mapother IV"],
    "director"=>"鈴木三郎"}],
 "nbHits"=>4,
 "exhaustiveNbHits"=>false,
 "query"=>"太郎",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: dep
{"hits"=>
  [{"id"=>3,
    "title"=>"日本の映画",
    "actors"=>["Johnny Depp", "Keanu Charles Reeves"],
    "director"=>"鈴木次郎"}],
 "nbHits"=>1,
 "exhaustiveNbHits"=>false,
 "query"=>"dep",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}

Highlighting the matched parts

Attributes to highlight | Search parameters | MeiliSearch Documentation v0.22

...
  pp @index.search(
    query,
    {
      attributesToHighlight: ['*']
    },
  )
...

Apparently this is how you highlight matches.

----------------------------------------------------------------------------------------------------
QUERY: 山田
{"hits"=>
  [{"id"=>1,
    "title"=>"Super man",
    "actors"=>["山田太郎", "山田 花子"],
    "director"=>"田中太郎",
    "_formatted"=>
     {"id"=>1,
      "title"=>"Super man",
      "actors"=>["<em>山田</em>太郎", "<em>山田</em> 花子"],
      "director"=>"<em>田中</em>太郎"}},
   {"id"=>2,
    "title"=>"Bat Man",
    "actors"=>["山田 太郎"],
    "director"=>"鈴木太郎",
    "_formatted"=>
     {"id"=>2,
      "title"=>"Bat Man",
      "actors"=>["<em>山田</em> 太郎"],
      "director"=>"鈴木太郎"}}],
 "nbHits"=>2,
 "exhaustiveNbHits"=>false,
 "query"=>"山田",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: 太郎
{"hits"=>
  [{"id"=>1,
    "title"=>"Super man",
    "actors"=>["山田太郎", "山田 花子"],
    "director"=>"田中太郎",
    "_formatted"=>
     {"id"=>1,
      "title"=>"Super man",
      "actors"=>["山田<em>太郎</em>", "山田 花子"],
      "director"=>"田中<em>太郎</em>"}},
   {"id"=>2,
    "title"=>"Bat Man",
    "actors"=>["山田 太郎"],
    "director"=>"鈴木太郎",
    "_formatted"=>
     {"id"=>2,
      "title"=>"Bat Man",
      "actors"=>["山田 <em>太郎</em>"],
      "director"=>"鈴木<em>太郎</em>"}},
   {"id"=>3,
    "title"=>"日本の映画",
    "actors"=>["Johnny Depp", "Keanu Charles Reeves"],
    "director"=>"鈴木次郎",
    "_formatted"=>
     {"id"=>3,
      "title"=>"日本の映画",
      "actors"=>["Johnny Depp", "Keanu Charles Reeves"],
      "director"=>"鈴木<em>次郎</em>"}},
   {"id"=>4,
    "title"=>"日本映画",
    "actors"=>["Brad Pitt", "Thomas Cruise Mapother IV"],
    "director"=>"鈴木三郎",
    "_formatted"=>
     {"id"=>4,
      "title"=>"日本映画",
      "actors"=>["Brad Pitt", "Thomas Cruise Mapother IV"],
      "director"=>"鈴木<em>三郎</em>"}}],
 "nbHits"=>4,
 "exhaustiveNbHits"=>false,
 "query"=>"太郎",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: dep
{"hits"=>
  [{"id"=>3,
    "title"=>"日本の映画",
    "actors"=>["Johnny Depp", "Keanu Charles Reeves"],
    "director"=>"鈴木次郎",
    "_formatted"=>
     {"id"=>3,
      "title"=>"日本の映画",
      "actors"=>["Johnny <em>Dep</em>p", "Keanu Charles Reeves"],
      "director"=>"鈴木次郎"}}],
 "nbHits"=>1,
 "exhaustiveNbHits"=>false,
 "query"=>"dep",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
  • Matches come back wrapped in <em>...</em> inside a field called _formatted
  • As suspected, 太郎 seems to match 三郎 (and 次郎) as well
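
To see which fields actually matched, the _formatted hash can be scanned for the <em> marker. This is a small sketch that works on one hit shaped like the output above; highlighted_fields is a name chosen for this example.

```ruby
# Return the names of the fields whose _formatted value contains a highlight.
# Works on a single hit Hash shaped like the search output above.
def highlighted_fields(hit)
  hit.fetch("_formatted", {}).select do |_field, value|
    Array(value).any? { |v| v.to_s.include?("<em>") }
  end.keys
end

hit = {
  "id" => 2,
  "title" => "Bat Man",
  "actors" => ["山田 太郎"],
  "director" => "鈴木太郎",
  "_formatted" => {
    "id" => 2,
    "title" => "Bat Man",
    "actors" => ["<em>山田</em> 太郎"],
    "director" => "鈴木太郎",
  },
}
p highlighted_fields(hit)  # ["actors"]
```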

Narrowing results by a specific field

There seems to be an option called filter.

https://docs.meilisearch.com/reference/api/filterable_attributes.html#update-filterable-attributes

client.index('movies').update_filterable_attributes([
  'genres',
  'director'
])

It seems the fields usable in filter have to be configured on the index beforehand.

Using the filter option without doing this results in an error.

  pp @index.search(
    query,
    {
      attributesToHighlight: ['*'],
      filter: ["title = '映画'"],
    },
  )
❯ bundle exec ruby search.rb
----------------------------------------------------------------------------------------------------
QUERY: 山田
/home/xxxx/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/meilisearch-0.16.1/lib/meilisearch/http_request.rb:79:in `validate': 400 Bad Request -  --> 1:1 (MeiliSearch::ApiError)
  |
1 | title = '映画'
  | ^---^
  |
  = attribute `title` is not filterable, available filterable attributes are: . See https://docs.meilisearch.com/errors#invalid_filter.

Updating the index settings

pp index.update_filterable_attributes([
  "title",
  "director",
  "actors",
])
----------------------------------------------------------------------------------------------------
QUERY: 山田
{"hits"=>[],
 "nbHits"=>0,
 "exhaustiveNbHits"=>false,
 "query"=>"山田",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: 太郎
{"hits"=>[],
 "nbHits"=>0,
 "exhaustiveNbHits"=>false,
 "query"=>"太郎",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
----------------------------------------------------------------------------------------------------
QUERY: dep
{"hits"=>[],
 "nbHits"=>0,
 "exhaustiveNbHits"=>false,
 "query"=>"dep",
 "limit"=>20,
 "offset"=>0,
 "processingTimeMs"=>0}
  • filter seems to require an exact match
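
To make the difference from search concrete, here is a local model of an equality filter in plain Ruby. It does not call MeiliSearch, and it ignores any case or character normalization the server may apply. It only illustrates why title = '映画' returned nothing, while the full title would be expected to match.

```ruby
# Local model of the observation above, not a call to MeiliSearch:
# an equality filter keeps a document only when the whole field value is
# equal, whereas a search query also hits on part of the text.
# Assumption: any normalization done by the server is ignored here.
titles = { 3 => "日本の映画", 4 => "日本映画" }

def filter_eq(titles, value)
  titles.select { |_id, title| title == value }.keys
end

p filter_eq(titles, "映画")      # []
p filter_eq(titles, "日本映画")  # [4]
```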

Fields targeted by search

https://docs.meilisearch.com/reference/api/searchable_attributes.html#update-searchable-attributes

SearchableAttributes is apparently configured here.
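
I did not run this, but as a sketch, the settings request could be built like this. It assumes this version accepts a POST with a JSON array on the searchable-attributes settings route, and that the array order expresses field priority. The helper name is hypothetical, and the request is only constructed, not sent.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a settings request that limits search to some fields.
# Assumptions: this version accepts POST on
# /indexes/:uid/settings/searchable-attributes with a JSON array body, and the
# order of the array expresses field priority.
def searchable_attributes_request(base_url, index_uid, fields)
  uri = URI("#{base_url}/indexes/#{index_uid}/settings/searchable-attributes")
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(fields)
  request
end

req = searchable_attributes_request("http://localhost:7700", "movies", %w[title actors])
puts req.path
puts req.body
```

With the Ruby client, index.update_searchable_attributes(%w[title actors]) looks like the counterpart, by analogy with update_filterable_attributes above.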

Other options

Search parameters | MeiliSearch Documentation v0.22

There are all sorts of them.

Loading a large amount of data and benchmarking

hatoo/oha: Ohayou(おはよう), HTTP load generator, inspired by rakyll/hey with tui animation.

This time I'll run the load test with oha.

Conditions

  • query: 山田太郎
  • highlight: *
  • limit: 100

Data

require "meilisearch"
require "faker"

Faker::Config.locale = "ja"
MEILI_URL = "http://localhost:7700"

def documents(range)
  puts "-" * 100
  puts range
  puts Time.now
  range.map{|i|
    {
      id: i,
      title: Faker::Movie.title,
      actors: 10.times.map { Faker::Name.name },
      director: Faker::Name.name,
    }
  }.tap{
    puts Time.now
  }
end

index = MeiliSearch::Client.new(MEILI_URL)
  .index("movies")
pp index.add_documents(documents(1..10000))

faker-ruby/faker: A library for generating fake data such as names, addresses, and phone numbers.
I used faker to generate the test data.
faker's values aren't spread out all that much, so it's a slightly weak test setup, but good enough for now.

With 10,000 documents

2021-10-02T14:49:55.492304491Z [2021-10-02T14:49:55Z INFO  actix_web::middleware::logger] 172.18.0.1:44902 "POST /indexes/movies/documents HTTP/1.1" 202 14 "-" "Ruby" 0.034542
2021-10-02T14:50:00.205643481Z [2021-10-02T14:50:00Z INFO  meilisearch_http::index::updates] document addition done: DocumentAdditionResult { nb_documents: 10000 }

That's fast.

❯ oha "http://localhost:7700/indexes/movies/search" -d '{"q": "山田太郎", "attributesToHighlight": ["*"], "limit": 100}' -n 10000 -c 50
Summary:
  Success rate: 1.0000
  Total:        1.5734 secs
  Slowest:      0.0374 secs
  Fastest:      0.0045 secs
  Average:      0.0078 secs
  Requests/sec: 6355.7541

  Total data:   19.34 MiB
  Size/request: 1.98 KiB
  Size/sec:     12.29 MiB

Response time histogram:
  0.002 [811]  |■■■■
  0.003 [6032] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.005 [2235] |■■■■■■■■■■■
  0.007 [522]  |■■
  0.009 [212]  |
  0.010 [86]   |
  0.012 [37]   |
  0.014 [12]   |
  0.016 [4]    |
  0.017 [9]    |
  0.019 [40]   |

Latency distribution:
  10% in 0.0064 secs
  25% in 0.0068 secs
  50% in 0.0074 secs
  75% in 0.0083 secs
  90% in 0.0096 secs
  95% in 0.0110 secs
  99% in 0.0150 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0002 secs, 0.0001 secs, 0.0006 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 10000 responses

Average: about 8 ms

Mostly around 3 to 5 ms

With 100,000 documents

❯ oha "http://localhost:7700/indexes/movies/search" -d '{"q": "山田太郎", "attributesToHighlight": ["*"], "limit": 100}' -n 10000 -c 50
Summary:
  Success rate: 1.0000
  Total:        1.5455 secs
  Slowest:      0.0239 secs
  Fastest:      0.0029 secs
  Average:      0.0077 secs
  Requests/sec: 6470.3040

  Total data:   19.04 MiB
  Size/request: 1.95 KiB
  Size/sec:     12.32 MiB

Response time histogram:
  0.002 [11]   |
  0.004 [1591] |■■■■■■■■
  0.006 [6331] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.007 [1605] |■■■■■■■■
  0.009 [327]  |
  0.011 [78]   |
  0.013 [23]   |
  0.015 [21]   |
  0.017 [7]    |
  0.018 [5]    |
  0.020 [1]    |

Latency distribution:
  10% in 0.0064 secs
  25% in 0.0068 secs
  50% in 0.0074 secs
  75% in 0.0082 secs
  90% in 0.0093 secs
  95% in 0.0101 secs
  99% in 0.0126 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0004 secs, 0.0001 secs, 0.0014 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0002 secs

Status code distribution:
  [200] 10000 responses

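Average: about 8 ms

Mostly around 4 to 7 ms

Not much different from the 10,000-document case

With 1,000,000 documents

Documents seem to be indexed asynchronously after the API call,
so when 1,000,000 documents are sent it takes a while until the whole index is built.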

meilisearch    | [2021-10-02T15:29:08Z INFO  meilisearch_http::index::updates] document addition done: DocumentAdditionResult { nb_documents: 1000 }
...
meilisearch    | [2021-10-02T16:54:50Z INFO  meilisearch_http::index::updates] document addition done: DocumentAdditionResult { nb_documents: 1000 }

It seems to have taken about 1h30m.
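
For reference, sending the documents in fixed-size batches could look roughly like the sketch below. The batch size of 1000 is a guess based on the nb_documents: 1000 entries in the log above. add_in_batches is a made-up helper, and the documents are reduced to an id so that the example runs without a server.

```ruby
# Send documents in fixed-size batches instead of one huge request.
# `add` is any callable that takes an Array of documents; with the Ruby client
# it could be ->(docs) { index.add_documents(docs) }.
# Assumption: a batch size of 1000, guessed from nb_documents in the log lines.
def add_in_batches(ids, batch_size: 1000, add:)
  ids.each_slice(batch_size).map do |slice|
    add.call(slice.map { |i| { id: i } })  # real code would build full documents
  end
end

sizes = add_in_batches(1..2500, add: ->(docs) { docs.size })
p sizes  # [1000, 1000, 500]
```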

❯ oha "http://localhost:7700/indexes/movies/search" -d '{"q": "山田太郎", "attributesToHighlight": ["*"], "limit": 100}' -n 10000 -c 50
Summary:
  Success rate: 1.0000
  Total:        1.7175 secs
  Slowest:      0.0256 secs
  Fastest:      0.0041 secs
  Average:      0.0086 secs
  Requests/sec: 5822.3040

  Total data:   18.92 MiB
  Size/request: 1.94 KiB
  Size/sec:     11.02 MiB

Response time histogram:
  0.002 [124]  |
  0.004 [3815] |■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.006 [4619] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.008 [1162] |■■■■■■■■
  0.010 [194]  |
  0.012 [49]   |
  0.014 [13]   |
  0.016 [11]   |
  0.018 [4]    |
  0.020 [3]    |
  0.022 [6]    |

Latency distribution:
  10% in 0.0069 secs
  25% in 0.0075 secs
  50% in 0.0083 secs
  75% in 0.0093 secs
  90% in 0.0104 secs
  95% in 0.0112 secs
  99% in 0.0136 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0011 secs, 0.0002 secs, 0.0021 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0004 secs

Status code distribution:
  [200] 10000 responses

Average: about 9 ms

Mostly around 4 to 8 ms

Performance doesn't change all that much.
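
Putting the three oha averages side by side makes this concrete: going from 10,000 to 1,000,000 documents (100 times the data) raised the average latency from 7.8 ms to 8.6 ms, roughly 10%. The snippet below just redoes that arithmetic with the numbers from the summaries above.

```ruby
# Average latency reported by oha for each index size (seconds).
averages = { 10_000 => 0.0078, 100_000 => 0.0077, 1_000_000 => 0.0086 }

base = averages[10_000]
averages.each do |docs, avg|
  ratio = (avg / base).round(2)
  puts "#{docs} docs: #{(avg * 1000).round(1)} ms (x#{ratio} vs 10,000 docs)"
end
```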

Memory usage after loading 1,000,000 documents

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
7f9a069749ee        meilisearch         0.37%               563.9MiB / 62.74GiB   0.88%               321MB / 333MB       578kB / 213GB       39

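About 560 MiB

Summary

  • Seems simpler and easier to use than Elasticsearch
  • Seems likely to stay responsive even with a fair amount of data

TODO

This is getting long, so the rest will come next time...

  • Use it from Rails
  • Use it from Rust
  • Operate it on k8s (redundancy, etc.)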
