
Running Vald Agent in a Docker Environment

Published on 2022/09/14

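For scalability and stability, Vald recommends building a Vald cluster on Kubernetes, but depending on the use case, a single Agent can also run on Docker.
A Docker deployment does not use Kubernetes, so some features are unavailable, but it lets you try Vald with only a Docker environment and no Kubernetes environment.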

Preparation

This tutorial requires the dependencies listed below. Prepare them as needed.

  • Docker: used to run the Agent.
  • Go: required to run the sample code. If you do not have a Go environment, install one first.
  • libhdf5: a library required to run the sample code.
    Installing HDF5
    ### yum
    yum install -y hdf5-devel
    
    ### apt
    apt-get install libhdf5-serial-dev
    
    ### homebrew
    brew install hdf5
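Before moving on, it can help to confirm that the command-line prerequisites are on your PATH. The following is a minimal sketch, not part of the Vald tooling: `missing_tools` is a hypothetical helper name, and it only checks for executables, so it cannot tell whether the libhdf5 library itself is installed.

```shell
# Print every command from the arguments that is not found on PATH.
# (missing_tools is a hypothetical helper, not a Vald or Docker command.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# docker is required; go is only needed to run the sample code.
missing="$(missing_tools docker go)"
if [ -n "$missing" ]; then
  echo "not found on PATH: $missing"
else
  echo "docker and go are available"
fi
```

If anything is reported as missing, install it before continuing with the steps below.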
    

Deploying Vald Agent

This tutorial uses NGT as the core algorithm.
At deploy time, we create a backup directory on the local machine and volume-mount it into the Docker container.

Let's get started with the deployment.

  1. Clone the Vald repository
    This tutorial uses the sample code in the Vald repository.

    git clone https://github.com/vdaas/vald.git && \
    cd vald
    
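  2. Prepare for deployment
    Before deploying, create a tutorial directory. The command assumes you are already inside the vald directory from Step 1.

    mkdir -p tutorial && cd tutorial
    

    In later steps, the configuration file and the backup directory are created in this directory.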

  3. Create the config file

    Create the configuration file (config.yaml) required to deploy Vald Agent.
    A sample file follows.

    cat << EOF > config.yaml
    ---
    version: v0.0.0
    time_zone: JST
    logging:
      logger: glg
      level: debug
      format: raw
    server_config:
      servers:
        - name: agent-grpc
          host: 0.0.0.0
          port: 8081
          mode: GRPC
          probe_wait_time: "3s"
          http:
            shutdown_duration: "5s"
            handler_timeout: ""
            idle_timeout: ""
            read_header_timeout: ""
            read_timeout: ""
            write_timeout: ""
      startup_strategy:
        - agent-grpc
      shutdown_strategy:
        - agent-grpc
      full_shutdown_duration: 600s
      tls:
        enabled: false
        # cert: /path/to/cert
        # key: /path/to/key
        # ca: /path/to/ca
    ngt:
      # path to index data
      index_path: "/etc/server/backup"
      # vector dimension
      dimension: 784
      # bulk insert chunk size
      bulk_insert_chunk_size: 10
      # distance_type, which should be "l1", "l2", "angle", "hamming", "cosine", "normalizedangle", "normalizedcosine" or "jaccard"
      distance_type: l2
      # object_type, which should be "float" or "uint8"
      object_type: float
      # creation edge size
      creation_edge_size: 20
      # search edge size
      search_edge_size: 10
      # The limit duration of automatic indexing
      # auto_index_duration_limit should be 30m-6h for production use.
      # The setting below is just an example for this tutorial.
      auto_index_duration_limit: 1m
      # Check duration of automatic indexing.
      # auto_index_check_duration should be 10m-1h for production use.
      # The setting below is just an example for this tutorial.
      auto_index_check_duration: 10s
      # The number of cache to trigger automatic indexing
      auto_index_length: 100
      # The limit duration of auto saving indexing
      # auto_save_index_duration should be 30m-60m for production use.
      # The setting below is just an example for this tutorial.
      auto_save_index_duration: 90s
      # The maximum limit duration for an initial delay
      # initial_delay_max_duration should be 3m-5m for production use.
      # The setting below is just an example for this tutorial.
      initial_delay_max_duration: 60s
      # The default create index batch pool size.
      # If it is too large relative to the machine's resources, the Docker container may crash.
      default_pool_size: 500
    EOF
    
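  4. Create the backup directory
    Create a directory for backing up the index files and other data stored in the Agent.
    Mounting this directory lets the Agent restore from the backup files even if the Vald Agent container terminates unexpectedly.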

    mkdir -p backup
    
  5. Deploy
    Deploy on Docker using the config.yaml created in Step 3 and the backup directory created in Step 4.

    docker run -v $(pwd):/etc/server -u "$(id -u $USER):$(id -g $USER)" -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -p 8081:8081 --rm -it vdaas/vald-agent-ngt
    

    When the deployment succeeds, logs like the following are output.

    Log at Agent startup
     2022-09-13 16:08:44     [DEBG]: (github.com/vdaas/vald/internal/runner/runner.go:99):   version info:           {"vald_version":"v1.5.6","server_name":"agent ngt","git_commit":"05d400292635d8ce201d5504c64799c9bc6cf03f","build_time":"2022/07/20_09:41:16+0000","go_version":"1.18.3","go_os":"linux","go_arch":"amd64","go_root":"go","ngt_version":"1.14.7","build_cpu_info_flags":["fpu","vme","de","pse","tsc","msr","pae","mce","cx8","apic","sep","mtrr","pge","mca","cmov","pat","pse36","clflush","mmx","fxsr","sse","sse2","ss","ht","syscall","nx","pdpe1gb","rdtscp","lm","constant_tsc","rep_good","nopl","xtopology","cpuid","pni","pclmulqdq","ssse3","fma","cx16","pcid","sse4_1","sse4_2","movbe","popcnt","aes","xsave","avx","f16c","rdrand","hypervisor","lahf_lm","abm","3dnowprefetch","invpcid_single","pti","fsgsbase","bmi1","hle","avx2","smep","bmi2","erms","invpcid","rtm","mpx","avx512f","avx512dq","rdseed","adx","smap","clflushopt","avx512cd","avx512bw","avx512vl","xsaveopt","xsavec","xsaves","md_clear"],"stack_trace":[{"url":"https://github.com/vdaas/vald/tree/05d400292635d8ce201d5504c64799c9bc6cf03f","function_name":"github.com/vdaas/vald/internal/runner.Do.func1","file":"github.com/vdaas/vald/internal/runner/runner.go","line":101},{"url":"https://github.com/vdaas/vald/tree/05d400292635d8ce201d5504c64799c9bc6cf03f","function_name":"github.com/vdaas/vald/internal/runner.Do","file":"github.com/vdaas/vald/internal/runner/runner.go","line":106},{"url":"https://github.com/vdaas/vald/tree/05d400292635d8ce201d5504c64799c9bc6cf03f","function_name":"main.main.func1","file":"github.com/vdaas/vald/cmd/agent/core/ngt/main.go","line":40},{"url":"https://github.com/vdaas/vald/tree/05d400292635d8ce201d5504c64799c9bc6cf03f","function_name":"github.com/vdaas/vald/internal/safety.recoverFunc.func1","file":"github.com/vdaas/vald/internal/safety/safety.go","line":63},{"url":"https://github.com/vdaas/vald/tree/05d400292635d8ce201d5504c64799c9bc6cf03f","function_name":"main.main","file":"github.com/vdaas/vald/cmd/agent/core/ngt/main.go","line":55}]}
     configuration:          {"version":"v0.0.0","time_zone":"JST","logging":{"logger":"glg","level":"debug","format":"raw"},"server_config":{"servers":[{"name":"agent-grpc","host":"0.0.0.0","port":8081,"mode":"GRPC","probe_wait_time":"3s","http":{"shutdown_duration":"5s","handler_timeout":"","idle_timeout":"","read_header_timeout":"","read_timeout":"","write_timeout":""},"socket_option":{}}],"health_check_servers":null,"metrics_servers":null,"startup_strategy":["agent-grpc"],"shutdown_strategy":["agent-grpc"],"full_shutdown_duration":"600s","tls":{"enabled":false,"cert":"","key":"","ca":"","insecure_skip_verify":false}},"observability":{"enabled":false,"collector":{"duration":"","metrics":{"enable_version_info":false,"version_info_labels":null,"enable_memory":false,"enable_goroutine":false,"enable_cgo":false}},"trace":{"enabled":false,"sampling_rate":0},"prometheus":{"enabled":false,"endpoint":"","namespace":""},"jaeger":{"enabled":false,"collector_endpoint":"","agent_endpoint":"","username":"","password":"","service_name":"","buffer_max_count":0},"stackdriver":{"project_id":"","client":{"api_key":"","audiences":null,"credentials_file":"","credentials_json":"","endpoint":"","quota_project":"","request_reason":"","scopes":null,"user_agent":"","telemetry_enabled":false,"authentication_enabled":false},"exporter":{"monitoring_enabled":false,"tracing_enabled":false,"location":"","bundle_delay_threshold":"","bundle_count_threshold":0,"trace_spans_buffer_max_bytes":0,"metric_prefix":"","skip_cmd":false,"timeout":"","reporting_interval":"","number_of_workers":0},"profiler":{"enabled":false,"service":"","service_version":"","debug_logging":false,"mutex_profiling":false,"cpu_profiling":false,"alloc_profiling":false,"heap_profiling":false,"goroutine_profiling":false,"alloc_force_gc":false,"api_addr":"","instance":"","zone":""}}},"ngt":{"index_path":"/etc/server/backup","dimension":784,"bulk_insert_chunk_size":10,"distance_type":"l2","object_type":"float","creation_edge_size":20,"search_edge_size":10,"auto_index_duration_limit":"1m","auto_index_check_duration":"10s","auto_save_index_duration":"90s","auto_index_length":100,"initial_delay_max_duration":"60s","default_pool_size":500,"vqueue":{},"kvsdb":{}}}
     2022-09-13 16:08:44     [INFO]: (go.uber.org/automaxprocs@v1.5.1/maxprocs/maxprocs.go:47):      maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
     2022-09-13 16:08:44     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:380):  failed to load vald index from %s        error: %v /etc/server/backup index file does not exists,   path: /etc/server/backup,       err: <nil>: index file not found
     2022-09-13 16:08:44     [WARN]: (github.com/vdaas/vald/internal/core/algorithm/ngt/ngt.go:302): failed to setup option :        github.com/vdaas/vald/internal/core/algorithm/ngt.WithDefaultRadius.func1: invalid option, name: defaultRadius, val: 0
     2022-09-13 16:08:44     [WARN]: (github.com/vdaas/vald/internal/core/algorithm/ngt/ngt.go:302): failed to setup option :        github.com/vdaas/vald/internal/core/algorithm/ngt.WithDefaultEpsilon.func1: invalid option, name: defaultEpsilon, val: 0
     2022-09-13 16:08:44     [WARN]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/vqueue/queue.go:81):  failed to setup option :        github.com/vdaas/vald/pkg/agent/core/ngt/service/vqueue.WithInsertBufferPoolSize.func1: invalid option, name: insertBufferPoolSize, val: 0
     2022-09-13 16:08:44     [WARN]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/vqueue/queue.go:81):  failed to setup option :        github.com/vdaas/vald/pkg/agent/core/ngt/service/vqueue.WithDeleteBufferPoolSize.func1: invalid option, name: deleteBufferPoolSize, val: 0
     2022-09-13 16:08:44     [WARN]: (github.com/vdaas/vald/pkg/agent/core/ngt/handler/grpc/handler.go:66):  failed to setup option :        github.com/vdaas/vald/pkg/agent/core/ngt/handler/grpc.WithStreamConcurrency.func1: invalid option, name: streamConcurrency, val: 0
     2022-09-13 16:08:44     [INFO]: (github.com/vdaas/vald/internal/runner/runner.go:133):  service agent ngt(version: v0.0.0)starting...
     2022-09-13 16:08:44     [INFO]: (github.com/vdaas/vald/internal/runner/runner.go:149):  executing daemon pre-start function
     2022-09-13 16:08:44     [INFO]: (github.com/vdaas/vald/internal/runner/runner.go:155):  executing daemon start function
     2022-09-13 16:08:44     [INFO]: (github.com/vdaas/vald/internal/servers/server/server.go:254):  server agent-grpc executing preStartFunc
     2022-09-13 16:08:44     [DEBG]: (github.com/vdaas/vald/internal/net/control/control.go:102):    controlling socket for tcp6://0.0.0.0:8081, config &control.control{reusePort:false, reuseAddr:false, tcpFastOpen:false, tcpNoDelay:false, tcpCork:false, tcpQuickAck:false, tcpDeferAccept:false, ipTransparent:false, ipRecoverDestinationAddr:false, keepAlive:0}
     2022-09-13 16:08:44     [INFO]: (github.com/vdaas/vald/internal/servers/server/server.go:300):  gRPC server agent-grpc starting on tcp://[::]:8081
     2022-09-13 16:08:44     [DEBG]: (github.com/vdaas/vald/internal/net/grpc/logger/logger.go:55):  [gRPC Log] [core] [Server #1 ListenSocket #2] ListenSocket created
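
Instead of watching the log for the "gRPC server agent-grpc starting" line, a script can poll the published port. The sketch below assumes bash (it relies on bash's `/dev/tcp` redirection) and that the Agent is published on 127.0.0.1:8081 as in Step 5. `wait_for_port` is a hypothetical helper name. A successful connection only shows that something is listening on the port, not that the Agent is ready to serve requests.

```shell
# Poll host:port once per second until a TCP connection succeeds
# or the timeout (in seconds, default 30) elapses.
# Requires bash: /dev/tcp/<host>/<port> is a bash-specific redirection.
# (wait_for_port is a hypothetical helper, not part of Vald or Docker.)
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Open the connection in a subshell so the descriptor is closed again immediately.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example (run from a second terminal while the container is starting):
#   wait_for_port 127.0.0.1 8081 30 && echo "port 8081 is accepting connections"
```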
    

Demo with the sample code

Finally, verify the behavior with the sample code.
The sample code uses Fashion-MNIST as its dataset.
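
The configuration above sets `dimension: 784` and `distance_type: l2`, which matches the 784-dimensional Euclidean variant of this dataset. As a rough illustration of what an L2 (Euclidean) distance is, here is a minimal sketch using awk on two short vectors. `l2_distance` is a hypothetical helper written for this article, not how NGT computes distances internally.

```shell
# Euclidean (L2) distance between two space-separated vectors of equal length.
# (l2_distance is a hypothetical helper for illustration only.)
l2_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, " ")
    m = split(b, y, " ")
    if (n != m) { print "dimension mismatch" > "/dev/stderr"; exit 1 }
    sum = 0
    for (i = 1; i <= n; i++) { d = x[i] - y[i]; sum += d * d }
    printf "%g\n", sqrt(sum)
  }'
}

l2_distance "0 0" "3 4"   # prints 5
```

A smaller distance means the two vectors are closer, which is why search results are typically listed in ascending order of distance.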

  1. Download the dataset

    cd <vald repository path>/example/client/agent && \
    wget http://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5
    
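  2. Run the sample code

    This time we use example/client/agent/main.go.
    The file itself contains a detailed explanation of the sample code, so it is not repeated in this article.
    Once indexing completes, backup files are written to the backup directory created earlier.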

    go run main.go
    
    Example Agent log while the sample code runs
    2022-09-13 16:10:53     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1188): Exists  uuid: esg97h3ccgan3car8dpg's data not found in kvsdb and insert vqueue      error: ngt uuid esg97h3ccgan3car8dpg's object id not found
    ...
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1188): Exists  uuid: esg97h3ccgan3car8k10's data not found in kvsdb and insert vqueue      error: ngt uuid esg97h3ccgan3car8k10's object id not found
    2022-09-13 16:10:54     [INFO]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:859):  create index operation started, uncommitted indexes = 400
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:860):  create index delete phase started
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:875):  create index delete phase finished
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:877):  create index insert phase started
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:911):  create index insert phase finished
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:912):  create graph and tree phase started
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:913):  pool size = 400
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:920):  create graph and tree phase finished
    2022-09-13 16:10:54     [INFO]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:921):  create index operation finished
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1009): cleanup invalid index started
    2022-09-13 16:10:54     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1011): cleanup invalid index finished
    2022-09-13 16:12:58     [INFO]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:859):  create index operation started, uncommitted indexes = 400
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:860):  create index delete phase started
    removeEdgesReliably : Warning! : No edges. ID=400
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:875):  create index delete phase finished
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:877):  create index insert phase started
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:911):  create index insert phase finished
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:912):  create graph and tree phase started
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:913):  pool size = 500
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:920):  create graph and tree phase finished
    2022-09-13 16:12:58     [INFO]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:921):  create index operation finished
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1009): cleanup invalid index started
    2022-09-13 16:12:58     [DEBG]: (github.com/vdaas/vald/pkg/agent/core/ngt/service/ngt.go:1011): cleanup invalid index finished
    
    Example log of the sample code run
    2022-09-14 01:10:53     [INFO]: Start Inserting 400 training vector to Vald Agent
    2022-09-14 01:10:53     [INFO]: Inserted: 10
    ...
    2022-09-14 01:10:54     [INFO]: Finish Inserting 400 training vector to Vald Agent
    2022-09-14 01:10:54     [INFO]: Start Indexing dataset.
    2022-09-14 01:10:54     [INFO]: Finish Indexing dataset.
    2022-09-14 01:10:54     [INFO]: Wait 1m0s for indexing to finish
    2022-09-14 01:11:54     [INFO]: Start searching 20 times
    2022-09-14 01:11:54     [INFO]: 1 - Results : [
     {
      "id": "esg97h3ccgan3car8fh0",
      "distance": 836.1902
     },
     {
      "id": "esg97h3ccgan3car8g0g",
      "distance": 1144.6335
     },
     {
      "id": "esg97h3ccgan3car8i6g",
      "distance": 1268.33
     },
     {
      "id": "esg97h3ccgan3car8jqg",
      "distance": 1433.0809
     },
     {
      "id": "esg97h3ccgan3car8f40",
      "distance": 1440.8862
     },
     {
      "id": "esg97h3ccgan3car8h9g",
      "distance": 1479.168
     },
     {
      "id": "esg97h3ccgan3car8j20",
      "distance": 1547.4369
     },
     {
      "id": "esg97h3ccgan3car8g3g",
      "distance": 1563.3451
     },
     {
      "id": "esg97h3ccgan3car8h80",
      "distance": 1569.5299
     },
     {
      "id": "esg97h3ccgan3car8h60",
      "distance": 1579.6968
     }
    ]
    ...
    
    2022-09-14 01:12:13     [INFO]: 20 - Results : [
     {
      "id": "esg97h3ccgan3car8g6g",
      "distance": 897.2848
     },
     {
      "id": "esg97h3ccgan3car8eqg",
      "distance": 1040.0139
     },
     {
      "id": "esg97h3ccgan3car8i1g",
      "distance": 1244.1543
     },
     {
      "id": "esg97h3ccgan3car8ghg",
      "distance": 1248.1459
     },
     {
      "id": "esg97h3ccgan3car8e20",
      "distance": 1255.1084
     },
     {
      "id": "esg97h3ccgan3car8el0",
      "distance": 1309.401
     },
     {
      "id": "esg97h3ccgan3car8dug",
      "distance": 1391.9343
     },
     {
      "id": "esg97h3ccgan3car8ha0",
      "distance": 1441.8478
     },
     {
      "id": "esg97h3ccgan3car8iq0",
      "distance": 1452.9381
     },
     {
      "id": "esg97h3ccgan3car8gp0",
      "distance": 1490.9209
     }
    ]
    2022-09-14 01:12:14     [INFO]: Start removing vector
    2022-09-14 01:12:14     [INFO]: Removed: 10
    ...
    2022-09-14 01:12:14     [INFO]: Removed: 400
    2022-09-14 01:12:14     [INFO]: Finish removing vector
    2022-09-14 01:12:14     [INFO]: Start removing indexed vector from backup
    2022-09-14 01:12:14     [INFO]: Finish removing indexed vector from backup
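
To confirm that the Agent wrote something to the mounted backup directory, you can check it from the host. The sketch below is an assumption-laden helper rather than a Vald command: `backup_has_files` is a hypothetical name, and it only tests for the presence of any regular file because the exact file names the Agent writes are not covered in this article.

```shell
# Succeed if the directory exists and contains at least one regular file.
# (backup_has_files is a hypothetical helper, not part of Vald.)
backup_has_files() {
  [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}

# Example, run from the tutorial directory created in Step 2:
#   backup_has_files ./backup && echo "backup files found" || echo "no backup files yet"
```

Because auto_save_index_duration is set to 90s in this tutorial, files may take a little while to appear after indexing.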
    

Summary

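In this article, we ran a standalone Vald Agent on Docker and executed the sample code.

This time we only ran the sample code, but the chiVe demo can also be run as in the previous post.
For detailed usage, refer to the previous article.

If you have any questions, post them to the Slack workspace below.
Contributions to Vald are also welcome at any time.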

https://join.slack.com/t/vald-community/shared_invite/zt-db2ky9o4-R_9p2sVp8xRwztVa8gfnPA

We also publish a blog (in English) on Medium, so please take a look.

https://medium.com/@vdaas-vald

The official X (Twitter) account is here:

https://twitter.com/vdaas_vald


Related articles on Vald

https://techblog.yahoo.co.jp/entry/2021061430159867/
