Trying out NRI with containerd

Published on 2024/12/23

Kubernetes

containerd

nri

tech

On 12/19, an article about NRI was posted on the CNCF blog.

I had not heard of NRI before, but it looked interesting, so I decided to try it out.

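What is NRI

Node Resource Interface (NRI) is a feature available in OCI container runtimes such as containerd and CRI-O. With NRI, you can receive lifecycle events such as the creation, update, and removal of pods and containers, and run custom logic in response. NRI is maintained in the containerd NRI repository, which documents its components, API, and specification in detail.

In containerd, NRI could already be used by enabling it explicitly, but in the recently released v2 it is enabled by default. The corresponding PR is https://github.com/containerd/containerd/pull/9744

In NRI, each feature is written as a plugin, implemented in a relatively small amount of Go code, and each plugin can be enabled or disabled individually. The CNCF article is a tutorial that uses the sample WASM plugin from the NRI repository.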

Trying it out

https://github.com/containerd/nri/tree/main/plugins provides several sample plugins, so I will try them with containerd to see what they can do.

Preparation

This walkthrough uses the following.

containerd v2

containerd v2 and later have NRI enabled by default. If containerd -v reports a 2.x major version, you are good to go.

$ containerd -v
containerd github.com/containerd/containerd/v2 v2.0.1 88aa2f531d6c2922003cc7929e51daf1c14caa0a

The containerd configuration file /etc/containerd/config.toml contains the NRI settings shown below. NRI is enabled when disable = false.

/etc/containerd/config.toml

  [plugins.'io.containerd.nri.v1.nri']
    disable = false
    socket_path = '/var/run/nri/nri.sock'
    plugin_path = '/opt/nri/plugins'
    plugin_config_path = '/etc/nri/conf.d'
    plugin_registration_timeout = '5s'
    plugin_request_timeout = '2s'
    disable_connections = false
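
As a rough way to check this setting programmatically, the sketch below scans configuration text for the NRI section and reads its disable key. The function name nriDisabled is hypothetical, and the line-based scan is a simplification under the assumption that the section looks like the excerpt above. It is not a real TOML parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// nriDisabled scans containerd config text for the NRI plugin section and
// returns the value of its "disable" key. found is false when the section or
// the key is missing. This is a line-based sketch, not a real TOML parser.
func nriDisabled(config string) (disabled bool, found bool) {
	const section = "[plugins.'io.containerd.nri.v1.nri']"
	inSection := false
	sc := bufio.NewScanner(strings.NewReader(config))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "[") {
			// A new table header starts; remember whether it is the NRI one.
			inSection = line == section
			continue
		}
		if !inSection {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if ok && strings.TrimSpace(key) == "disable" {
			return strings.TrimSpace(value) == "true", true
		}
	}
	return false, false
}

func main() {
	sample := `
[plugins.'io.containerd.nri.v1.nri']
  disable = false
  socket_path = '/var/run/nri/nri.sock'
  disable_connections = false
`
	disabled, found := nriDisabled(sample)
	fmt.Println("found:", found, "disabled:", disabled)
}
```

The key comparison is exact, so disable_connections is not mistaken for disable.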

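If config.toml has no NRI-related entries, the file may have been generated by an older version. You can regenerate the default configuration with the following command (note that it overwrites the existing file).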

containerd config default | sudo tee /etc/containerd/config.toml

https://github.com/containerd/containerd/blob/main/docs/NRI.md also describes how to use NRI with containerd.

Go and TinyGo

The plugins are implemented in Go, so Go is required to build them. Some of the sample plugins also require TinyGo to build.

Kubernetes environment

NRI detects pod and container lifecycle events and acts on them, so you need an environment where pods and containers can be created. You could create pods with crictl, but here I use Kubernetes (k8s).

Building the sample plugins

To test the sample plugins, clone the NRI repository and build the binaries from the plugins' Go code as follows.

git clone https://github.com/containerd/nri
cd nri
make

This generates a binary for each plugin under build/bin.

$ ls build/bin
device-injector  differ  hook-injector  logger  template  ulimit-adjuster  v010-adapter  wasm

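Plugins can be loaded automatically when containerd starts, but you can also run them by hand first to test their behavior. For example, to run the logger plugin manually, execute the logger binary as shown below while containerd is already running. -idx is the plugin index; specify any two-digit number.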

$ sudo ./build/bin/logger -idx 00

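If you create a k8s pod or container in this state, the plugin detects the pod and container lifecycle events and prints detailed information to the terminal. For example, a StartContainer event is raised when a container starts, and the logger plugin picks it up and prints output like the following.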

INFO   [0082] StartContainer: pod:
INFO   [0082] StartContainer:    annotations:
INFO   [0082] StartContainer:      io.kubernetes.cri.container-type: sandbox
INFO   [0082] StartContainer:      io.kubernetes.cri.podsandbox.image-name: registry.k8s.io/pause:3.10
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-cpu-period: "100000"
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-cpu-quota: "20000"
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-cpu-shares: "204"
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-id: 04cc6fa9be96e280ab0d316e51003c0c98ca286eb34259df776b62356cdec10e
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-log-directory: /var/log/pods/default_hook-demo_11d7c1f2-db8f-44e5-b8b3-cbf13945ad13
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-memory: "200000000"
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-name: hook-demo
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-namespace: default
INFO   [0082] StartContainer:      io.kubernetes.cri.sandbox-uid: 11d7c1f2-db8f-44e5-b8b3-cbf13945ad13
INFO   [0082] StartContainer:      kubectl.kubernetes.io/last-applied-configuration: |
INFO   [0082] StartContainer:        {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"hook-demo","namespace":"default"},"spec":{"containers":[{"command":["sh","-c","echo I am a shell $(sleep inf)"],"image":"busybox","imagePullPolicy":"IfNotPresent","name":"shell","resources":{"limits":{"cpu":"100m","memory":"100M"},"requests":{"cpu":"100m","memory":"100M"}}},{"command":["busybox","sh","-c","echo busybox $(sleep inf)"],"image":"busybox","imagePullPolicy":"IfNotPresent","name":"busybox","resources":{"limits":{"cpu":"100m","memory":"100M"},"requests":{"cpu":"100m","memory":"100M"}},"volumeMounts":[{"mountPath":"/usr/local/bin","name":"host-volume"}]}],"terminationGracePeriodSeconds":1,"volumes":[{"hostPath":{"path":"/usr/local/bin","type":"Directory"},"name":"host-volume"}]}}

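Applying plugins automatically

In the steps above the plugin was started by hand, but if you place a plugin binary in NRI's plugin_path (/opt/nri/plugins by default), it is loaded automatically when containerd starts. For example, to have the ulimit-adjuster plugin loaded automatically, place it there with a file name of the form [index]-[plugin_name].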

$ ls -l /opt/nri/plugins/10-ulimit-adjuster
-rwxr-xr-x 1 root root 19060513 Dec 21 11:00 /opt/nri/plugins/10-ulimit-adjuster
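
To make the naming convention concrete, here is a minimal sketch that splits such a file name into its index and plugin name. The helper splitPluginFileName is hypothetical and only mirrors the [index]-[plugin_name] form with a two-digit index described in this article. It is not how containerd itself parses the name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPluginFileName splits "<index>-<name>" into its two parts.
// It requires a two-digit index, following the convention used in this article.
func splitPluginFileName(file string) (string, string, error) {
	idx, name, ok := strings.Cut(file, "-")
	if !ok || name == "" {
		return "", "", fmt.Errorf("%q is not of the form <index>-<name>", file)
	}
	if len(idx) != 2 || idx[0] < '0' || idx[0] > '9' || idx[1] < '0' || idx[1] > '9' {
		return "", "", fmt.Errorf("index %q is not a two-digit number", idx)
	}
	return idx, name, nil
}

func main() {
	idx, name, err := splitPluginFileName("10-ulimit-adjuster")
	fmt.Println(idx, name, err) // 10 ulimit-adjuster <nil>

	// Without a numeric prefix the first hyphen-separated word is rejected.
	_, _, err = splitPluginFileName("ulimit-adjuster")
	fmt.Println(err)
}
```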

After restarting containerd, the log shows that 10-ulimit-adjuster has been loaded.

$ sudo journalctl -u containerd --since "1 min ago"

level=info msg="runtime interface starting up..."
level=info msg="starting plugins..."
level=info msg="discovered plugin 10-ulimit-adjuster"
level=info msg="starting pre-installed NRI plugin \"ulimit-adjuster\"..."
level=info msg="plugin \"pre-connected:10-ulimit-adjuster[322927]\" registered as \"10-ulimit-adjuster\""
level=info msg="Synchronizing NRI (plugin) with current runtime state"
level=info msg="synchronizing plugin 10-ulimit-adjuster"
level=info msg="pre-installed NRI plugin \"10-ulimit-adjuster\" synchronization success"
level=info msg="plugin invocation order"
level=info msg="  #1: \"10-ulimit-adjuster\" (pre-connected:10-ulimit-adjuster[322927])"
level=info msg="containerd successfully booted in 0.136526s"

Trying the sample plugins

The NRI repository provides several sample plugins; here I try the following two.

ulimit-adjuster
hook-injector

ulimit-adjuster

ulimit-adjuster sets ulimits inside a container when the container is created. To use it, add the annotation ulimits.nri.containerd.io/container.$CONTAINER_NAME to the pod manifest and list the ulimit values to set.
A sample pod manifest is available at https://github.com/containerd/nri/blob/main/plugins/ulimit-adjuster/sample-ulimit-adjust.yaml, so I use it as is.

sample-ulimit-adjust.yaml

  annotations:
    ulimits.nri.containerd.io/container.sleep: |
      - type: memlock
        hard: 987654
        soft: 645321
      - type: RLIMIT_NOFILE
        hard: 4096
        soft: 1024
      - type: nproc
        hard: 9000

The pod's log shows the ulimits inside the container. The first listing is the soft limits and the second is the hard limits.

$ k logs sleep sleep
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 15525
max locked memory           (kbytes, -l) 630
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 0
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 15525
max locked memory           (kbytes, -l) 964
max memory size             (kbytes, -m) unlimited
open files                          (-n) 4096
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 9000
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

For comparison, here are the ulimits of a pod created without the annotation.

real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 15525
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) unlimited
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 15525
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 524288
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) unlimited
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

The differences are as follows.

7c7
< max locked memory           (kbytes, -l) 8192
---
> max locked memory           (kbytes, -l) 630
15c15
< max user processes                  (-u) unlimited
---
> max user processes                  (-u) 0
24c24
< max locked memory           (kbytes, -l) 8192
---
> max locked memory           (kbytes, -l) 964
26c26
< open files                          (-n) 524288
---
> open files                          (-n) 4096
32c32
< max user processes                  (-u) unlimited
---
> max user processes                  (-u) 9000
35d34

For the pod with ulimits configured, the annotation specified the following values, so the output confirms they were applied as expected.

Item	soft limit	hard limit
max locked memory	630 kB	964 kB
open files	1024 (default)	4096
max user processes	not specified	9000
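
The max locked memory figures follow from the memlock values in the annotation, which are given in bytes, while ulimit reports kbytes. A small sketch of that arithmetic, assuming 1 kbyte = 1024 bytes with truncation (an assumption that matches the output above):

```go
package main

import "fmt"

// toKbytes converts a byte limit to the kbytes unit shown by ulimit,
// truncating toward zero (assumption: 1 kbyte = 1024 bytes).
func toKbytes(bytes uint64) uint64 {
	return bytes / 1024
}

func main() {
	fmt.Println(toKbytes(645321)) // soft memlock from the annotation: 630
	fmt.Println(toKbytes(987654)) // hard memlock from the annotation: 964
}
```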

hook-injector

hook-injector uses podman's OCI hooks to run scripts and other executables placed on the node in response to lifecycle events.

To use hook-injector, first place a hook configuration file in an OCI hook configuration directory. According to the OCI Hooks Configuration documentation, the directories are the following, so here I place it in /etc/containers/oci/hooks.d.

/usr/share/containers/oci/hooks.d
/etc/containers/oci/hooks.d

A hook configuration file is JSON and describes when the hook should run, the path of the script to run, the arguments to pass to it, and so on. For the supported fields and schema, see https://github.com/containers/podman/blob/8bcc086b1b9d8aa0ef3bb08d37542adf9de26ac5/pkg/hooks/docs/oci-hooks.5.md.

The sample always-inject.json is written to run the shell script at /usr/local/sbin/demo-hook.sh.

always-inject.json

{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/local/sbin/demo-hook.sh",
        "args": ["this", "hook", "is", "always", "injected"],
        "env": [
            "DEMO_HOOK_ALWAYS_INJECTED=true"
        ]
    },
    "when": {
        "always": true
    },
    "stages": ["prestart", "poststop"]
}
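
To show how the fields in this file fit together, the sketch below decodes it with Go's encoding/json. The hookConfig type is hypothetical and mirrors only the fields that appear in always-inject.json. It is not podman's own definition of the schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hookConfig mirrors only the fields used in always-inject.json.
// It is a hypothetical type for illustration, not podman's own definition.
type hookConfig struct {
	Version string `json:"version"`
	Hook    struct {
		Path string   `json:"path"` // executable to run on the node
		Args []string `json:"args"` // argv passed to the executable
		Env  []string `json:"env"`  // KEY=value pairs for the hook process
	} `json:"hook"`
	When struct {
		Always bool `json:"always"` // true: inject into every container
	} `json:"when"`
	Stages []string `json:"stages"` // e.g. prestart, poststop
}

// parseHookConfig decodes a hook configuration document.
func parseHookConfig(data []byte) (hookConfig, error) {
	var cfg hookConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return hookConfig{}, fmt.Errorf("parsing hook configuration: %w", err)
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/local/sbin/demo-hook.sh",
        "args": ["this", "hook", "is", "always", "injected"],
        "env": ["DEMO_HOOK_ALWAYS_INJECTED=true"]
    },
    "when": {"always": true},
    "stages": ["prestart", "poststop"]
}`)
	cfg, err := parseHookConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Hook.Path, cfg.When.Always, cfg.Stages)
}
```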

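Next, place the shell script to be executed at /usr/local/sbin/demo-hook.sh on the node. It is a simple script that just appends the hook's execution time, the arguments it received, and its environment to /tmp/demo-hook.log on the node.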

/usr/local/sbin/demo-hook.sh

#!/bin/bash

#   Copyright The containerd Authors.

#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at

#       http://www.apache.org/licenses/LICENSE-2.0

#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.

#!/bin/sh

LOG=/tmp/demo-hook.log

touch $LOG
echo "========== [pid $$] $(date) ==========" >> $LOG
echo "command: $0 $@" >> $LOG
echo "environment:" >> $LOG
env | sed 's/^/    /g' >> $LOG

For the pod manifest I use the sample below. With hook-injector, the hook conditions are specified in the hook configuration file, so the manifest needs nothing special.

sample-hook-inject.yaml

#
# Once this pod is running, you can verify the result by running
#   kubectl exec -c shell hook-demo -- env | grep DEMO_HOOK
#   kubectl exec -c busybox hook-demo -- env | grep DEMO_HOOK
#
apiVersion: v1
kind: Pod
metadata:
  name: hook-demo
spec:
  containers:
  - name: shell
    image: busybox
    imagePullPolicy: IfNotPresent
    command:
      - sh
      - -c
      - echo I am a shell $(sleep inf)
    resources:
      requests:
        cpu: 100m
        memory: '100M'
      limits:
        cpu: 100m
        memory: '100M'
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    command:
      - busybox
      - sh
      - -c
      - echo busybox $(sleep inf)
    resources:
      requests:
        cpu: 100m
        memory: '100M'
      limits:
        cpu: 100m
        memory: '100M'
  terminationGracePeriodSeconds: 1

Applying the manifest makes the hook-injector plugin log that the hooks were injected, as shown below.

$ sudo /opt/nri/bin/hook-injector -idx 10
INFO   [0000] Created plugin 10-hook-injector (hook-injector, handles CreateContainer)
INFO   [0000] watching directories "/usr/share/containers/oci/hooks.d /etc/containers/oci/hooks.d" for new changes
INFO   [0000] Registering plugin 10-hook-injector...
INFO   [0000] Configuring plugin 10-hook-injector for runtime v2/v2.0.1...
INFO   [0000] Started plugin 10-hook-injector...
INFO   [0014] hook-demo/shell: OCI hooks injected
INFO   [0014] hook-demo/busybox: OCI hooks injected

/tmp/demo-hook.log on the node now contains the output. The pod defines two containers, so the hook ran twice, once for each container's start.

========== [pid 358882] Sun Dec 22 12:47:30 UTC 2024 ==========
command: /usr/local/sbin/demo-hook.sh hook is always injected
environment:
    PWD=/run/containerd/io.containerd.runtime.v2.task/k8s.io/138cfc3b6db2d382118bf48b27db75e1228ef9f35941a219bcfd9dd8ee0b785b
    DEMO_HOOK_ALWAYS_INJECTED=true
    SHLVL=1
    _=/usr/bin/env
========== [pid 358912] Sun Dec 22 12:47:30 UTC 2024 ==========
command: /usr/local/sbin/demo-hook.sh hook is always injected
environment:
    PWD=/run/containerd/io.containerd.runtime.v2.task/k8s.io/0175fb99feae535ccef4b9fa085bb812dfdc332694cf1dc081137bf3791617c5
    DEMO_HOOK_ALWAYS_INJECTED=true
    SHLVL=1

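As shown, hook-injector can run arbitrary scripts in response to lifecycle events, which is useful when you want to do some processing as pods or containers start or stop.

Incidentally, the script only needs to be a file that is executable on the node, so you can also point the hook at something like a Python script, as below.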

python.json

{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/local/bin/main.py",
        "args": ["test", "test2"]
    },
    "when": {
        "always": true
    },
    "stages": ["prestart", "poststop"]
}

/usr/local/bin/main.py

#!/usr/bin/env python3
import sys
import logging

logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler("/tmp/hook-py.log", mode='a', encoding='utf-8'),
            logging.StreamHandler()
        ]
    )

def main():
    arg0 = sys.argv[0]
    arg1 = sys.argv[1]

    logging.info(f"{arg0} {arg1}")

if __name__ == "__main__":
    main()

/tmp/hook-py.log

2024-12-22 13:37:48,265 - INFO - /usr/local/bin/main.py test2

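The first element of args does not show up, but the script does run as shown above. This is most likely because OCI hook args follow execv argv semantics: args[0] fills the argv[0] slot, and a script launched through its shebang line sees the script path there instead, so only args[1] onward arrive as arguments. The demo-hook.sh log earlier shows the same thing, with "this" missing from the command line.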
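
A small model of why the log contains only test2, assuming hook args behave like execv's argv (args[0] occupies the argv[0] slot). The function scriptView is hypothetical and only models what the script observes. It does not execute anything.

```go
package main

import "fmt"

// scriptView models what a script started via a "#!" line observes when a
// runtime executes hook.path with hook.args as argv: the script path appears
// as argument zero, and only args[1:] arrive as positional parameters.
func scriptView(path string, args []string) (string, []string) {
	if len(args) <= 1 {
		return path, nil
	}
	return path, args[1:]
}

func main() {
	arg0, params := scriptView("/usr/local/bin/main.py", []string{"test", "test2"})
	fmt.Println(arg0, params) // /usr/local/bin/main.py [test2]
}
```

Under this model, putting the program name (or any placeholder) as the first element of args would let both test and test2 reach the script.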

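Writing your own plugin

Each sample plugin is contained in one to a few Go files, so you can use them as a reference to write your own. To get a feel for it, I wrote a simple plugin based on nri-logger.go, the implementation of the logger plugin, which is one of the simpler samples.

The code implements functions such as RunPodSandbox and CreateContainer, one per pod or container lifecycle event, so implementing these functions is how a plugin catches lifecycle events and acts on them.

nri-logger.go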

func (p *plugin) RunPodSandbox(_ context.Context, pod *api.PodSandbox) error {
	dump("RunPodSandbox", "pod", pod)
	return nil
}

func (p *plugin) CreateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) (*api.ContainerAdjustment, []*api.ContainerUpdate, error) {
	dump("CreateContainer", "pod", pod, "container", container)

	adjust := &api.ContainerAdjustment{}

	if cfg.AddAnnotation != "" {
		adjust.AddAnnotation(cfg.AddAnnotation, fmt.Sprintf("logger-pid-%d", os.Getpid()))
	}
	if cfg.SetAnnotation != "" {
		adjust.RemoveAnnotation(cfg.SetAnnotation)
		adjust.AddAnnotation(cfg.SetAnnotation, fmt.Sprintf("logger-pid-%d", os.Getpid()))
	}
	if cfg.AddEnv != "" {
		adjust.AddEnv(cfg.AddEnv, fmt.Sprintf("logger-pid-%d", os.Getpid()))
	}
	if cfg.SetEnv != "" {
		adjust.RemoveEnv(cfg.SetEnv)
		adjust.AddEnv(cfg.SetEnv, fmt.Sprintf("logger-pid-%d", os.Getpid()))
	}

	return adjust, nil, nil
}

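The data passed to each lifecycle function is defined in api.proto (api.pb.go on the Go side). What needs care is each function's arguments and return values, which must correspond to what the proto defines. For example, looking at the request and response for the UpdateContainer event, the request carries three pieces of data (pod, container, and linux_resources) and the response returns two (update and evict).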

api.proto

message UpdateContainerRequest {
  // Pod of container being updated.
  PodSandbox pod = 1;
  // Container being updated.
  Container container = 2;
  // Resources to update.
  LinuxResources linux_resources = 3;
}

message UpdateContainerResponse {
  // Requested updates to containers.
  repeated ContainerUpdate update = 1;
  // Requested eviction of containers.
  repeated ContainerEviction evict = 2;
}

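Therefore, your own UpdateContainer must take the three arguments above (after the context) and return two values, which in the Go signature below are the container updates and an error. The function body does not have to use any of them.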

func (p *plugin) UpdateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container, r *api.LinuxResources) ([]*api.ContainerUpdate, error) {
  log.Infof("Update container")
  return nil, nil
}

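If the signature does not match, running the built plugin fails with an error like the following, presumably because the mismatched method is not recognized as the handler for that event.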

INFO   [0000] Created plugin 10-simple-logger (simple-logger, handles RunPodSandbox,StopPodSandbox,RemovePodSandbox,CreateContainer)
INFO   [0000] Registering plugin 10-simple-logger...
INFO   [0000] Configuring plugin 10-simple-logger for runtime v2/v2.0.1...
INFO   [0000] got configuration data: "" from runtime v2 v2.0.1
ERROR  [0000] Plugin subscribed for unhandled events PostCreateContainer,StartContainer,PostStartContainer,UpdateContainer,PostUpdateContainer,StopContainer,RemoveContainer (0x7f0)
ERROR  [0000] plugin exited with error internal error: unhandled events PostCreateContainer,StartContainer,PostStartContainer,UpdateContainer,PostUpdateContainer,StopContainer,RemoveContainer (0x7f0)
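
Why a signature mismatch leads to an "unhandled events" error can be illustrated with plain Go interfaces. The sketch below assumes the stub decides which events a plugin handles by checking which handler interfaces the plugin value satisfies. That is my assumption about the stub's internals, and the interface and types here are stand-ins that use string and int in place of the real api types.

```go
package main

import (
	"context"
	"fmt"
)

// updateHandler is a stand-in for a per-event handler interface.
// string and int replace the real api.* types so that the sketch
// only needs the standard library.
type updateHandler interface {
	UpdateContainer(ctx context.Context, pod, container string, resources int) ([]string, error)
}

// goodPlugin implements the expected signature.
type goodPlugin struct{}

func (goodPlugin) UpdateContainer(_ context.Context, pod, container string, resources int) ([]string, error) {
	return nil, nil
}

// badPlugin has a method with the same name but a different signature,
// so it does not satisfy updateHandler.
type badPlugin struct{}

func (badPlugin) UpdateContainer(_ context.Context, pod string) error {
	return nil
}

// handlesUpdate reports whether p would be recognized as an update handler.
func handlesUpdate(p any) bool {
	_, ok := p.(updateHandler)
	return ok
}

func main() {
	fmt.Println(handlesUpdate(goodPlugin{})) // true
	fmt.Println(handlesUpdate(badPlugin{}))  // false
}
```

The mismatched method still compiles, which is why the problem only surfaces at run time.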

nri-logger.go also uses logrus for output. The handlers call a custom dump function, which ultimately writes its messages with logrus's Infof.

func (p *plugin) RunPodSandbox(_ context.Context, pod *api.PodSandbox) error {
	dump("RunPodSandbox", "pod", pod)
	return nil
}


func dump(args ...interface{}) {
  ....
	log.Infof("%s: %s: failed to dump object: %v", prefix, tag, err)

main defines the command-line flags. These appear to be used for handling arguments when the plugin is run directly.

	flag.StringVar(&pluginName, "name", "", "plugin name to register to NRI")
	flag.StringVar(&pluginIdx, "idx", "", "plugin index to register to NRI")
	flag.StringVar(&events, "events", "all", "comma-separated list of events to subscribe for")
	flag.StringVar(&cfg.LogFile, "log-file", "", "logfile name, if logging to a file")
	flag.StringVar(&cfg.AddAnnotation, "add-annotation", "", "add this annotation to containers")
	flag.StringVar(&cfg.SetAnnotation, "set-annotation", "", "set this annotation on containers")
	flag.StringVar(&cfg.AddEnv, "add-env", "", "add this environment variable for containers")
	flag.StringVar(&cfg.SetEnv, "set-env", "", "set this environment variable for containers")
	flag.Parse()

	if cfg.LogFile != "" {
		f, err := os.OpenFile(cfg.LogFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
		if err != nil {
			log.Fatalf("failed to open log file %q: %v", cfg.LogFile, err)
		}
		log.SetOutput(f)
	}

	if pluginName != "" {
		opts = append(opts, stub.WithPluginName(pluginName))
	}
	if pluginIdx != "" {
		opts = append(opts, stub.WithPluginIdx(pluginIdx))
	}

	p := &plugin{}
	if p.mask, err = api.ParseEventMask(events); err != nil {
		log.Fatalf("failed to parse events: %v", err)
	}
	cfg.Events = strings.Split(events, ",")

	if p.stub, err = stub.New(p, append(opts, stub.WithOnClose(p.onClose))...); err != nil {
		log.Fatalf("failed to create plugin stub: %v", err)
	}

	err = p.stub.Run(context.Background())
	if err != nil {
		log.Errorf("plugin exited with error %v", err)
		os.Exit(1)
	}
}
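
The -events flag is a comma-separated list that api.ParseEventMask turns into an event mask. To illustrate the idea, here is a standard-library-only sketch with hypothetical names (eventNames, parseEvents). The bit layout is an assumption, chosen because it reproduces the 0x7f0 value that appears in the error log earlier in this article.

```go
package main

import (
	"fmt"
	"strings"
)

// eventNames lists the lifecycle events in the order used for bit positions.
// The ordering is an assumption of this sketch.
var eventNames = []string{
	"RunPodSandbox", "StopPodSandbox", "RemovePodSandbox",
	"CreateContainer", "PostCreateContainer", "StartContainer",
	"PostStartContainer", "UpdateContainer", "PostUpdateContainer",
	"StopContainer", "RemoveContainer",
}

// parseEvents turns a comma-separated event list into a bitmask.
// "all" selects every event; unknown names are an error.
func parseEvents(list string) (uint32, error) {
	var mask uint32
	for _, name := range strings.Split(list, ",") {
		name = strings.TrimSpace(name)
		if strings.EqualFold(name, "all") {
			mask |= uint32(1)<<uint(len(eventNames)) - 1
			continue
		}
		bit := -1
		for i, known := range eventNames {
			if strings.EqualFold(name, known) {
				bit = i
				break
			}
		}
		if bit < 0 {
			return 0, fmt.Errorf("unknown event %q", name)
		}
		mask |= uint32(1) << uint(bit)
	}
	return mask, nil
}

func main() {
	unhandled := "PostCreateContainer,StartContainer,PostStartContainer," +
		"UpdateContainer,PostUpdateContainer,StopContainer,RemoveContainer"
	mask, err := parseEvents(unhandled)
	fmt.Printf("%#x %v\n", mask, err) // 0x7f0 <nil>
}
```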

To refer to pod or container information, use the fields of the api.PodSandbox and api.Container values passed to the function. api.proto shows what information they carry.

api.proto

// Pod metadata that is considered relevant for a plugin.
message PodSandbox {
  string id = 1;
  string name = 2;
  string uid = 3;
  string namespace = 4;
  map<string, string> labels = 5;
  map<string, string> annotations = 6;
  string runtime_handler = 7;
  LinuxPodSandbox linux = 8;
  uint32 pid = 9; // for NRI v1 emulation
  repeated string ips = 10;
}

// Container metadata that is considered relevant for a plugin.
message Container {
  string id = 1;
  string pod_sandbox_id = 2;
  string name = 3;
  ContainerState state = 4;
  map<string, string> labels = 5;
  map<string, string> annotations = 6;
  repeated string args = 7;
  repeated string env = 8;
  repeated Mount mounts = 9;
  Hooks hooks = 10;
  LinuxContainer linux = 11;
  uint32 pid = 12; // for NRI v1 emulation
  repeated POSIXRlimit rlimits = 13;
}

For example, the pod's name is available as pod.Name.

func (p *plugin) RunPodSandbox(ctx context.Context, pod *api.PodSandbox) error {
  log.Infof("Run PodSandbox %s", pod.Name)
  return nil
}

Once you understand this much, you can start writing your own plugin. Here I wrote a simple one that just prints a message whenever it receives a lifecycle event.

main.go

package main

import (
        "context"
        "flag"
        "os"
        "strings"

        "github.com/sirupsen/logrus"

        "github.com/containerd/nri/pkg/api"
        "github.com/containerd/nri/pkg/stub"
)

type config struct {
        LogFile       string   `json:"logFile"`
        Events        []string `json:"events"`
        AddAnnotation string   `json:"addAnnotation"`
        SetAnnotation string   `json:"setAnnotation"`
        AddEnv        string   `json:"addEnv"`
        SetEnv        string   `json:"setEnv"`
}

type plugin struct {
        stub stub.Stub
        mask stub.EventMask
}

var (
        cfg config
        log *logrus.Logger
)

func (p *plugin) RunPodSandbox(ctx context.Context, pod *api.PodSandbox) error {
        log.Infof("Run PodSandbox %s", pod.Name)
        return nil
}

func (p *plugin) StopPodSandbox(_ context.Context, pod *api.PodSandbox) error {
        log.Infof("Stop PodSandbox %s", pod.Name)
        return nil
}

func (p *plugin) RemovePodSandbox(_ context.Context, pod *api.PodSandbox) error {
        log.Infof("Remove PodSandbox %s", pod.Name)
        return nil
}

func (p *plugin) CreateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) (*api.ContainerAdjustment, []*api.ContainerUpdate, error) {
        log.Infof("Create Container %s", container.Name)
        return nil, nil, nil
}

func (p *plugin) PostCreateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) error {
        log.Infof("Post Create Container %s", container.Name)
        return nil
}

func (p *plugin) StartContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) error {
        log.Infof("Start Container %s", container.Name)
        return nil
}

func (p *plugin) PostStartContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) error {
        log.Infof("Post Start Container %s", container.Name)
        return nil
}

func (p *plugin) UpdateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container, r *api.LinuxResources) ([]*api.ContainerUpdate, error) {
        log.Infof("Update Container %s", container.Name)
        return nil, nil
}

func (p *plugin) PostUpdateContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) error {
        log.Infof("Post Update Container %s", container.Name)
        return nil
}

func (p *plugin) StopContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) ([]*api.ContainerUpdate, error) {
        log.Infof("Stop Container %s", container.Name)
        return nil, nil
}

func (p *plugin) RemoveContainer(_ context.Context, pod *api.PodSandbox, container *api.Container) error {
        log.Infof("Remove Container %s", container.Name)
        return nil
}

func (p *plugin) onClose() {
        os.Exit(0)
}

func main() {
        var (
                pluginName string
                pluginIdx  string
                events     string
                opts       []stub.Option
                err        error
        )

        log = logrus.StandardLogger()
        log.SetFormatter(&logrus.TextFormatter{
                PadLevelText: true,
        })

        flag.StringVar(&pluginName, "name", "", "plugin name to register to NRI")
        flag.StringVar(&pluginIdx, "idx", "", "plugin index to register to NRI")
        flag.StringVar(&events, "events", "all", "comma-separated list of events to subscribe for")
        flag.StringVar(&cfg.LogFile, "log-file", "", "logfile name, if logging to a file")
        flag.StringVar(&cfg.AddAnnotation, "add-annotation", "", "add this annotation to containers")
        flag.StringVar(&cfg.SetAnnotation, "set-annotation", "", "set this annotation on containers")
        flag.StringVar(&cfg.AddEnv, "add-env", "", "add this environment variable for containers")
        flag.StringVar(&cfg.SetEnv, "set-env", "", "set this environment variable for containers")
        flag.Parse()

        if cfg.LogFile != "" {
                f, err := os.OpenFile(cfg.LogFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
                if err != nil {
                        log.Fatalf("failed to open log file %q: %v", cfg.LogFile, err)
                }
                log.SetOutput(f)
        }

        if pluginName != "" {
                opts = append(opts, stub.WithPluginName(pluginName))
        }
        if pluginIdx != "" {
                opts = append(opts, stub.WithPluginIdx(pluginIdx))
        }

        p := &plugin{}
        if p.mask, err = api.ParseEventMask(events); err != nil {
                log.Fatalf("failed to parse events: %v", err)
        }
        cfg.Events = strings.Split(events, ",")

        if p.stub, err = stub.New(p, append(opts, stub.WithOnClose(p.onClose))...); err != nil {
                log.Fatalf("failed to create plugin stub: %v", err)
        }

        err = p.stub.Run(context.Background())
        if err != nil {
                log.Errorf("plugin exited with error %v", err)
                os.Exit(1)
        }
}

Build

go build -o simple-logger .

As with the other plugins, you can register it by running it directly while containerd is running.

$ sudo ./simple-logger -idx 10
INFO   [0000] Created plugin 10-simple-logger (simple-logger, handles RunPodSandbox,StopPodSandbox,RemovePodSandbox,CreateContainer,PostCreateContainer,StartContainer,PostStartContainer,UpdateContainer,PostUpdateContainer,StopContainer,RemoveContainer)
INFO   [0000] Registering plugin 10-simple-logger...
INFO   [0000] Configuring plugin 10-simple-logger for runtime v2/v2.0.1...
INFO   [0000] Started plugin 10-simple-logger...

As a test, applying the sample-hook-inject.yaml used for hook-injector prints messages like the following as the pod and containers are created.

INFO   [0012] Run PodSandbox hook-demo
INFO   [0013] Create Container shell
INFO   [0013] Post Create Container shell
INFO   [0013] Start Container shell
INFO   [0013] Post Start Container shell
INFO   [0013] Create Container busybox
INFO   [0013] Post Create Container busybox
INFO   [0013] Start Container busybox
INFO   [0013] Post Start Container busybox

Messages when deleting the pod with k delete -f pod.yml:

INFO   [0188] Stop Container busybox
INFO   [0188] Stop Container shell
INFO   [0188] Stop PodSandbox hook-demo
INFO   [0188] Remove Container busybox
INFO   [0189] Remove Container shell
INFO   [0229] Remove PodSandbox hook-demo

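The plugin created above is very simple, but it shows that a plugin can be written easily in roughly 100 to 150 lines of Go.

Conclusion

After a quick look at NRI, these are the main advantages I see.

Plugins are Go code that builds into a single binary, so they are easy to add and manage, and because they are written in Go they can implement a wide range of logic.
Plugins are not tied to a specific CRI implementation and can be used with OCI-compatible runtimes, which makes them versatile.
Although not covered here, communication is done over ttRPC or gRPC, so the overhead is (reportedly) low.

At the moment v0.9.0 is the latest version and a stable v1 release is still a project goal, but I look forward to its future development.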