[AWS] Integrating Amazon Personalize with OpenSearch

Published 2024/07/22

Using the dataset from the Getting Started guide in the official documentation, this post re-ranks movie search results with personalized rankings.
Previous post: https://zenn.dev/pupumaru/articles/c605470be3bc9e

Ingesting the CSV with OpenSearch Ingestion

The Bulk API would also work, but since this is a good opportunity, I use OpenSearch Ingestion (Data Prepper) to ingest movies.csv.

version: "2"
csv-s3-pipeline:
  source:
    s3:
      notification_type: "sqs"
      codec:
        csv:
      compression: none
      sqs:
        queue_url: "https://sqs.us-west-2.amazonaws.com/123456789012/ingestion-test-queue"
      aws:
        region: "us-west-2"
        sts_role_arn: "arn:aws:iam::123456789012:role/OSSPipelineRole"
  sink:
    - opensearch:
        # Provide an AWS OpenSearch Service domain endpoint
        hosts: [ "https://xxxx.us-west-2.es.amazonaws.com" ]
        aws:
          # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::123456789012:role/OSSPipelineRole"
          # Provide the region of the domain.
          region: "us-west-2"
          serverless: false
        index: "movies"
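
For comparison, the Bulk API route mentioned above could look roughly like the following. This is a minimal sketch, not what was run in this post: the helper name csv_to_bulk_body is hypothetical, and it assumes the CSV has a header row (movieId, title, genres) and that every value is indexed as a string.

```python
import csv
import io
import json


def csv_to_bulk_body(csv_text: str, index: str = "movies") -> str:
    """Convert CSV text (with a header row) into an NDJSON body for the _bulk API."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Action line: index the document into the target index (auto-generated _id).
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the CSV row as a JSON document (all values stay strings).
        lines.append(json.dumps(row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical one-row sample in the shape of movies.csv.
sample = 'movieId,title,genres\n1258,"Shining, The (1980)",Horror\n'
print(csv_to_bulk_body(sample), end="")
```

The resulting string would be sent with POST to the domain's /_bulk endpoint using Content-Type: application/x-ndjson and the same SigV4 authentication as the other requests in this post.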

Associating the Amazon_Personalize_Search_Ranking_Plugin

https://docs.aws.amazon.com/ja_jp/personalize/latest/dg/open-search-install-managed.html

Configuring the plugin (creating a search pipeline)

Following the page below, create an IAM role that OpenSearch can assume.
https://docs.aws.amazon.com/personalize/latest/dg/service-role-managed.html

The sample code in the documentation had a few small errors; the corrected lines are marked with comments.

import requests
from requests_auth_aws_sigv4 import AWSSigV4

domain_endpoint = 'https://xxx.us-west-2.es.amazonaws.com'
pipeline_name = 'personalize-pipeline'
url = f'{domain_endpoint}/_search/pipeline/{pipeline_name}'
auth = AWSSigV4('es')

headers = {'Content-Type': 'application/json'}

body = {
  "description": "A pipeline to apply custom re-ranking from Amazon Personalize",
  "response_processors": [
    {
      "personalized_search_ranking" : {
        "campaign_arn" : "arn:aws:personalize:us-west-2:xxx:campaign/getting-started-campaign",
        "item_id_field" : "movieId",
        "recipe" : "aws-personalized-ranking",
        "weight" : "0.3",
        "tag" : "personalize-processor",
        "iam_role_arn": "arn:aws:iam::xxx:role/OpenSearchPersonalizeRole",
        "aws_region": "us-west-2",
        "ignore_failure": True  # Python's True; JSON-style true is not valid here
      }
    }  # closing brace added
  ]
}
try:
    response = requests.put(url, auth=auth, json=body, headers=headers, verify=False)
    print(response.text)
except Exception as e:
    print(f"Error: {e}")

If the following is returned, the pipeline was created.

{"acknowledged":true}

https://docs.aws.amazon.com/ja_jp/personalize/latest/dg/opensearch-configuring-plugin.html

import requests
from requests_auth_aws_sigv4 import AWSSigV4

domain_endpoint = 'https://xxx.us-west-2.es.amazonaws.com'
pipeline_name = 'personalize-pipeline'
index = 'movies'
url = f'{domain_endpoint}/{index}/_settings/'
auth = AWSSigV4('es')
headers = {'Content-Type': 'application/json'}
body = {
    "index.search.default_pipeline": f"{pipeline_name}"
}
try:
    response = requests.put(url, auth=auth, json=body, headers=headers)
    print(response.text)
except Exception as e:
    print(f"Error: {e}")

If the following is returned, the pipeline is now the default search pipeline for the index.

{"acknowledged":true}

Try a search from Dev Tools in OpenSearch Dashboards.

GET movies/_search
{
    "_source" : ["movieId","title", "genres"],
    "query": {
        "multi_match": {
            "query": "Horror",
            "fields": ["genres"]
        }
    }
}
{
  "took": 38,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 978,
      "relation": "eq"
    },
    "max_score": 3.0824862,
    "hits": [
      {
        "_index": "movies",
        "_id": "vFvlyZABrLcRtTgpUZZE",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1258",
          "title": "Shining, The (1980)"
        }
      },
      {
        "_index": "movies",
        "_id": "8lvlyZABrLcRtTgpUZZF",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1322",
          "title": "Amityville 1992: It's About Time (1992)"
        }
      },
      {
        "_index": "movies",
        "_id": "9lvlyZABrLcRtTgpUZZF",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1326",
          "title": "Amityville II: The Possession (1982)"
        }
      },
      {
        "_index": "movies",
        "_id": "-VvlyZABrLcRtTgpUZZF",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1329",
          "title": "Blood for Dracula (Andy Warhol's Dracula) (1974)"
        }
      },
      {
        "_index": "movies",
        "_id": "xVvlyZABrLcRtTgpUZdI",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1623",
          "title": "Wishmaster (1997)"
        }
      },
      {
        "_index": "movies",
        "_id": "qFvlyZABrLcRtTgpUZhK",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1972",
          "title": "Nightmare on Elm Street 5: The Dream Child, A (1989)"
        }
      },
      {
        "_index": "movies",
        "_id": "tFvlyZABrLcRtTgpUZhK",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1984",
          "title": "Halloween III: Season of the Witch (1982)"
        }
      },
      {
        "_index": "movies",
        "_id": "uFvlyZABrLcRtTgpUZhK",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "1990",
          "title": "Prom Night IV: Deliver Us From Evil (1992)"
        }
      },
      {
        "_index": "movies",
        "_id": "v1vlyZABrLcRtTgpUZpf",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "2634",
          "title": "Mummy, The (1959)"
        }
      },
      {
        "_index": "movies",
        "_id": "yFvlyZABrLcRtTgpUZpf",
        "_score": 3.0824862,
        "_source": {
          "genres": "Horror",
          "movieId": "2652",
          "title": "Curse of Frankenstein, The (1957)"
        }
      }
    ]
  }
}
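
The search request above does not pass a user ID. As I understand the plugin documentation, per-request parameters for Personalize go in an ext.personalize_request_parameters block. The following is a minimal sketch of building such a request body; the helper name build_personalized_search and the user ID "1" are hypothetical, and this request was not run in this post.

```python
import json


def build_personalized_search(query_text: str, user_id: str) -> dict:
    """Build a search body that passes a user ID to the Personalize ranking plugin."""
    return {
        "_source": ["movieId", "title", "genres"],
        "query": {
            "multi_match": {"query": query_text, "fields": ["genres"]}
        },
        # Assumption based on the plugin documentation: request parameters
        # for Personalize are passed under "ext".
        "ext": {
            "personalize_request_parameters": {"user_id": user_id}
        },
    }


# "1" is a hypothetical user ID used only for illustration.
print(json.dumps(build_personalized_search("Horror", "1"), indent=2))
```

The printed JSON can be pasted into Dev Tools as the body of GET movies/_search.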

The search results were the same with and without the pipeline, so I checked the pipeline metrics.

GET /_nodes/stats/search_pipeline

The stats for the personalized_search_ranking processor show "failed": 19, meaning every one of the 19 calls failed.

{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "xxx:personalize-test",
  "nodes": {
    "RK2DqLuHQ06Hf909886-og": {
      "timestamp": 1721391344538,
      "name": "2c39d39b2a3172376168a87d64e3be08",
      "roles": [
        "data",
        "ingest",
        "master",
        "remote_cluster_client"
      ],
      "search_pipeline": {
        "total_request": {
          "count": 0,
          "time_in_millis": 0,
          "current": 0,
          "failed": 0
        },
        "total_response": {
          "count": 19,
          "time_in_millis": 8614,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "personalize-pipeline": {
            "request": {
              "count": 0,
              "time_in_millis": 0,
              "current": 0,
              "failed": 0
            },
            "response": {
              "count": 19,
              "time_in_millis": 8614,
              "current": 0,
              "failed": 0
            },
            "request_processors": [],
            "response_processors": [
              {
                "personalized_search_ranking:personalize-processor": {
                  "type": "personalized_search_ranking",
                  "stats": {
                    "count": 19,
                    "time_in_millis": 8606,
                    "current": 0,
                    "failed": 19
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
}
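
Reading the failure counts out of this response by eye gets tedious once there are several nodes or pipelines. The following is a minimal sketch that walks a parsed _nodes/stats/search_pipeline response and collects the processors with a non-zero failed count. The function name failed_processors is hypothetical, and the sample is a trimmed copy of the response above.

```python
def failed_processors(stats: dict) -> dict:
    """Return {"<pipeline>/<processor>": failed_count} for processors with failures."""
    result = {}
    for node in stats.get("nodes", {}).values():
        pipelines = node.get("search_pipeline", {}).get("pipelines", {})
        for pipeline_name, pipeline in pipelines.items():
            for kind in ("request_processors", "response_processors"):
                for entry in pipeline.get(kind, []):
                    for processor_name, info in entry.items():
                        failed = info.get("stats", {}).get("failed", 0)
                        if failed:
                            # Sum across nodes so a multi-node cluster yields one total.
                            key = f"{pipeline_name}/{processor_name}"
                            result[key] = result.get(key, 0) + failed
    return result


# Trimmed copy of the stats response shown above.
sample = {
    "nodes": {
        "RK2DqLuHQ06Hf909886-og": {
            "search_pipeline": {
                "pipelines": {
                    "personalize-pipeline": {
                        "request_processors": [],
                        "response_processors": [
                            {
                                "personalized_search_ranking:personalize-processor": {
                                    "type": "personalized_search_ranking",
                                    "stats": {"count": 19, "failed": 19},
                                }
                            }
                        ],
                    }
                }
            }
        }
    }
}
print(failed_processors(sample))
```

Note that the pipeline-level "failed" counters in the response stay at 0 while the processor-level counter is 19, so it is the processor stats that need to be checked.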

I could not trace the details of the failures.
Enabling error logs on the OpenSearch side did not surface anything either.

Searching CloudTrail Lake for GetPersonalizedRanking events turned up the error message "This API does not support recipes of type USER_PERSONALIZATION".

https://docs.aws.amazon.com/ja_jp/personalize/latest/dg/plugin-requirements.html

"You can use only the Personalized-Ranking custom recipe. For details about this recipe, see 'Personalized-Ranking recipe'."

That makes sense, since the plugin re-ranks OpenSearch search results.
The campaign I created in the previous post used User-Personalization, so it cannot be used here.
https://zenn.dev/pupumaru/articles/c605470be3bc9e
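
This kind of mismatch could be caught before wiring a campaign into the pipeline by checking which recipe the campaign's solution version was trained with. The following is a minimal sketch: the function names are hypothetical, the client is expected to be a boto3 Personalize client (for example boto3.client("personalize", region_name="us-west-2")), and the check assumes the recipe name matches the "aws-personalized-ranking" value used in the pipeline definition above.

```python
def campaign_recipe_arn(personalize_client, campaign_arn: str) -> str:
    """Look up the recipe ARN behind a campaign via its solution version."""
    campaign = personalize_client.describe_campaign(campaignArn=campaign_arn)["campaign"]
    solution_version = personalize_client.describe_solution_version(
        solutionVersionArn=campaign["solutionVersionArn"]
    )["solutionVersion"]
    return solution_version["recipeArn"]


def is_personalized_ranking(recipe_arn: str) -> bool:
    """True if the recipe ARN names the Personalized-Ranking recipe."""
    # Recipe ARNs end with "recipe/<recipe-name>".
    recipe_name = recipe_arn.rsplit("/", 1)[-1]
    return recipe_name == "aws-personalized-ranking"


# Offline check with the two recipe ARNs relevant to this post.
print(is_personalized_ranking("arn:aws:personalize:::recipe/aws-user-personalization"))
print(is_personalized_ranking("arn:aws:personalize:::recipe/aws-personalized-ranking"))
```

With a real client, is_personalized_ranking(campaign_recipe_arn(client, campaign_arn)) would report whether the campaign can be used with the plugin.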

I recreated the campaign with the Personalized-Ranking recipe and ran the search again.

{
  "took": 115,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 978,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "movies",
        "_id": "vFvlyZABrLcRtTgpUZZE",
        "_score": 1,
        "_source": {
          "genres": "Horror",
          "movieId": "1258",
          "title": "Shining, The (1980)"
        }
      },
      {
        "_index": "movies",
        "_id": "8lvlyZABrLcRtTgpUZZF",
        "_score": 0.55770665,
        "_source": {
          "genres": "Horror",
          "movieId": "1322",
          "title": "Amityville 1992: It's About Time (1992)"
        }
      },
      {
        "_index": "movies",
        "_id": "9lvlyZABrLcRtTgpUZZF",
        "_score": 0.5,
        "_source": {
          "genres": "Horror",
          "movieId": "1326",
          "title": "Amityville II: The Possession (1982)"
        }
      },
      {
        "_index": "movies",
        "_id": "qFvlyZABrLcRtTgpUZhK",
        "_score": 0.43862396,
        "_source": {
          "genres": "Horror",
          "movieId": "1972",
          "title": "Nightmare on Elm Street 5: The Dream Child, A (1989)"
        }
      },
      {
        "_index": "movies",
        "_id": "-VvlyZABrLcRtTgpUZZF",
        "_score": 0.39178258,
        "_source": {
          "genres": "Horror",
          "movieId": "1329",
          "title": "Blood for Dracula (Andy Warhol's Dracula) (1974)"
        }
      },
      {
        "_index": "movies",
        "_id": "xVvlyZABrLcRtTgpUZdI",
        "_score": 0.36543643,
        "_source": {
          "genres": "Horror",
          "movieId": "1623",
          "title": "Wishmaster (1997)"
        }
      },
      {
        "_index": "movies",
        "_id": "uFvlyZABrLcRtTgpUZhK",
        "_score": 0.3500284,
        "_source": {
          "genres": "Horror",
          "movieId": "1990",
          "title": "Prom Night IV: Deliver Us From Evil (1992)"
        }
      },
      {
        "_index": "movies",
        "_id": "tFvlyZABrLcRtTgpUZhK",
        "_score": 0.33333334,
        "_source": {
          "genres": "Horror",
          "movieId": "1984",
          "title": "Halloween III: Season of the Witch (1982)"
        }
      },
      {
        "_index": "movies",
        "_id": "v1vlyZABrLcRtTgpUZpf",
        "_score": 0.31758314,
        "_source": {
          "genres": "Horror",
          "movieId": "2634",
          "title": "Mummy, The (1959)"
        }
      },
      {
        "_index": "movies",
        "_id": "yFvlyZABrLcRtTgpUZpf",
        "_score": 0.28906482,
        "_source": {
          "genres": "Horror",
          "movieId": "2652",
          "title": "Curse of Frankenstein, The (1957)"
        }
      }
    ]
  },
  "profile": {
    "shards": []
  }
}

Comparing the two results shows that the order has changed.

--- result.json	2024-07-22 12:20:24
+++ result_rerank.json	2024-07-22 12:20:30
@@ -1,5 +1,5 @@
 {
-  "took": 38,
+  "took": 115,
   "timed_out": false,
   "_shards": {
     "total": 5,
@@ -12,12 +12,12 @@
       "value": 978,
       "relation": "eq"
     },
-    "max_score": 3.0824862,
+    "max_score": 1,
     "hits": [
       {
         "_index": "movies",
         "_id": "vFvlyZABrLcRtTgpUZZE",
-        "_score": 3.0824862,
+        "_score": 1,
         "_source": {
           "genres": "Horror",
           "movieId": "1258",
@@ -27,7 +27,7 @@
       {
         "_index": "movies",
         "_id": "8lvlyZABrLcRtTgpUZZF",
-        "_score": 3.0824862,
+        "_score": 0.55770665,
         "_source": {
           "genres": "Horror",
           "movieId": "1322",
@@ -37,7 +37,7 @@
       {
         "_index": "movies",
         "_id": "9lvlyZABrLcRtTgpUZZF",
-        "_score": 3.0824862,
+        "_score": 0.5,
         "_source": {
           "genres": "Horror",
           "movieId": "1326",
@@ -46,8 +46,18 @@
       },
       {
         "_index": "movies",
+        "_id": "qFvlyZABrLcRtTgpUZhK",
+        "_score": 0.43862396,
+        "_source": {
+          "genres": "Horror",
+          "movieId": "1972",
+          "title": "Nightmare on Elm Street 5: The Dream Child, A (1989)"
+        }
+      },
+      {
+        "_index": "movies",
         "_id": "-VvlyZABrLcRtTgpUZZF",
-        "_score": 3.0824862,
+        "_score": 0.39178258,
         "_source": {
           "genres": "Horror",
           "movieId": "1329",
@@ -57,7 +67,7 @@
       {
         "_index": "movies",
         "_id": "xVvlyZABrLcRtTgpUZdI",
-        "_score": 3.0824862,
+        "_score": 0.36543643,
         "_source": {
           "genres": "Horror",
           "movieId": "1623",
@@ -66,18 +76,18 @@
       },
       {
         "_index": "movies",
-        "_id": "qFvlyZABrLcRtTgpUZhK",
-        "_score": 3.0824862,
+        "_id": "uFvlyZABrLcRtTgpUZhK",
+        "_score": 0.3500284,
         "_source": {
           "genres": "Horror",
-          "movieId": "1972",
-          "title": "Nightmare on Elm Street 5: The Dream Child, A (1989)"
+          "movieId": "1990",
+          "title": "Prom Night IV: Deliver Us From Evil (1992)"
         }
       },
       {
         "_index": "movies",
         "_id": "tFvlyZABrLcRtTgpUZhK",
-        "_score": 3.0824862,
+        "_score": 0.33333334,
         "_source": {
           "genres": "Horror",
           "movieId": "1984",
@@ -86,18 +96,8 @@
       },
       {
         "_index": "movies",
-        "_id": "uFvlyZABrLcRtTgpUZhK",
-        "_score": 3.0824862,
-        "_source": {
-          "genres": "Horror",
-          "movieId": "1990",
-          "title": "Prom Night IV: Deliver Us From Evil (1992)"
-        }
-      },
-      {
-        "_index": "movies",
         "_id": "v1vlyZABrLcRtTgpUZpf",
-        "_score": 3.0824862,
+        "_score": 0.31758314,
         "_source": {
           "genres": "Horror",
           "movieId": "2634",
@@ -107,7 +107,7 @@
       {
         "_index": "movies",
         "_id": "yFvlyZABrLcRtTgpUZpf",
-        "_score": 3.0824862,
+        "_score": 0.28906482,
         "_source": {
           "genres": "Horror",
           "movieId": "2652",
@@ -115,5 +115,8 @@
         }
       }
     ]
+  },
+  "profile": {
+    "shards": []
   }
 }
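
The diff is long, so the position changes are easier to see when summarized. The following is a minimal sketch that compares the movieId order of the two responses above and reports only the items whose rank changed; the function name rank_changes is hypothetical, and the two lists are copied from the results shown earlier.

```python
def rank_changes(before: list, after: list) -> dict:
    """Map each ID whose 1-based position changed to (old_rank, new_rank)."""
    old_rank = {item: i for i, item in enumerate(before, start=1)}
    changes = {}
    for new, item in enumerate(after, start=1):
        old = old_rank.get(item)
        if old is not None and old != new:
            changes[item] = (old, new)
    return changes


# movieId order without the pipeline and with the re-ranking pipeline.
before = ["1258", "1322", "1326", "1329", "1623", "1972", "1984", "1990", "2634", "2652"]
after = ["1258", "1322", "1326", "1972", "1329", "1623", "1990", "1984", "2634", "2652"]

for movie_id, (old, new) in rank_changes(before, after).items():
    print(f"movieId {movie_id}: rank {old} -> {new}")
```

For these results, movieId 1972 moves up from 6th to 4th, 1990 and 1984 swap places, and the top three and bottom two stay where they were.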
