🔭

Fetching Cloud Monitoring metrics (values) programmatically

Published 2024/11/04

Overview

Do you read Cloud Monitoring metrics (values) by eye every time?
Those values can actually be fetched with a program :D

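Fetching metrics (values) programmatically has the following benefits:

  • You get exact values
  • You can fetch multiple metrics for a specific time in one go

Of course, you can read values reasonably accurately by eye, but for recurring work, automate it with a program and keep toil reduction in mind.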

A typical example of toil

Emailing the Cloud SQL CPU and memory usage every hour

Let's follow SRE principles and reduce toil ( 'ω' و( و"♪

What is Cloud Monitoring?

A cloud-based tool provided by Google Cloud that monitors the health, performance, and resource usage of applications and infrastructure.

https://cloud.google.com/monitoring?hl=en

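Identifying the problem and the toil

  • Checking metrics for a specific time by eye is less accurate
  • The work may have to be repeated on a regular basis

---> In SRE terms, this is toil.

"Toil is work that is manual, repetitive, automatable, tactical, devoid of long-term value, and that grows in proportion to the service."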


Identifying and tracking toil using SRE principles

This kind of work should be automated wherever possible ( 'ω' و( و"♪
As a first step, let's fetch Cloud Monitoring metrics (values) with a program!
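
As a rough idea of what such a fetch can look like, here is a minimal sketch for Cloud SQL CPU utilization. It is not the author's script. It assumes the `google-cloud-monitoring` client library and the `cloudsql.googleapis.com/database/cpu/utilization` metric, and that the metric is reported as a ratio between 0.0 and 1.0. The helper names (`build_filter`, `to_percent`, `fetch_cpu_usage`) and the five-minute window are illustrative choices.

```python
import time

# Assumed metric type for Cloud SQL CPU utilization (reported as a 0.0-1.0 ratio).
CPU_METRIC = "cloudsql.googleapis.com/database/cpu/utilization"


def build_filter(metric_type: str) -> str:
    """Build a Cloud Monitoring filter string for a single metric type."""
    return f'metric.type = "{metric_type}"'


def to_percent(ratio: float) -> float:
    """Convert a utilization ratio to a percentage rounded to 2 decimals."""
    return round(ratio * 100, 2)


def fetch_cpu_usage(project_id: str, minutes: int = 5) -> None:
    """Print recent CPU utilization for each Cloud SQL instance in a project."""
    # Imported lazily so the pure helpers above work without the library installed.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {
            "start_time": {"seconds": now - minutes * 60},
            "end_time": {"seconds": now},
        }
    )
    results = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": build_filter(CPU_METRIC),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    for series in results:
        # "database_id" is the resource label assumed to hold "project:instance".
        print(f"Instance: {series.resource.labels['database_id']}")
        for point in series.points:
            print(f"Time: {point.interval.end_time}")
            print(f"CPU Usage: {to_percent(point.value.double_value)} %")
```

Calling `fetch_cpu_usage("your-project-id")` needs credentials that are allowed to read monitoring data. Timestamps come back in UTC, so converting them to a local time zone is left out here.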

Sample code

Without further ado, here is the sample code (Python) and its sample output.

Cloud SQL Instance

  • CPU usage

https://github.com/iganari/package-gcp/blob/main/monitoring/_gcloud/sample/sql-cpu.py

Sample output
$ python3 sql-cpu.py
======================================================================================
Instance: gc-sample-project:get-metrics-sql-01-source
---------------------
Time: 2024-11-04 02:42:00+09:00
CPU Usage: 3.02 %
---------------------
Time: 2024-11-04 02:41:00+09:00
CPU Usage: 3.02 %
---------------------
Time: 2024-11-04 02:40:00+09:00
CPU Usage: 3.07 %
---------------------
Time: 2024-11-04 02:39:00+09:00
CPU Usage: 3.01 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-sql-01-replica
---------------------
Time: 2024-11-04 02:42:00+09:00
CPU Usage: 2.71 %
---------------------
Time: 2024-11-04 02:41:00+09:00
CPU Usage: 2.75 %
---------------------
Time: 2024-11-04 02:40:00+09:00
CPU Usage: 2.74 %
---------------------
Time: 2024-11-04 02:39:00+09:00
CPU Usage: 2.77 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-sql-02-source
---------------------
Time: 2024-11-04 02:42:00+09:00
CPU Usage: 1.1 %
---------------------
Time: 2024-11-04 02:41:00+09:00
CPU Usage: 1.05 %
---------------------
Time: 2024-11-04 02:40:00+09:00
CPU Usage: 0.96 %
---------------------
Time: 2024-11-04 02:39:00+09:00
CPU Usage: 1.04 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-sql-02-replica
---------------------
Time: 2024-11-04 02:42:00+09:00
CPU Usage: 1.06 %
---------------------
Time: 2024-11-04 02:41:00+09:00
CPU Usage: 1.01 %
---------------------
Time: 2024-11-04 02:40:00+09:00
CPU Usage: 0.86 %
---------------------
Time: 2024-11-04 02:39:00+09:00
CPU Usage: 0.78 %
======================================================================================

  • Memory usage

https://github.com/iganari/package-gcp/blob/main/monitoring/_gcloud/sample/sql-memory.py

Sample output
$ python3 sql-memory.py
======================================================================================
Instance: gc-sample-project:get-metrics-01-source
---------------------
Time: 2024-11-04 02:43:00+09:00
Memory Usage: 63.2 %
---------------------
Time: 2024-11-04 02:42:00+09:00
Memory Usage: 63.2 %
---------------------
Time: 2024-11-04 02:41:00+09:00
Memory Usage: 63.2 %
---------------------
Time: 2024-11-04 02:40:00+09:00
Memory Usage: 63.19 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-01-replica
---------------------
Time: 2024-11-04 02:43:00+09:00
Memory Usage: 11.53 %
---------------------
Time: 2024-11-04 02:42:00+09:00
Memory Usage: 11.53 %
---------------------
Time: 2024-11-04 02:41:00+09:00
Memory Usage: 11.55 %
---------------------
Time: 2024-11-04 02:40:00+09:00
Memory Usage: 11.54 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-02-source
---------------------
Time: 2024-11-04 02:43:00+09:00
Memory Usage: 43.29 %
---------------------
Time: 2024-11-04 02:42:00+09:00
Memory Usage: 43.29 %
---------------------
Time: 2024-11-04 02:41:00+09:00
Memory Usage: 43.29 %
---------------------
Time: 2024-11-04 02:40:00+09:00
Memory Usage: 43.29 %
======================================================================================
======================================================================================
Instance: gc-sample-project:get-metrics-02-replica
---------------------
Time: 2024-11-04 02:43:00+09:00
Memory Usage: 50.52 %
---------------------
Time: 2024-11-04 02:42:00+09:00
Memory Usage: 50.52 %
---------------------
Time: 2024-11-04 02:41:00+09:00
Memory Usage: 50.52 %
---------------------
Time: 2024-11-04 02:40:00+09:00
Memory Usage: 50.52 %
======================================================================================

Memorystore for Redis

  • CPU usage

https://github.com/iganari/package-gcp/blob/main/monitoring/_gcloud/sample/memorystore-redis-cpu.py

Sample output
$ python3 memorystore-redis-cpu.py
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-01
Node: node-0
---------------------
Time: 2024-11-04 02:50:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:49:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:48:31.110000+09:00
CPU Usage: 0.0 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-01
Node: node-1
---------------------
Time: 2024-11-04 02:50:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:49:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:48:31.110000+09:00
CPU Usage: 0.0 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-02
Node: node-0
---------------------
Time: 2024-11-04 02:57:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:56:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:55:31.110000+09:00
CPU Usage: 0.0 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-02
Node: node-1
---------------------
Time: 2024-11-04 02:57:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:56:31.110000+09:00
CPU Usage: 0.0 %
---------------------
Time: 2024-11-04 02:55:31.110000+09:00
CPU Usage: 0.0 %
======================================================================================

  • Memory usage

https://github.com/iganari/package-gcp/blob/main/monitoring/_gcloud/sample/memorystore-redis-memory.py

Sample output
$ python3 memorystore-redis-memory.py
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-01
Node: node-0
---------------------
Time: 2024-11-04 03:02:47.790000+09:00
Memory Usage: 11.04 %
---------------------
Time: 2024-11-04 03:01:47.790000+09:00
Memory Usage: 10.99 %
---------------------
Time: 2024-11-04 03:00:47.790000+09:00
Memory Usage: 10.97 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-01
Node: node-1
---------------------
Time: 2024-11-04 03:02:47.790000+09:00
Memory Usage: 11.03 %
---------------------
Time: 2024-11-04 03:01:47.790000+09:00
Memory Usage: 10.99 %
---------------------
Time: 2024-11-04 03:00:47.790000+09:00
Memory Usage: 10.96 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-02
Node: node-0
---------------------
Time: 2024-11-04 03:02:47.790000+09:00
Memory Usage: 11.01 %
---------------------
Time: 2024-11-04 03:01:47.790000+09:00
Memory Usage: 10.99 %
---------------------
Time: 2024-11-04 03:00:47.790000+09:00
Memory Usage: 10.95 %
======================================================================================
======================================================================================
Instance: projects/gc-sample-project/locations/asia-northeast1/instances/get-metrics-redis-02
Node: node-1
---------------------
Time: 2024-11-04 03:02:47.790000+09:00
Memory Usage: 11.02 %
---------------------
Time: 2024-11-04 03:01:47.790000+09:00
Memory Usage: 10.99 %
---------------------
Time: 2024-11-04 03:00:47.790000+09:00
Memory Usage: 10.96 %
======================================================================================
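
The Memorystore output above is grouped per instance and per node. A minimal sketch of how those header and value lines could be built from a time series is shown here. It is not taken from the linked scripts. It assumes the `redis_instance` monitored resource exposes `instance_id` and `node_id` labels and that `redis.googleapis.com/stats/memory/usage_ratio` is a 0.0 to 1.0 ratio. All function names are hypothetical.

```python
# Assumed metric type for Memorystore for Redis memory usage (0.0-1.0 ratio).
MEMORY_METRIC = "redis.googleapis.com/stats/memory/usage_ratio"


def build_redis_filter(metric_type: str = MEMORY_METRIC) -> str:
    """Build a filter restricted to Redis instances for one metric type."""
    return f'metric.type = "{metric_type}" AND resource.type = "redis_instance"'


def series_header(resource_labels: dict) -> list:
    """Return the per-series header lines from the resource labels."""
    return [
        f"Instance: {resource_labels.get('instance_id', 'unknown')}",
        f"Node: {resource_labels.get('node_id', 'unknown')}",
    ]


def format_point(timestamp, ratio: float) -> list:
    """Return the lines for one data point, with the ratio shown as a percentage."""
    return [f"Time: {timestamp}", f"Memory Usage: {round(ratio * 100, 2)} %"]
```

Keeping the formatting in small pure functions like these makes it easy to check the output without calling the API.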

---> We have confirmed that Cloud Monitoring metrics (values) can be fetched accurately with a program (Python) :D

Next actions

By placing a program like the ones above on an event-driven PaaS (such as Cloud Run Jobs or Cloud Run Functions) and triggering it from a scheduling service (Cloud Scheduler), you can fetch accurate metrics (values) on a regular basis.
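
If the program runs as a Cloud Run job, it helps for it to exit with a non-zero code on failure so that a failed run is visible. Here is a minimal sketch of such an entrypoint. The `PROJECT_ID` environment variable, the `run` wrapper, and the `fetch_metrics` placeholder are all assumptions made for illustration, and the actual metric query is left as a stub.

```python
import sys


def run(env: dict, fetch) -> int:
    """Run one collection pass and return a process exit code."""
    project_id = env.get("PROJECT_ID")  # hypothetical variable name
    if not project_id:
        print("PROJECT_ID is not set", file=sys.stderr)
        return 1
    try:
        fetch(project_id)
    except Exception as exc:
        # Report the error and signal failure to the job runner.
        print(f"metric collection failed: {exc}", file=sys.stderr)
        return 1
    return 0


def fetch_metrics(project_id: str) -> None:
    """Placeholder for the actual Cloud Monitoring query."""
    print(f"collecting metrics for {project_id}")


# In the container entrypoint one would call, for example:
#   import os
#   sys.exit(run(dict(os.environ), fetch_metrics))
```

Passing the environment and the fetch function in as arguments keeps `run` easy to exercise locally before wiring it to a schedule.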

That looks like one more piece of toil we can reduce ( 'ω' و( و"♪

Enjoy SRE && Have fun :)
