🧸
Monitoring an application deployed on Alibaba Cloud SAE

Published 2024/11/20
monitoring
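overview
Last time I deployed an application to SAE, so this post covers how to monitor it!
Previous article: Deploying a web app as a container on Alibaba Cloud SAE
There was a Basic Monitoring tab, and opening it showed metrics for the basic items.
I wanted to try raising an alert, so I started by clicking the bell icon.
That took me to CloudMonitor, which asked me to grant permissions, so I clicked OK.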

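CloudMonitor alarm settings
First, a look at the alarm destination settings.
There is a Slack Webhook option, so I'll use that!
For some reason Singapore was the only area I could select for the alarm notification service, but the webhook test passed, so I registered it as is.
Then I added this contact to the Default Contact Group under the alarm contact groups.
That should take care of the contact, I think?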
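
If you want to confirm the webhook itself works outside the console, here is a minimal sketch in Python. It assumes a standard Slack incoming webhook, which accepts a JSON body with a `text` field. The function names and the URL are placeholders of mine, and the payload CloudMonitor actually sends is not shown in this post, so this only checks the Slack side.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a plain-text Slack message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_slack_message("https://hooks.slack.com/services/...", "test from my laptop")` with your real webhook URL should post one message to the channel.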

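Creating an alarm rule
Next, create the alarm rule.
I set it up to fire an alarm when the application's CPU goes above 30%!
By the way, from Cloud Product Monitoring on the dashboard, picking a region and then a service also let me monitor SAE.
So you can create alert rules from the bell icon here too.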
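
To make the idea of a threshold rule concrete, here is a small sketch of how such a rule could be evaluated. The 30% threshold comes from the rule in this post. The "N consecutive samples" condition and the function name are assumptions for illustration, not a description of how CloudMonitor evaluates rules internally.

```python
from typing import Iterable


def should_alarm(
    samples: Iterable[float],
    threshold: float = 30.0,   # CPU % threshold used in this post
    consecutive: int = 3,      # assumed value, not taken from the console
) -> bool:
    """Return True once `consecutive` samples in a row exceed `threshold`."""
    streak = 0
    for value in samples:
        # Extend the streak on a breach, reset it on a normal sample.
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False
```

With these defaults, `should_alarm([12.0, 45.0, 80.0, 95.0])` is true, while a single spike such as `[12.0, 45.0, 20.0, 95.0]` is not enough to fire.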

Actually applying load
I added a prime-counting routine to the code of the SAE application deployed last time, and will use it to put load on the CPU.
from fastapi import FastAPI
import math

app = FastAPI()

def cpu_intensive_task(n: int) -> int:
    # CPU load generated by counting primes with trial division
    count = 0
    for num in range(2, n):
        is_prime = True
        for i in range(2, int(math.sqrt(num)) + 1):
            if num % i == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count

@app.get("/cpu")
def cpu_load(n: int = 10000):
    result = cpu_intensive_task(n)
    return {"prime_count": result}

@app.get("/")
def read_root():
    return {"Hello": "World!"}
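
Before deploying, it can help to sanity-check the function locally and see roughly how long one call takes. The sketch below repeats `cpu_intensive_task` so that it runs on its own. The timing part is my addition, and the elapsed time will depend on your machine.

```python
import math
import time


def cpu_intensive_task(n: int) -> int:
    """Count the primes in [2, n) by trial division (same logic as the app)."""
    count = 0
    for num in range(2, n):
        is_prime = True
        for i in range(2, int(math.sqrt(num)) + 1):
            if num % i == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count


start = time.perf_counter()
count = cpu_intensive_task(10000)
elapsed = time.perf_counter() - start
# There are 1229 primes below 10000, so count should be 1229.
print(f"primes below 10000: {count} ({elapsed:.3f}s)")
```

The work grows quickly with `n`, which is why a large value like the `n=300000` used in the load test keeps a CPU core busy for a while on every request.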
Deploy this to SAE!
For the load test I'm using k6 this time. Nice and easy.
I prepared the following script:
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  // A number specifying the number of VUs to run concurrently.
  vus: 30,
  // A string specifying the total duration of the test run.
  duration: '10s',

  // The following section contains configuration options for execution of this
  // test script in Grafana Cloud.
  //
  // See https://grafana.com/docs/grafana-cloud/k6/get-started/run-cloud-tests-from-the-cli/
  // to learn about authoring and running k6 test scripts in Grafana k6 Cloud.
  //
  // cloud: {
  //   // The ID of the project to which the test is assigned in the k6 Cloud UI.
  //   // By default tests are executed in default project.
  //   projectID: "",
  //   // The name of the test in the k6 Cloud UI.
  //   // Test runs with the same name will be grouped.
  //   name: "script.js"
  // },

  // Uncomment this section to enable the use of Browser API in your tests.
  //
  // See https://grafana.com/docs/k6/latest/using-k6-browser/running-browser-tests/ to learn more
  // about using Browser API in your test scripts.
  //
  // scenarios: {
  //   // The scenario name appears in the result summary, tags, and so on.
  //   // You can give the scenario any name, as long as each name in the script is unique.
  //   ui: {
  //     // Executor is a mandatory parameter for browser-based tests.
  //     // Shared iterations in this case tells k6 to reuse VUs to execute iterations.
  //     //
  //     // See https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/ for other executor types.
  //     executor: 'shared-iterations',
  //     options: {
  //       browser: {
  //         // This is a mandatory parameter that instructs k6 to launch and
  //         // connect to a chromium-based browser, and use it to run UI-based
  //         // tests.
  //         type: 'chromium',
  //       },
  //     },
  //   },
  // }
};

// The function that defines VU logic.
//
// See https://grafana.com/docs/k6/latest/examples/get-started-with-k6/ to learn more
// about authoring k6 scripts.
//
export default function() {
  http.get('http://[SAE endpoint]/cpu?n=300000')
  sleep(1);
}

Now to run it with k6 run script.js!

         /\      Grafana   /‾‾/  
    /\  /  \     |\  __   /  /   
   /  \/    \    | |/ /  /   ‾‾\ 
  /          \   |   (  |  (‾)  |
 / __________ \  |_|\_\  \_____/ 

     execution: local
        script: script.js
        output: -

     scenarios: (100.00%) 1 scenario, 30 max VUs, 40s max duration (incl. graceful stop):
              * default: 30 looping VUs for 10s (gracefulStop: 30s)

     data_received..................: 4.7 kB 119 B/s
     data_sent......................: 2.8 kB 70 B/s
     http_req_blocked...............: avg=16.13ms min=17µs  med=16.59ms max=20.56ms  p(90)=19.01ms p(95)=20.5ms  
     http_req_connecting............: avg=15.89ms min=0s    med=16.58ms max=19.67ms  p(90)=18.99ms p(95)=19.61ms 
     http_req_duration..............: avg=31.67s  min=3.25s med=34.71s  max=38.96s   p(90)=38.95s  p(95)=38.96s  
       { expected_response:true }...: avg=31.67s  min=3.25s med=34.71s  max=38.96s   p(90)=38.95s  p(95)=38.96s  
     http_req_failed................: 0.00%  0 out of 31
     http_req_receiving.............: avg=30.65ms min=48µs  med=299µs   max=803.82ms p(90)=7.19ms  p(95)=45.14ms 
     http_req_sending...............: avg=33.03µs min=6µs   med=10µs    max=355µs    p(90)=37µs    p(95)=170.49µs
     http_req_tls_handshaking.......: avg=0s      min=0s    med=0s      max=0s       p(90)=0s      p(95)=0s      
     http_req_waiting...............: avg=31.64s  min=3.25s med=34.71s  max=38.96s   p(90)=38.94s  p(95)=38.96s  
     http_reqs......................: 31     0.775103/s
     iteration_duration.............: avg=32.69s  min=4.27s med=35.71s  max=39.99s   p(90)=39.97s  p(95)=39.98s  
     iterations.....................: 31     0.775103/s
     vus............................: 8      min=8       max=30
     vus_max........................: 30     min=30      max=30

running (40.0s), 00/30 VUs, 31 complete and 0 interrupted iterations
default ✓ [======================================] 30 VUs  10s
It ran!
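
A quick way to read the summary is to redo the arithmetic on the numbers k6 printed. The sketch below uses only values copied from the output above. The interpretation in the comments is my own reading, assuming k6 computes the rate as count divided by elapsed time.

```python
# Values copied from the k6 summary above.
completed_iterations = 31
http_reqs_per_sec = 0.775103
vus = 30
avg_request_s = 31.67

# rate = count / elapsed, so the elapsed time can be recovered from the rate.
elapsed_s = completed_iterations / http_reqs_per_sec

# How many iterations each VU managed on average.
iterations_per_vu = completed_iterations / vus

print(f"elapsed: {elapsed_s:.1f}s")
print(f"iterations per VU: {iterations_per_vu:.2f}")
print(f"average request time: {avg_request_s}s")
```

The elapsed time comes out at about 40 s rather than the configured 10 s, which matches the "40s max duration (incl. graceful stop)" line: requests already in flight were allowed to finish. At roughly one iteration per VU and about 32 s per request, the endpoint was clearly CPU-bound under 30 concurrent users.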

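Load test results
And looking at the metrics...
The load is definitely showing ✌️
After waiting a few minutes, the alarm arrived in the Slack channel I had configured!
Alarm destinations always send an email as well, so it would be even nicer if it could be set to Slack only.
Still, it was easy to use without any friction, which is nice!
The end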