
Exporting CloudWatch Logs log events to CSV

Published on 2024/04/15

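About this script

Run it with a log group and a log stream as arguments, and it extracts the stream's log events in CSV format.
The execution log and the CSV file are written to the current directory.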
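
Log messages can contain commas and line breaks, so the CSV is safer to read back with a CSV parser than by splitting lines by hand. The following is a minimal sketch using an in-memory buffer instead of the real file; the `roundtrip` helper and the sample row are made up for illustration.

```python
import csv
import io


def roundtrip(rows):
    """Write rows with csv.writer, then parse them back with csv.reader."""
    # newline='' keeps the csv module in charge of line endings
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    buf.seek(0)
    return list(csv.reader(buf))


if __name__ == '__main__':
    # Hypothetical event: a timestamp and a message with a line break and a comma
    sample = [[1713150000123, 'first line\nsecond line, with a comma']]
    for timestamp, message in roundtrip(sample):
        print(timestamp, repr(message))
```

Note that every value comes back as a string, so the timestamp needs `int()` if it is used numerically.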

Checking the environment

~/environment $ aws --version
aws-cli/2.15.32 Python/3.11.8 Linux/6.1.79-99.167.amzn2023.x86_64 exe/x86_64.amzn.2023 prompt/off

~/environment $ python3 --version
Python 3.9.16

Check whether boto3 is installed

~/environment $ pip3 show boto3
Name: boto3
Version: 1.34.84
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: /home/ec2-user/.local/lib/python3.9/site-packages
Requires: botocore, jmespath, s3transfer
Required-by: 

If it is not installed, install it with the following command

pip3 install boto3
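
The same check can be done from Python, which is handy when `pip3` and `python3` point at different environments. This is a small sketch; `is_installed` is a helper name chosen here, not part of the script.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    print('boto3 installed:', is_installed('boto3'))
```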

Command syntax

python3 get-log.py 'log group' 'log stream'

Source code

get-log.py
import boto3, argparse, csv, traceback, time, logging, datetime

#get current time in JST (used in the log file name)
t_delta = datetime.timedelta(hours=9)
jst = datetime.timezone(t_delta, 'JST')
now = datetime.datetime.now(jst)
d = '{:%Y%m%d%H%M}'.format(now)

#log config
logger = logging.getLogger(__name__)
logger.setLevel('DEBUG')
log_format = '%(asctime)s : %(levelname)s : %(filename)s - %(message)s'
fl_handler = logging.FileHandler(filename='./{}_get_result.log'.format(d),encoding='utf-8')
fl_handler.setFormatter(logging.Formatter(log_format))
logger.addHandler(fl_handler)

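#get log events function (generator that yields one page of events per API call)
def get_log_events(logs, log_gp, log_st):
    #first call: read from the oldest event in the stream
    r = logs.get_log_events(logGroupName=log_gp,logStreamName=log_st,startFromHead=True)
    logger.info('Count of events: ' + str(len(r['events'])))
    logger.info('nextForwardToken: ' + r['nextForwardToken'])
    yield r['events']
    #the list grows by one element per pass, so the for loop runs until break
    loop = [0]
    for i in loop:
        token = r['nextForwardToken']
        #startFromHead must stay True when a previous nextForwardToken is passed as nextToken
        r = logs.get_log_events(logGroupName=log_gp,logStreamName=log_st,nextToken=token,startFromHead=True)
        logger.info('Count of events: ' + str(len(r['events'])))
        logger.info('nextForwardToken: ' + r['nextForwardToken'])
        yield r['events']
        #at the end of the stream the API returns the same token that was passed in
        if r['nextForwardToken'] == token:
            break
        else:
            loop.append(i+1)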

#main function
def main():
    try:
        #get start time
        start = time.perf_counter()

        #argument
        arg_parser = argparse.ArgumentParser()
        arg_parser.add_argument('log_group')
        arg_parser.add_argument('log_stream')
        args = arg_parser.parse_args()
        log_gp = args.log_group
        log_st = args.log_stream

        logs = boto3.client('logs')
        #open csv file (newline='' lets the csv module control line endings)
        with open('log_stream.csv', 'w', encoding='UTF-8', newline='') as f:
            w = csv.writer(f)
            #running total of exported events
            total = 0
            #get log events
            for events in get_log_events(logs, log_gp, log_st):
                total += len(events)
                for event in events:
                    timestamp = event.get('timestamp')
                    message = event.get('message')
                    w.writerow([timestamp,message])
        logger.info('Total number of events: ' + str(total))

        #get end time
        end = time.perf_counter()
        time_diff = end-start
        #processing time
        ptime = datetime.timedelta(seconds=time_diff)
        logger.info('Processing time: ' + str(ptime).rsplit('.')[0])

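    #error handler: record the traceback in the log file as well as on stderr
    except Exception:
        logger.exception('Failed to export log events')
        traceback.print_exc()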

#entry point
if __name__ == '__main__':
    main()
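
The script writes the raw `timestamp` value, which CloudWatch Logs reports as milliseconds since 1970-01-01 00:00:00 UTC. If a readable time is preferred in the CSV, a conversion along these lines could be applied before `writerow`. This is only a sketch; `to_jst_iso` is a name chosen here, and JST is assumed because the script already names its log file in JST.

```python
import datetime

# Same fixed +09:00 offset that get-log.py uses for its log file name
JST = datetime.timezone(datetime.timedelta(hours=9), 'JST')


def to_jst_iso(timestamp_ms: int) -> str:
    """Convert epoch milliseconds to an ISO 8601 string in JST."""
    # Split with integer arithmetic to avoid float rounding of the milliseconds
    seconds, millis = divmod(timestamp_ms, 1000)
    dt = datetime.datetime.fromtimestamp(seconds, tz=JST)
    dt = dt.replace(microsecond=millis * 1000)
    return dt.isoformat(timespec='milliseconds')


if __name__ == '__main__':
    print(to_jst_iso(0))  # 1970-01-01T09:00:00.000+09:00
```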

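Personal notes

The pagination loop is built with for instead of while (it iterates over a list that is extended by one element on each pass), which seemed to make processing slightly faster.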
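
For comparison, the stop condition of the `get_log_events` generator (finish when `nextForwardToken` equals the token that was just sent) can be tried without AWS access. The sketch below writes the loop in its `while` form and runs it against `FakeLogs`, a hypothetical stand-in for `boto3.client('logs')` that serves canned pages. Both `FakeLogs` and `iter_event_pages` are names invented here, and no timing claim is made about either loop style.

```python
class FakeLogs:
    """Hypothetical stub that mimics only the fields get-log.py relies on."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = 0

    def get_log_events(self, logGroupName, logStreamName,
                       startFromHead=True, nextToken=None):
        self.calls += 1
        index = 0 if nextToken is None else int(nextToken)
        if index < len(self.pages):
            return {'events': self.pages[index],
                    'nextForwardToken': str(index + 1)}
        # End of stream: no events, and the given token comes back unchanged
        return {'events': [], 'nextForwardToken': nextToken}


def iter_event_pages(logs, log_group, log_stream):
    """Yield one page of events per call until the forward token repeats."""
    kwargs = {'logGroupName': log_group, 'logStreamName': log_stream,
              'startFromHead': True}
    while True:
        r = logs.get_log_events(**kwargs)
        yield r['events']
        if r['nextForwardToken'] == kwargs.get('nextToken'):
            break
        kwargs['nextToken'] = r['nextForwardToken']


if __name__ == '__main__':
    fake = FakeLogs([[{'timestamp': 1, 'message': 'a'}],
                     [{'timestamp': 2, 'message': 'b'}]])
    for page in iter_event_pages(fake, 'group', 'stream'):
        print(len(page), 'events')
    print('API calls:', fake.calls)
```

The last page is empty because one extra call is needed to see the token repeat, which matches the extra "Count of events: 0" line the script logs at the end.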

References

https://qiita.com/ozzy3/items/fd79d07f42215298e38d
