Trying out Auto Scaling for ECS on Fargate with CDK

Hello. I'm Yamanashi, a Web Backend Engineer.

1. Overview

In this article, I explain how to configure ECS Auto Scaling using CDK.
I cover the following two types of scaling:

  • Automatic scaling
  • Scheduled scaling

1.1 Automatic Scaling

Automatic scaling is a feature that allows an ECS service to automatically increase or decrease the number of tasks in response to the load.
In this guide, we will configure it to scale automatically based on the following conditions:

  • Increase task count by 1 (scale-out) when the average CPU utilization is 30% or higher.
  • Decrease task count by 1 (scale-in) when the average CPU utilization is 20% or lower.

*Note 1. Scale-out: Increasing the number of virtual machines (tasks in this case) that make up the system.
*Note 2. Scale-in: Decreasing the number of virtual machines (tasks in this case) that make up the system.
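
As a rough illustration of the two conditions above, the decision for a single evaluation can be written as a small pure function. This is only a sketch of the rule as stated in this article, not how ECS or CDK implements it; the name taskCountChange and the two constants are hypothetical.

```typescript
// Thresholds taken from the two conditions listed above (average CPU utilization, %).
const SCALE_OUT_THRESHOLD = 30;
const SCALE_IN_THRESHOLD = 20;

// Hypothetical helper: returns how the task count changes for one evaluation.
function taskCountChange(averageCpuPercent: number): number {
  if (averageCpuPercent >= SCALE_OUT_THRESHOLD) return 1; // scale-out by one task
  if (averageCpuPercent <= SCALE_IN_THRESHOLD) return -1; // scale-in by one task
  return 0; // between the thresholds: keep the current task count
}
```

Values strictly between 20% and 30% fall into neither condition, so the task count stays as it is in that range.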

1.2 Scheduled Scaling

Scheduled scaling is a feature that increases or decreases the number of ECS service tasks at a time or on a day of the week that you specify in advance.
In this guide, we will configure the task count to be automatically adjusted as follows:

  • Scale out at 8:00 (increase the minimum number of tasks to 3).
  • Scale in at 18:00 (decrease the minimum number of tasks to 1).
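
The schedule above can be pictured as a minimum task count that depends on the hour. The following is a minimal sketch under that assumption; minimumTasksAt and effectiveTaskCount are hypothetical names, and the hour is assumed to be the local hour (0 to 23).

```typescript
// Hypothetical helper: the minimum task count in effect at a given local hour (0-23).
function minimumTasksAt(hour: number): number {
  // From 8:00 until just before 18:00 the minimum is 3; otherwise it is 1.
  return hour >= 8 && hour < 18 ? 3 : 1;
}

// The service is never allowed to run fewer tasks than the current minimum.
function effectiveTaskCount(currentTasks: number, hour: number): number {
  return Math.max(currentTasks, minimumTasksAt(hour));
}
```

Note that, as far as I understand, lowering the minimum at 18:00 only allows the service to shrink. The task count actually drops to 1 once the scale-in condition on CPU utilization is met.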

2. Implementation

In this article, we build the service with an L3 construct called ApplicationLoadBalancedFargateService.

The complete code for the stack is as follows:

// lib/server-stack.ts

import * as path from "node:path";
import type { StackProps } from "aws-cdk-lib";
import { CfnOutput, Duration, Stack, TimeZone } from "aws-cdk-lib";
// NOTE: scaleOnMetric belongs to Application Auto Scaling, so MetricAggregationType is imported from the same module
import { MetricAggregationType, Schedule } from "aws-cdk-lib/aws-applicationautoscaling";
import type { Construct } from "constructs";
import { Vpc } from "aws-cdk-lib/aws-ec2";
import { ContainerImage, CpuArchitecture, FargateTaskDefinition, OperatingSystemFamily } from "aws-cdk-lib/aws-ecs";
import { ApplicationLoadBalancedFargateService } from "aws-cdk-lib/aws-ecs-patterns";

export class ServerStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // NOTE: Create VPC
    const vpc = new Vpc(this, "Vpc", { maxAzs: 2 });

    // NOTE: Create task definition
    const taskDefinition = new FargateTaskDefinition(
      this,
      "TaskDefinition",
      {
        runtimePlatform: {
          operatingSystemFamily: OperatingSystemFamily.LINUX,
          cpuArchitecture: CpuArchitecture.ARM64,
        },
      },
    );
    taskDefinition.addContainer("AppContainer", {
      image: ContainerImage.fromAsset(path.resolve(__dirname, "../")),
      portMappings: [{
        containerPort: 80,
        hostPort: 80,
      }],
    });

    // NOTE: Create service with Fargate launch type
    const fargateService = new ApplicationLoadBalancedFargateService(this, "FargateService", {
      taskDefinition,
      vpc,
    });

    // NOTE: Configure Auto Scaling target
    const scaling = fargateService.service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 5,
    });

    // NOTE: Scale out/in based on CPU utilization
    scaling.scaleOnMetric("StepScaling", {
      metric: fargateService.service.metricCpuUtilization({
        period: Duration.minutes(1), // Get CPU utilization at 1-minute intervals
      }),
      scalingSteps: [
        { lower: 30, change: +1 }, // Increase task count by 1 if CPU utilization is 30% or higher
        { upper: 20, change: -1 }, // Decrease task count by 1 if CPU utilization is 20% or lower
      ],
      metricAggregationType: MetricAggregationType.AVERAGE, // Set to scale based on the average value
      cooldown: Duration.minutes(1), // Set scaling cooldown period to 1 minute
    });

    // NOTE: Scale out at 8:00
    scaling.scaleOnSchedule("ScaleOutSchedule", {
      timeZone: TimeZone.ASIA_TOKYO,
      schedule: Schedule.cron({ hour: "8", minute: "0" }),
      minCapacity: 3,
    });

    // NOTE: Scale in at 18:00
    scaling.scaleOnSchedule("ScaleInSchedule", {
      timeZone: TimeZone.ASIA_TOKYO,
      schedule: Schedule.cron({ hour: "18", minute: "0" }),
      minCapacity: 1,
    });

    // NOTE: Output the Load Balancer DNS name
    new CfnOutput(this, "LoadBalancerDNS", {
      value: fargateService.loadBalancer.loadBalancerDnsName,
    });
  }
}
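
For reference, the two scaleOnSchedule calls above pass only hour and minute to Schedule.cron. As far as I understand, the omitted fields are filled with wildcards, which yields a six-field cron expression. The sketch below mirrors that behavior in plain TypeScript. It is not CDK's implementation, and SimpleCronOptions and toCronExpression are hypothetical names.

```typescript
// Hypothetical stand-in for the cron options used in this article.
interface SimpleCronOptions {
  minute?: string;
  hour?: string;
  day?: string;
  month?: string;
  weekDay?: string;
  year?: string;
}

// Builds an expression of the form cron(minute hour day month weekDay year).
function toCronExpression(options: SimpleCronOptions): string {
  const minute = options.minute ?? "*";
  const hour = options.hour ?? "*";
  const month = options.month ?? "*";
  const year = options.year ?? "*";
  // Day-of-month and day-of-week are not both specified: the unused one becomes "?".
  const day = options.day ?? (options.weekDay !== undefined ? "?" : "*");
  const weekDay = options.weekDay ?? "?";
  return `cron(${minute} ${hour} ${day} ${month} ${weekDay} ${year})`;
}
```

Under these assumptions, { hour: "8", minute: "0" } becomes cron(0 8 * * ? *), that is, every day at 8:00 in the time zone given by timeZone.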

Supplement on the implementation (why scaleOnMetric was used)

To scale based on CPU utilization, I used the scaleOnMetric method of the ScalableTaskCount class (the relevant part is the code block below).

// NOTE: Scale out/in based on CPU utilization
scaling.scaleOnMetric("StepScaling", {
  metric: fargateService.service.metricCpuUtilization({
    period: Duration.minutes(1), // Get CPU utilization at 1-minute intervals
  }),
  scalingSteps: [
    { lower: 30, change: +1 }, // Increase task count by 1 if CPU utilization is 30% or higher
    { upper: 20, change: -1 }, // Decrease task count by 1 if CPU utilization is 20% or lower
  ],
  metricAggregationType: MetricAggregationType.AVERAGE, // Set to scale based on the average value
  cooldown: Duration.minutes(1), // Set scaling cooldown period to 1 minute
});


There is also another way: the scaleOnCpuUtilization method.
With scaleOnCpuUtilization, the code would look like this:

// NOTE: Target tracking that keeps the average CPU utilization around 30%
scaling.scaleOnCpuUtilization("CpuScaling", {
  targetUtilizationPercent: 30,
  scaleInCooldown: Duration.minutes(1),
  scaleOutCooldown: Duration.minutes(1),
});


This approach makes the code simpler, but I could not find a way to specify the CPU utilization threshold for scaling in with scaleOnCpuUtilization.

I suspect that the scale-in condition is created automatically as follows
(unless disableScaleIn is set to true in scaleOnCpuUtilization):

  • When the value specified in targetUtilizationPercent is N,
    scale-in occurs if the CPU utilization remains below N * 0.9 for 15 consecutive minutes.


Since I wanted to manually configure the scale-in settings for learning purposes this time, I used the scaleOnMetric method.
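
To make the suspected scale-in rule above concrete, here is a minimal sketch of it in plain TypeScript. It encodes only the guess described in this section (90% of the target, 15 consecutive one-minute datapoints), not confirmed behavior of the service. scaleInThreshold and wouldScaleIn are hypothetical names.

```typescript
// Assumed values, taken from the guess described above.
const SCALE_IN_RATIO_PERCENT = 90; // scale-in threshold is 90% of the target
const REQUIRED_DATAPOINTS = 15; // consecutive one-minute datapoints below the threshold

// For a target of N percent, returns N * 0.9 (computed with integers to avoid rounding noise).
function scaleInThreshold(targetUtilizationPercent: number): number {
  return (targetUtilizationPercent * SCALE_IN_RATIO_PERCENT) / 100;
}

// Returns true if the most recent 15 readings are all below the scale-in threshold.
function wouldScaleIn(cpuReadings: number[], targetUtilizationPercent: number): boolean {
  if (cpuReadings.length < REQUIRED_DATAPOINTS) return false;
  const threshold = scaleInThreshold(targetUtilizationPercent);
  return cpuReadings.slice(-REQUIRED_DATAPOINTS).every((cpu) => cpu < threshold);
}
```

With targetUtilizationPercent set to 30 as in the example above, the threshold in this sketch is 27%, which differs from the 20% scale-in threshold configured with scaleOnMetric.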

3. Verification

3.1. Deployment

Now that the implementation is complete, let's deploy and verify its operation.
Run the following commands one by one to deploy.

cdk bootstrap
cdk deploy

Once the deployment is complete, the Load Balancer's DNS name will be output, so copy it.
(This will be used to verify the automatic scaling.)


3.2. Verifying the Configuration

After deployment, you can open the ECS Management Console to check if the Auto Scaling settings have been correctly applied.
(Auto Scaling settings can be verified under the "Configuration and networking" tab of the ECS service.)

[Figure: Auto Scaling settings in the management console]


3.3. Testing Automatic Scaling

Next, let's run a command to increase the load on the service to verify if it scales correctly.
*Note: The test is performed starting from a state with 1 task.

How to reset the task count to 1

Run the following command:

# Set the EcsClusterName and EcsServiceName environment variables in advance
aws ecs update-service \
    --cluster "$EcsClusterName" \
    --service "$EcsServiceName" \
    --desired-count 1

Run the following commands one by one to send multiple requests to the application.

# Assign the Load Balancer DNS name to an environment variable
export LoadBalancerDNS=sample.elb.amazonaws.com # The Load Balancer DNS name copied earlier
# Use the Apache Bench command to apply load (specify the number of requests with `-n` and concurrency with `-c`)
ab -n 1000 -c 100 http://$LoadBalancerDNS/


After running the commands above, the following results were confirmed from the ECS metrics and ECS events:

  • CPUUtilization Average exceeded 30%

[Figure: CPUUtilization average]

  • The policy was triggered, and the number of running tasks increased.
    • The task count increased gradually over time.

[Figure: scaling policy triggered]


Additionally, after stopping the command above, it was confirmed that CPU utilization decreased and the number of tasks decreased accordingly.

[Figure: CPU utilization decrease]
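
The gradual increase and the later decrease observed above can be reproduced with a very rough model: one evaluation per minute (matching the 1-minute period and cooldown), a change of one task per evaluation, and the result clamped to the configured minimum and maximum. This is a simplified sketch, not a reproduction of how the alarms are actually evaluated. It also ignores the fact that CPU utilization per task falls as tasks are added. simulateTaskCounts is a hypothetical name.

```typescript
// Hypothetical simulation: returns the task count after each one-minute evaluation.
function simulateTaskCounts(
  cpuReadings: number[],
  startingTasks: number,
  minCapacity: number,
  maxCapacity: number,
): number[] {
  const counts: number[] = [];
  let tasks = startingTasks;
  for (const cpu of cpuReadings) {
    let change = 0;
    if (cpu >= 30) change = 1; // scale-out step
    else if (cpu <= 20) change = -1; // scale-in step
    // Never go below minCapacity or above maxCapacity.
    tasks = Math.min(maxCapacity, Math.max(minCapacity, tasks + change));
    counts.push(tasks);
  }
  return counts;
}

// Five minutes of high load followed by five minutes of low load, starting from 1 task.
const simulatedCounts = simulateTaskCounts([50, 50, 50, 50, 50, 10, 10, 10, 10, 10], 1, 1, 5);
console.log(simulatedCounts.join(" -> "));
```

In this model the count climbs one task at a time up to the maximum of 5 and then falls back to 1 in the same way, which is consistent with the step-by-step changes seen in the ECS events.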


3.4. Testing Scheduled Scaling

It was confirmed that scheduled scaling is also working correctly.

  • At 8:00, the task count became 3.

[Figure: scale-out at 8:00]

  • At 18:00, the task count became 1.

[Figure: scale-in at 18:00]

4. Conclusion

In this article, I explained how to implement Auto Scaling on ECS on Fargate.
I hope you find it helpful!

GitHub repository for this article

https://github.com/ren-yamanashi/ecs-auto-scaling
