Translated by AI
Trying out Auto Scaling for ECS on Fargate with CDK
Hello. I'm Yamanashi, a Web Backend Engineer.
1. Overview
In this article, I will explain how to configure ECS Auto Scaling using CDK.
I will introduce the following two aspects of Auto Scaling:
- Automatic scaling
- Scheduled scaling
1.1 Automatic Scaling
Automatic scaling is a feature that allows an ECS service to automatically increase or decrease the number of tasks in response to the load.
In this guide, we will configure it to scale automatically based on the following conditions:
- Increase task count by 1 (scale-out) when the average CPU utilization is 30% or higher.
- Decrease task count by 1 (scale-in) when the average CPU utilization is 20% or lower.
*Note 1. Scale-out: Increasing the number of virtual machines (tasks in this case) that make up the system.
*Note 2. Scale-in: Decreasing the number of virtual machines (tasks in this case) that make up the system.
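To make the step rule above concrete, here is a small standalone TypeScript sketch (illustrative only, not CDK code; the function name `desiredChange` is my own) of the decision the step-scaling policy implements:

```typescript
// Standalone sketch of the step-scaling rule described above (illustrative only).
function desiredChange(avgCpuPercent: number): number {
  if (avgCpuPercent >= 30) return +1; // 30% or higher: scale out by 1 task
  if (avgCpuPercent <= 20) return -1; // 20% or lower: scale in by 1 task
  return 0; // between 20% and 30%: no change
}

console.log(desiredChange(45)); // 1
console.log(desiredChange(25)); // 0
console.log(desiredChange(10)); // -1
```

In the actual service, CloudWatch alarms drive the scaling, but the thresholds behave as sketched here.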
1.2 Scheduled Scaling
Scheduled scaling is a feature that allows you to specify a pre-determined time or day of the week to scale in or out, increasing or decreasing the number of ECS service tasks accordingly.
In this guide, we will configure the task count to be automatically adjusted as follows:
- Scale out at 8:00 (increase the minimum number of tasks to 3).
- Scale in at 18:00 (decrease the minimum number of tasks to 1).
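The intended effect on the minimum capacity over a day (JST) can be sketched as a tiny helper (illustrative only; `minCapacityAt` is a hypothetical name, not part of the stack):

```typescript
// Illustrative sketch: the minimum task count in effect at a given hour (JST),
// given the 8:00 scale-out and 18:00 scale-in schedules above.
function minCapacityAt(hourJst: number): number {
  return hourJst >= 8 && hourJst < 18 ? 3 : 1;
}

console.log(minCapacityAt(9));  // 3
console.log(minCapacityAt(20)); // 1
```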
2. Implementation
In this article, we will build using an L3 Construct called ApplicationLoadBalancedFargateService.
The complete code for the stack created this time is as follows:
// lib/server-stack.ts
import * as path from "node:path";
import type { StackProps } from "aws-cdk-lib";
import { CfnOutput, Duration, Stack, TimeZone } from "aws-cdk-lib";
import { MetricAggregationType, Schedule } from "aws-cdk-lib/aws-applicationautoscaling";
import type { Construct } from "constructs";
import { Vpc } from "aws-cdk-lib/aws-ec2";
import { ContainerImage, CpuArchitecture, FargateTaskDefinition, OperatingSystemFamily } from "aws-cdk-lib/aws-ecs";
import { ApplicationLoadBalancedFargateService } from "aws-cdk-lib/aws-ecs-patterns";
export class ServerStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // NOTE: Create VPC
    const vpc = new Vpc(this, "Vpc", { maxAzs: 2 });

    // NOTE: Create task definition
    const taskDefinition = new FargateTaskDefinition(this, "TaskDefinition", {
      runtimePlatform: {
        operatingSystemFamily: OperatingSystemFamily.LINUX,
        cpuArchitecture: CpuArchitecture.ARM64,
      },
    });
    taskDefinition.addContainer("AppContainer", {
      image: ContainerImage.fromAsset(path.resolve(__dirname, "../")),
      portMappings: [{ containerPort: 80, hostPort: 80 }],
    });

    // NOTE: Create service with Fargate launch type
    const fargateService = new ApplicationLoadBalancedFargateService(this, "FargateService", {
      taskDefinition,
      vpc,
    });

    // NOTE: Configure Auto Scaling target
    const scaling = fargateService.service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 5,
    });

    // NOTE: Scale out/in based on CPU utilization
    scaling.scaleOnMetric("StepScaling", {
      metric: fargateService.service.metricCpuUtilization({
        period: Duration.minutes(1), // Get CPU utilization at 1-minute intervals
      }),
      scalingSteps: [
        { lower: 30, change: +1 }, // Increase task count by 1 if CPU utilization is 30% or higher
        { upper: 20, change: -1 }, // Decrease task count by 1 if CPU utilization is 20% or lower
      ],
      metricAggregationType: MetricAggregationType.AVERAGE, // Scale based on the average value
      cooldown: Duration.minutes(1), // Set the scaling cooldown period to 1 minute
    });

    // NOTE: Scale out at 8:00
    scaling.scaleOnSchedule("ScaleOutSchedule", {
      timeZone: TimeZone.ASIA_TOKYO,
      schedule: Schedule.cron({ hour: "8", minute: "0" }),
      minCapacity: 3,
    });

    // NOTE: Scale in at 18:00
    scaling.scaleOnSchedule("ScaleInSchedule", {
      timeZone: TimeZone.ASIA_TOKYO,
      schedule: Schedule.cron({ hour: "18", minute: "0" }),
      minCapacity: 1,
    });

    // NOTE: Output the Load Balancer DNS name
    new CfnOutput(this, "LoadBalancerDNS", {
      value: fargateService.loadBalancer.loadBalancerDnsName,
    });
  }
}
Implementation details supplement (Why `scaleOnMetric` was used)
Here I used the scaleOnMetric method of the ScalableTaskCount class to scale based on CPU utilization (the relevant part is the code block below).
// NOTE: Scale out/in based on CPU utilization
scaling.scaleOnMetric("StepScaling", {
  metric: fargateService.service.metricCpuUtilization({
    period: Duration.minutes(1), // Get CPU utilization at 1-minute intervals
  }),
  scalingSteps: [
    { lower: 30, change: +1 }, // Increase task count by 1 if CPU utilization is 30% or higher
    { upper: 20, change: -1 }, // Decrease task count by 1 if CPU utilization is 20% or lower
  ],
  metricAggregationType: MetricAggregationType.AVERAGE, // Scale based on the average value
  cooldown: Duration.minutes(1), // Set the scaling cooldown period to 1 minute
});
In fact, there is another approach: the scaleOnCpuUtilization method.
With scaleOnCpuUtilization, the code looks like this:
// NOTE: Scale out when CPU utilization exceeds 30%
scaling.scaleOnCpuUtilization("CpuScaling", {
  targetUtilizationPercent: 30,
  scaleInCooldown: Duration.minutes(1),
  scaleOutCooldown: Duration.minutes(1),
});
While this approach keeps the code simpler, as far as I could find, there is no way to specify the CPU utilization threshold for scaling in with scaleOnCpuUtilization.
I suspect that the scale-in condition is created automatically as follows (unless disableScaleIn in scaleOnCpuUtilization is set to true):
- When the value specified in targetUtilizationPercent is N, scale-in occurs if the CPU utilization remains below N * 0.9 for 15 consecutive minutes.
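My assumed rule can be sketched as follows (this is my reading of the behavior, not a documented formula; `shouldScaleIn` is a hypothetical name for illustration):

```typescript
// Sketch of the assumed target-tracking scale-in condition (an assumption, not an official formula).
function shouldScaleIn(targetUtilizationPercent: number, recentAvgCpu: number[]): boolean {
  const threshold = targetUtilizationPercent * 0.9; // e.g. 27% for a 30% target
  const minutesRequired = 15; // must stay below the threshold for 15 consecutive minutes
  return (
    recentAvgCpu.length >= minutesRequired &&
    recentAvgCpu.slice(-minutesRequired).every((cpu) => cpu < threshold)
  );
}

console.log(shouldScaleIn(30, Array(15).fill(20))); // true
console.log(shouldScaleIn(30, Array(15).fill(28))); // false
```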
Since I wanted to configure the scale-in settings manually for learning purposes, I used the scaleOnMetric method.
3. Verification
3.1. Deployment
Now that the implementation is complete, let's deploy and verify its operation.
Run the following commands one by one to deploy.
cdk bootstrap
cdk deploy
Once the deployment is complete, the Load Balancer's DNS name will be output, so copy it.
(This will be used to verify the automatic scaling.)
3.2. Verifying the Configuration
After deployment, you can open the ECS Management Console to check if the Auto Scaling settings have been correctly applied.
(Auto Scaling settings can be verified under the "Configuration and networking" tab of the ECS service.)

3.3. Testing Automatic Scaling
Next, let's run a command to increase the load on the service to verify if it scales correctly.
*Note: The test is performed starting from a state with 1 task.
To reset the task count to 1, run the following command:
# Values must be assigned to the environment variables in advance
aws ecs update-service \
  --cluster $EcsClusterName \
  --service $EcsServiceName \
  --desired-count 1
Run the following commands one by one to send multiple requests to the application.
# Assign the Load Balancer DNS name to an environment variable
export LoadBalancerDNS=sample.elb.amazonaws.com # The Load Balancer DNS name copied earlier
# Use the Apache Bench command to apply load (specify the number of requests with `-n` and concurrency with `-c`)
ab -n 1000 -c 100 http://$LoadBalancerDNS/
After running the commands above, the following results were confirmed from the ECS metrics and ECS events:
- The average CPUUtilization exceeded 30%.
- The policy was triggered, and the number of running tasks increased.
- The task count increased gradually over time.

Additionally, after stopping the command above, it was confirmed that CPU utilization decreased and the number of tasks decreased accordingly.

3.4. Testing Scheduled Scaling
Scheduled scaling was also confirmed to be working correctly.
- At 8:00, the task count increased to 3.


- At 18:00, the task count decreased to 1.


4. Conclusion
In this article, I explained how to implement Auto Scaling for ECS on Fargate.
I hope you find it helpful!
GitHub repository for this article