How to Migrate a Monolith to ECS Fargate Without Downtime
Quick summary: Migrating a monolith from on-premises or EC2 to ECS Fargate enables containerization and serverless compute. This guide covers zero-downtime migration: deploying containers, gradual traffic shifting, and rollback strategies.
Key Takeaways
- Migrating a monolith from on-premises or EC2 to ECS Fargate enables containerization and serverless compute
- A blue-green pattern with weighted ALB target groups lets you shift traffic gradually and roll back in seconds
Migrating from a monolithic architecture to containers on ECS Fargate unlocks auto-scaling, cost efficiency, and deployment flexibility. The challenge is doing it without downtime — customers expect 24/7 availability.
This guide walks through a zero-downtime migration strategy: containerizing your application, deploying to Fargate in parallel, gradually shifting traffic, and implementing rollback for safety.
Migrating to AWS? FactualMinds helps enterprises execute zero-downtime migrations to ECS, Fargate, and Kubernetes. See our AWS migration services or talk to our team.
Step 1: Understand the Migration Architecture
Zero-downtime migration uses a blue-green deployment pattern:
Users
↓
Load Balancer (ALB)
├─→ Blue (old monolith) — 100% traffic
└─→ Green (new Fargate) — 0% traffic
↓
(after validation)
├─→ Blue (old monolith) — 90% traffic
└─→ Green (new Fargate) — 10% traffic
↓
(after monitoring)
├─→ Blue (old monolith) — 0% traffic (stop)
└─→ Green (new Fargate) — 100% traffic
Key components:
- Application Load Balancer (ALB): Routes traffic between versions
- Target Groups: Blue (old) and Green (new) app versions
- Listener Rules: Gradually shift traffic
- Health Checks: Validate Green before accepting traffic
- Monitoring: Catch issues in real-time
- Rollback Plan: Shift back to Blue if Green fails
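The gradual shift works because ALB weighted target groups route each request to a group with probability proportional to its weight. A quick simulation (plain Python, not AWS code) shows what a 90/10 split looks like over 10,000 requests:

```python
import random

def route(weights):
    """Pick a target group for one request, proportional to ALB-style weights."""
    # random.choices performs weighted sampling, like the ALB's per-request choice
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names])[0]

random.seed(0)  # deterministic for illustration
counts = {"blue": 0, "green": 0}
for _ in range(10_000):
    counts[route({"blue": 90, "green": 10})] += 1

print(counts)  # roughly 9000 blue / 1000 green
```

Because each request is routed independently, even a 10% weight exposes Green to real production traffic within seconds, which is what makes early error detection possible.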
Step 2: Containerize Your Monolith
Create a Dockerfile for your application:
Example: Node.js Monolith
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 8080
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s --retries=3 \
CMD node -e "require('http').get('http://localhost:8080/health', (r) => {if (r.statusCode !== 200) throw new Error(r.statusCode)})"
CMD ["node", "server.js"]
Example: Python Django/Flask Monolith
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8080
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health').read()"
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "--workers", "4", "app:app"]
Example: PHP/Laravel Monolith
FROM php:8.2-cli-alpine
WORKDIR /app
# Composer is not bundled with the official PHP image
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
COPY composer.json composer.lock ./
RUN composer install --no-dev --no-scripts
COPY . .
EXPOSE 8080
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s --retries=3 \
CMD wget --quiet --tries=1 --spider http://localhost:8080/health || exit 1
# Built-in server for simplicity; use php-fpm behind nginx for production workloads
CMD ["php", "-S", "0.0.0.0:8080", "-t", "public"]
Key points:
- Expose port 8080 (or your app’s port)
- Include a /health endpoint and a HEALTHCHECK instruction
- Use node -e, python -c, or wget for the health check command
- Set --start-period to cover app startup time (30s or more for slow apps)
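All three Dockerfiles assume the app serves a /health route. If yours does not have one yet, the shape is simple; here is a stdlib-only Python sketch (your framework’s routing will differ, and a real handler should also verify database and cache connectivity before returning 200):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # A real check would verify DB/cache connectivity here before replying 200
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

# Bind to an ephemeral port and serve in the background
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

port = server.server_address[1]
status = urllib.request.urlopen(f"http://127.0.0.1:{port}/health").status
print(status)  # 200
server.shutdown()
```

The key property is that /health returns 200 only when the app can actually serve traffic; a handler that always returns 200 defeats both the Docker HEALTHCHECK and the ALB health checks configured later.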
Build and push to ECR:
aws ecr create-repository --repository-name my-monolith --region us-east-1
docker build -t my-monolith:latest .
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker tag my-monolith:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-monolith:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-monolith:latest
Step 3: Create ALB and Target Groups
Set up the load balancer:
# Create ALB
aws elbv2 create-load-balancer \
--name my-monolith-alb \
--subnets subnet-1 subnet-2 \
--security-groups sg-alb \
--scheme internet-facing \
--type application
# Get ALB ARN
ALB_ARN="arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-monolith-alb/50dc6c495c0c9188"
# Create Blue target group (old monolith on EC2 or on-prem)
aws elbv2 create-target-group \
--name blue-tg \
--protocol HTTP \
--port 8080 \
--vpc-id vpc-12345 \
--health-check-protocol HTTP \
--health-check-path /health \
--health-check-interval-seconds 10 \
--health-check-timeout-seconds 3 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3
# Create Green target group (new Fargate)
aws elbv2 create-target-group \
--name green-tg \
--protocol HTTP \
--port 8080 \
--vpc-id vpc-12345 \
--target-type ip \
--health-check-protocol HTTP \
--health-check-path /health \
--health-check-interval-seconds 10 \
--health-check-timeout-seconds 3 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3
# Create listener (initially all traffic to Blue)
aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTP \
--port 80 \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/50dc6c495c0c9188
Step 4: Register Blue Targets (Old Monolith)
Register existing instances to the Blue target group:
# For EC2 instances
aws elbv2 register-targets \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/50dc6c495c0c9188 \
--targets Id=i-1234567890abcdef0 Id=i-0987654321fedcba0
# For on-premises servers, register by IP
# (requires the target group to have been created with --target-type ip)
aws elbv2 register-targets \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/50dc6c495c0c9188 \
--targets Id=10.0.1.100,Port=8080
Verify Blue targets are healthy:
aws elbv2 describe-target-health \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/50dc6c495c0c9188
Step 5: Create ECS Cluster and Deploy to Fargate
Create ECS cluster:
aws ecs create-cluster --cluster-name my-monolith-cluster --region us-east-1
# Create CloudWatch log group
aws logs create-log-group --log-group-name /ecs/my-monolith
Create task definition:
cat > task-definition.json <<EOF
{
"family": "my-monolith",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"containerDefinitions": [
{
"name": "my-monolith",
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-monolith:latest",
"portMappings": [
{
"containerPort": 8080,
"hostPort": 8080,
"protocol": "tcp"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/my-monolith",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
}
}
]
}
EOF
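The task definition above requests 512 CPU units with 1024 MiB of memory. Fargate only accepts specific CPU/memory pairings, and register-task-definition rejects invalid ones. A small sanity-check sketch can catch a bad pair before you call the API (the table below is hard-coded for the common sizes; verify against current AWS documentation, as larger 8/16 vCPU sizes also exist):

```python
# Valid Fargate CPU (units) -> memory (MiB) combinations for the classic sizes.
FARGATE_COMBOS = {
    256: {512, 1024, 2048},
    512: set(range(1024, 4097, 1024)),     # 1-4 GB in 1 GB steps
    1024: set(range(2048, 8193, 1024)),    # 2-8 GB
    2048: set(range(4096, 16385, 1024)),   # 4-16 GB
    4096: set(range(8192, 30721, 1024)),   # 8-30 GB
}

def valid_fargate_size(cpu: str, memory: str) -> bool:
    """Check a task definition's cpu/memory strings against Fargate's allowed pairs."""
    return int(memory) in FARGATE_COMBOS.get(int(cpu), set())

print(valid_fargate_size("512", "1024"))  # True: matches the task definition above
print(valid_fargate_size("512", "512"))   # False: too little memory for 512 CPU units
```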
aws ecs register-task-definition --cli-input-json file://task-definition.json
Create ECS service (Green):
aws ecs create-service \
--cluster my-monolith-cluster \
--service-name my-monolith-service \
--task-definition my-monolith:1 \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-1,subnet-2],securityGroups=[sg-app],assignPublicIp=DISABLED}" \
--load-balancers targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx,containerName=my-monolith,containerPort=8080 \
--deployment-configuration "maximumPercent=150,minimumHealthyPercent=50" \
--enable-ecs-managed-tags
Wait for Green targets to be healthy:
aws elbv2 describe-target-health \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx
Step 6: Test Green Deployment
Before shifting traffic, validate Green in staging:
# Get ALB DNS name
ALB_DNS=$(aws elbv2 describe-load-balancers \
--load-balancer-arns $ALB_ARN \
--query 'LoadBalancers[0].DNSName' \
--output text)
# Test Blue (currently serving all traffic)
curl http://$ALB_DNS/api/status
# Create a temporary listener to test Green directly
aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTP \
--port 8081 \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx
# Test Green on port 8081
curl http://$ALB_DNS:8081/api/status
# Remove temporary listener
aws elbv2 delete-listener --listener-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-monolith-alb/50dc6c495c0c9188/f2f7dc8efc022ab2
Run smoke tests against Green to validate:
- Database connectivity
- Cache/Redis access
- External API calls
- Critical user flows
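Those smoke tests can be scripted as a small harness that runs each named check and collects failures. A sketch in Python, where the checks shown are hypothetical stand-ins for real calls against the Green listener on port 8081:

```python
def run_smoke_tests(checks):
    """Run named check callables; return a list of (name, passed, error) tuples."""
    results = []
    for name, check in checks.items():
        try:
            check()
            results.append((name, True, None))
        except Exception as exc:  # any exception means the check failed
            results.append((name, False, str(exc)))
    return results

# Hypothetical checks: real ones would hit the Green endpoints, e.g.
# assert urlopen(f"http://{alb_dns}:8081/api/status").status == 200
def check_status():
    pass  # simulated passing check

def check_db():
    raise ConnectionError("database unreachable")  # simulated failing check

results = run_smoke_tests({"status": check_status, "db": check_db})
for name, ok, err in results:
    print(f"{name}: {'PASS' if ok else 'FAIL (' + err + ')'}")
```

Gate the traffic shift on every check passing; a single failure here is far cheaper than the same failure at 50% production traffic.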
Step 7: Shift Traffic Gradually (Blue-Green Deployment)
Shift traffic in stages to catch issues early:
Stage 1: 10% to Green
# Update listener to 10% Green, 90% Blue using weighted target groups
aws elbv2 create-rule \
--listener-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-monolith-alb/50dc6c495c0c9188/f2f7dc8efc022ab2 \
--priority 1 \
--conditions Field=path-pattern,Values='/*' \
--actions Type=forward,ForwardConfig="{TargetGroups:[{TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/xxxxx,Weight=90},{TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx,Weight=10}]}"
Monitor for 30-60 minutes:
- Check Fargate logs: aws logs tail /ecs/my-monolith --follow
- Monitor errors in APM (Datadog, New Relic)
- Check database performance
- Verify no customer complaints
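The go/no-go decision at each stage boils down to simple arithmetic on the metrics you are watching. A minimal sketch, assuming you have already pulled the 5xx count and total request count for the window (the 1% threshold here is a judgment call, not an AWS default):

```python
def error_rate_pct(count_5xx: float, total_requests: float) -> float:
    """5xx error rate as a percentage; 0 when there was no traffic."""
    if total_requests == 0:
        return 0.0
    return 100.0 * count_5xx / total_requests

def stage_is_healthy(count_5xx: float, total_requests: float,
                     threshold_pct: float = 1.0) -> bool:
    """Gate for proceeding to the next traffic stage."""
    return error_rate_pct(count_5xx, total_requests) <= threshold_pct

print(stage_is_healthy(3, 1000))   # True: 0.3% error rate
print(stage_is_healthy(50, 1000))  # False: 5% error rate
```

Pattern 2 below wires this same calculation up to live CloudWatch metrics for automated rollback.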
Stage 2: 50% to Green
# Update rule to 50-50
aws elbv2 modify-rule \
--rule-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:rule/app/my-monolith-alb/50dc6c495c0c9188/f2f7dc8efc022ab2 \
--actions Type=forward,ForwardConfig="{TargetGroups:[{TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/xxxxx,Weight=50},{TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx,Weight=50}]}"
Monitor again for 30-60 minutes.
Stage 3: 100% to Green
# Update listener to 100% Green
aws elbv2 modify-rule \
--rule-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:rule/app/my-monolith-alb/50dc6c495c0c9188/f2f7dc8efc022ab2 \
--actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green-tg/xxxxx
Keep Blue running as a rollback safety net for 2-4 weeks.
Step 8: Monitor and Validate
Set up CloudWatch dashboards and alarms:
aws cloudwatch put-metric-alarm \
--alarm-name "Fargate-High-CPU" \
--alarm-description "Alert if Fargate CPU > 80%" \
--metric-name CPUUtilization \
--namespace AWS/ECS \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-team
aws cloudwatch put-metric-alarm \
--alarm-name "Green-TG-Unhealthy" \
--alarm-description "Alert if Green targets become unhealthy" \
--metric-name UnHealthyHostCount \
--namespace AWS/ApplicationELB \
--statistic Sum \
--period 60 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-team
Step 9: Rollback Strategy
If Green fails, immediately shift traffic back to Blue:
# Shift back to 100% Blue
aws elbv2 modify-rule \
--rule-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:rule/app/my-monolith-alb/50dc6c495c0c9188/f2f7dc8efc022ab2 \
--actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue-tg/xxxxx
Rollback time: 30 seconds. Most customers won’t notice.
Fix the issue in Fargate, then restart the migration:
- Review Fargate logs to identify the error
- Fix the code or configuration
- Deploy new image
- Restart traffic shifting (10% → 50% → 100%)
Production Patterns
Pattern 1: Canary Deployments
Instead of discrete 10%/50%/100%, use a continuous canary:
# Route 5% of traffic to Green every 5 minutes
for i in {1..20}; do
WEIGHT=$((i * 5))
echo "Shifting $WEIGHT% to Green"
aws elbv2 modify-rule \
--rule-arn $RULE_ARN \
--actions Type=forward,ForwardConfig="{TargetGroups:[{TargetGroupArn=$BLUE_TG,Weight=$((100-WEIGHT))},{TargetGroupArn=$GREEN_TG,Weight=$WEIGHT}]}"
sleep 300 # Wait 5 minutes
done
Pattern 2: Automated Rollback
Set up automatic rollback if error rate spikes:
import time

import boto3

cloudwatch = boto3.client('cloudwatch')
elbv2 = boto3.client('elbv2')

def check_error_rate(threshold=5):
    """Check if the Green target group's 5xx error rate exceeds threshold (%)."""
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/ApplicationELB',
        MetricName='HTTPCode_Target_5XX_Count',
        Dimensions=[{'Name': 'TargetGroup', 'Value': 'green-tg'}],
        StartTime=time.time() - 300,
        EndTime=time.time(),
        Period=60,
        Statistics=['Sum']
    )
    total_5xx = sum(dp['Sum'] for dp in response['Datapoints'])
    total_requests = cloudwatch.get_metric_statistics(...)  # query RequestCount the same way
    error_rate = (total_5xx / total_requests) * 100 if total_requests else 0
    return error_rate > threshold

if check_error_rate(threshold=5):
    print("ERROR RATE > 5%, ROLLING BACK")
    elbv2.modify_rule(...)  # shift the weighted rule back to 100% Blue
Common Mistakes to Avoid
No health checks
- Fargate starts tasks but the app crashes silently
- Always add a /health endpoint and configure health checks
Shifting traffic too fast
- Shift 100% in one go and any issue becomes customer-facing
- Do it gradually: 10% → 50% → 100% over hours
No monitoring
- If nobody watches metrics while traffic shifts, failures go unnoticed
- Set up Datadog, New Relic, or CloudWatch dashboards
Stopping Blue too soon
- If you stop Blue after an hour and Green turns out to have a bug, you can’t roll back
- Keep Blue running for 2-4 weeks minimum
Misconfigured security groups
- Fargate security group blocks database access
- Verify egress rules allow database/cache connections
Migration Checklist
- Dockerfile created and tested locally
- Image pushed to ECR
- ALB created with Blue and Green target groups
- Blue targets registered and healthy
- ECS cluster and task definition created
- Fargate service deployed, Green targets healthy
- Smoke tests pass against Green
- Monitoring dashboard created
- Rollback procedure documented
- On-call team trained on rollback
- Stage 1 (10%): Shift traffic, monitor 1 hour
- Stage 2 (50%): Shift traffic, monitor 1 hour
- Stage 3 (100%): Shift traffic, monitor 2-4 weeks
- Shutdown Blue after 4 weeks
Next Steps
- Containerize your monolith (1-2 days)
- Set up ALB and target groups (2 hours)
- Deploy Fargate in staging (4 hours)
- Smoke test Green (2 hours)
- Shift traffic gradually in production (4-6 hours)
- Monitor for 2-4 weeks, then shutdown Blue
- Talk to FactualMinds if you need help executing a large-scale migration or want guidance on breaking the monolith into microservices
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.


