---
title: How to Implement Blue/Green Deployments on ECS with CodeDeploy
description: Blue/green deployments eliminate downtime by running two identical production environments. Traffic switches from blue (old) to green (new) instantly. This guide covers CodeDeploy automation, health check validation, and rollback strategies for zero-downtime releases on AWS ECS.
url: https://www.factualminds.com/blog/how-to-implement-blue-green-deployments-ecs-codedeploy/
datePublished: 2026-04-03T00:00:00.000Z
dateModified: 2026-04-16T00:00:00.000Z
author: Palaniappan P
category: DevOps & CI/CD
tags: how-to-guide, codedeploy, ecs, deployments, aws
---

# How to Implement Blue/Green Deployments on ECS with CodeDeploy

> Blue/green deployments eliminate downtime by running two identical production environments. Traffic switches from blue (old) to green (new) instantly. This guide covers CodeDeploy automation, health check validation, and rollback strategies for zero-downtime releases on AWS ECS.

A blue/green deployment runs the new (green) version alongside the current (blue) one and moves traffic over, either all at once or in stages, with automatic rollback if green fails its health checks.

AWS CodeDeploy automates the entire process: deploys new tasks, validates health, shifts traffic, and rolls back on failure — all without manual intervention.

This guide covers setting up blue/green deployments on ECS with CodeDeploy, validating deployments safely, and implementing rollback strategies.

## Step 1: Understand Blue/Green Architecture

```
Before Deployment:
  Load Balancer → Blue Task Set (old version, 100% traffic)

During Deployment:
  Load Balancer → Blue Task Set (100% traffic)
              ↘ Green Task Set (starting, 0% traffic)
              (health check)

After Health Check Pass:
  Load Balancer → Blue Task Set (90% traffic)
              ↘ Green Task Set (10% traffic, canary)
              (monitor metrics)

After Validation:
  Load Balancer → Green Task Set (100% traffic, new version)
              ✓ Blue Task Set (terminated)

If Green Fails:
  Load Balancer → Blue Task Set (100% traffic, old version restored)
```

**Key concepts:**

- **Blue**: Current production version
- **Green**: New version being deployed
- **Task Set**: Group of ECS tasks running the same image
- **Traffic Shift**: Move traffic from blue to green gradually
- **Health Check**: Ensure new tasks are ready before shifting traffic

## Step 2: Create ECS Service with CodeDeploy Integration

Create an ECS service configured for blue/green deployments:

```bash
# Create ECS service with CodeDeploy deployment controller
aws ecs create-service \
  --cluster production \
  --service-name api-service \
  --task-definition api:1 \
  --desired-count 3 \
  --launch-type FARGATE \
  --load-balancers \
    targetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/api/xxx,containerName=api,containerPort=3000 \
  --deployment-controller type=CODE_DEPLOY \
  --network-configuration \
    'awsvpcConfiguration={subnets=[subnet-xxx,subnet-yyy],securityGroups=[sg-xxx],assignPublicIp=DISABLED}' \
  --region us-east-1
```

**Key flags:**

- `--deployment-controller type=CODE_DEPLOY` — enables blue/green via CodeDeploy (not ECS rolling deployment)
- `--load-balancers` — ALB target group where traffic is managed
- `--network-configuration` — VPC settings for tasks

## Step 3: Create CodeDeploy Application

```bash
# Create CodeDeploy application
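aws deploy create-application \
  --application-name api-service \
  --compute-platform ECS

# Create deployment group. ECS blue/green needs the service, the production listener,
# and two target groups ("api-green" is a placeholder name for the second one)
aws deploy create-deployment-group \
  --application-name api-service \
  --deployment-group-name production \
  --deployment-config-name CodeDeployDefault.ECSLinear10PercentEvery3Minutes \
  --service-role-arn arn:aws:iam::123456789012:role/CodeDeployECSRole \
  --deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL \
  --ecs-services clusterName=production,serviceName=api-service \
  --blue-green-deployment-configuration '{"terminateBlueInstancesOnDeploymentSuccess":{"action":"TERMINATE","terminationWaitTimeInMinutes":5},"deploymentReadyOption":{"actionOnTimeout":"CONTINUE_DEPLOYMENT"}}' \
  --load-balancer-info '{"targetGroupPairInfoList":[{"targetGroups":[{"name":"api"},{"name":"api-green"}],"prodTrafficRoute":{"listenerArns":["arn:aws:elasticloadbalancing:region:account:listener/app/api-alb/xxx/xxx"]}}]}'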
```

**Deployment config options:**

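- `CodeDeployDefault.ECSLinear10PercentEvery3Minutes`: shifts 10% of traffic every 3 minutes (`ECSLinear10PercentEvery1Minutes` is also predefined; other intervals need a custom deployment configuration)
- `CodeDeployDefault.ECSCanary10Percent5Minutes`: shifts 10%, waits 5 minutes, then shifts the remaining 90%
- `CodeDeployDefault.ECSAllAtOnce`: shifts 100% of traffic in one step (risky)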
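
To see how these configurations differ in practice, the sketch below models the share of traffic on green over time. It is an illustrative model, not a CodeDeploy API: `linear_schedule` and `canary_schedule` are hypothetical helper names, and the step size and interval are passed in as parameters.

```python
def linear_schedule(step_percent: int, interval_minutes: int) -> list:
    """Return (minute, green_percent) pairs for a linear traffic shift."""
    schedule = []
    minute, percent = 0, 0
    while percent < 100:
        # Each step moves another slice of traffic to green, capped at 100%
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


def canary_schedule(canary_percent: int, bake_minutes: int) -> list:
    """Return (minute, green_percent) pairs for a two-step canary shift."""
    return [(0, canary_percent), (bake_minutes, 100)]


if __name__ == "__main__":
    for minute, percent in linear_schedule(10, 3):
        print(f"t+{minute:>2} min: green {percent}%")
    print(canary_schedule(10, 5))
```

Under this model, a 10% step every 3 minutes reaches 100% at minute 27, while a 10% canary with a 5-minute bake reaches 100% at minute 5.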

## Step 4: Create appspec.yaml for CodeDeploy

Create `appspec.yaml` in your repository root:

```yaml
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        # ARN of the task definition revision to deploy.
        # CloudFormation intrinsics such as !Ref are not valid in an AppSpec file.
        TaskDefinition: 'arn:aws:ecs:us-east-1:123456789012:task-definition/api:2'
        LoadBalancerInfo:
          ContainerName: 'api'
          ContainerPort: 3000
        PlatformVersion: 'LATEST'
        NetworkConfiguration:
          AwsvpcConfiguration:
            Subnets:
              - subnet-xxx
              - subnet-yyy
            SecurityGroups:
              - sg-xxx
            AssignPublicIp: DISABLED

Hooks:
  # Pre-traffic validation: test new version before shifting traffic
  - BeforeAllowTraffic: 'validate-deployment'
  # Post-traffic validation: runs after production traffic has shifted to green
  - AfterAllowTraffic: 'post-deploy-test'
```

**Key sections:**

- `Resources`: the ECS service, the task definition revision to deploy, and the container that receives load balancer traffic
- `Hooks`: Lambda functions that CodeDeploy invokes before and after the traffic shift; a hook that reports `Failed` fails the deployment
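
In a pipeline the task definition ARN changes with every release, so one option is to render the AppSpec at deploy time instead of editing the file by hand. The following is a minimal sketch under that assumption; `build_appspec` is a hypothetical helper, and it emits JSON, which CodeDeploy accepts as an alternative to YAML for ECS AppSpec content.

```python
import json
from typing import Optional


def build_appspec(task_definition_arn: str, container_name: str, container_port: int,
                  before_allow_traffic: Optional[str] = None) -> str:
    """Render an ECS AppSpec as a JSON string for a single deployment."""
    appspec = {
        "version": 0.0,
        "Resources": [{
            "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "TaskDefinition": task_definition_arn,
                    "LoadBalancerInfo": {
                        "ContainerName": container_name,
                        "ContainerPort": container_port,
                    },
                },
            }
        }],
    }
    # Only add a Hooks section when a validation Lambda is supplied
    if before_allow_traffic:
        appspec["Hooks"] = [{"BeforeAllowTraffic": before_allow_traffic}]
    return json.dumps(appspec)


if __name__ == "__main__":
    print(build_appspec(
        "arn:aws:ecs:us-east-1:123456789012:task-definition/api:2",
        "api", 3000, before_allow_traffic="validate-deployment",
    ))
```

The resulting string could be passed to boto3's `create_deployment` as `revision={'revisionType': 'AppSpecContent', 'appSpecContent': {'content': appspec_json}}` rather than uploading a file to S3.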

## Step 5: Create Health Check Validation Lambda

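CodeDeploy invokes a Lambda function at the `BeforeAllowTraffic` hook, before any production traffic shifts. The function needs network access to the task's private IP (for example by running in the same VPC) and must report its result back to CodeDeploy: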

```python
# validate-deployment.py (Lambda function)
import json
import boto3
import urllib3

CLUSTER = 'production'
SERVICE = 'api-service'

codedeploy = boto3.client('codedeploy')
ecs = boto3.client('ecs')
http = urllib3.PoolManager()


def green_task_ip():
    """Return the private IP of one running task in the green task set"""
    service = ecs.describe_services(cluster=CLUSTER, services=[SERVICE])['services'][0]
    # Before traffic shifts, blue is PRIMARY and green is ACTIVE
    green = next(ts for ts in service['taskSets'] if ts['status'] == 'ACTIVE')

    task_arns = ecs.list_tasks(
        cluster=CLUSTER,
        serviceName=SERVICE,
        desiredStatus='RUNNING'
    )['taskArns']
    tasks = ecs.describe_tasks(cluster=CLUSTER, tasks=task_arns)['tasks']
    # Match on task definition (assumes green uses a different revision than blue)
    task = next(t for t in tasks if t['taskDefinitionArn'] == green['taskDefinition'])

    # The ENI attachment lists several details; pick the private IP by name
    for attachment in task['attachments']:
        for detail in attachment['details']:
            if detail['name'] == 'privateIPv4Address':
                return detail['value']
    raise RuntimeError('No private IP found on green task')


def lambda_handler(event, context):
    """Validate a green task before traffic shift"""
    status = 'Failed'
    try:
        ip = green_task_ip()

        # Health check: GET /health
        response = http.request('GET', f'http://{ip}:3000/health', timeout=5)
        if response.status != 200:
            raise RuntimeError(f"HTTP {response.status}")

        data = json.loads(response.data)
        if data.get('status') != 'ok':
            raise RuntimeError(f"Health check failed: {data}")

        print(f"✓ Health check passed for {ip}")
        status = 'Succeeded'
    except Exception as e:
        print(f"✗ Validation failed: {e}")
    finally:
        # Always report the outcome; 'Failed' fails the deployment (triggers rollback if enabled)
        codedeploy.put_lifecycle_event_hook_execution_status(
            deploymentId=event['DeploymentId'],
            lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
            status=status
        )

    return {'statusCode': 200 if status == 'Succeeded' else 500, 'body': f'Validation {status.lower()}'}
```

Package and deploy:

```bash
# Package Lambda
zip function.zip validate-deployment.py

# Create Lambda function
aws lambda create-function \
  --function-name validate-deployment \
  --runtime python3.11 \
  --handler validate-deployment.lambda_handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::123456789012:role/LambdaECSRole

# Give CodeDeploy permission to invoke (the CodeDeploy service role also needs lambda:InvokeFunction on this function)
aws lambda add-permission \
  --function-name validate-deployment \
  --statement-id AllowCodeDeploy \
  --action lambda:InvokeFunction \
  --principal codedeploy.amazonaws.com
```

## Step 6: Create Task Definition with Health Check

ECS health checks ensure tasks are ready before traffic shifts:

```json
{
  "family": "api",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/api:latest",
      "portMappings": [
        {
          "containerPort": 3000,
          "protocol": "tcp"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      },
      "environment": [
        {
          "name": "NODE_ENV",
          "value": "production"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/api",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
```

**Health check config:**

- `interval: 30`: Check every 30 seconds
- `timeout: 5`: 5-second timeout for the health endpoint
- `retries: 3`: Allow 3 consecutive failed checks before marking the container unhealthy
- `startPeriod: 60`: 60-second grace period during which failed checks do not count toward `retries` (app startup time)

## Step 7: Configure ALB Target Group for Traffic Shift

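CodeDeploy shifts traffic by re-weighting the production listener between two target groups, one serving blue and one serving green. Create a second target group with the same health check settings before the first deployment, and start with the listener forwarding to the blue one: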

```bash
# Get target group ARN
TARGET_GROUP_ARN="arn:aws:elasticloadbalancing:region:account:targetgroup/api/xxx"

# Modify listener rules to enable traffic shift
aws elbv2 modify-listener \
  --listener-arn arn:aws:elasticloadbalancing:region:account:listener/app/api-alb/xxx/xxx \
  --default-actions Type=forward,TargetGroupArn=$TARGET_GROUP_ARN

# Verify health check settings
aws elbv2 describe-target-groups \
  --target-group-arns $TARGET_GROUP_ARN \
  --query 'TargetGroups[0].{HealthyCount:HealthyThresholdCount,UnhealthyCount:UnhealthyThresholdCount}'
```
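
Because the blue and green target groups swap roles on every deployment, their health check settings should match. The sketch below compares two target group descriptions and reports any differences. `health_check_differences` is a hypothetical helper; the keys are the health check fields returned by `describe_target_groups`, and the sample values are made up for illustration.

```python
HEALTH_CHECK_KEYS = (
    "HealthCheckProtocol",
    "HealthCheckPort",
    "HealthCheckPath",
    "HealthCheckIntervalSeconds",
    "HealthCheckTimeoutSeconds",
    "HealthyThresholdCount",
    "UnhealthyThresholdCount",
)


def health_check_differences(blue: dict, green: dict) -> dict:
    """Return {setting: (blue_value, green_value)} for every setting that differs."""
    return {
        key: (blue.get(key), green.get(key))
        for key in HEALTH_CHECK_KEYS
        if blue.get(key) != green.get(key)
    }


if __name__ == "__main__":
    # Sample dictionaries shaped like entries of describe_target_groups()['TargetGroups']
    blue = {"HealthCheckPath": "/health", "HealthyThresholdCount": 2, "UnhealthyThresholdCount": 3}
    green = {"HealthCheckPath": "/", "HealthyThresholdCount": 2, "UnhealthyThresholdCount": 3}
    print(health_check_differences(blue, green))
```

With boto3, the two inputs would come from `elbv2.describe_target_groups(TargetGroupArns=[...])['TargetGroups']`; an empty result means the settings agree.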

## Step 8: Trigger Deployment via CodeDeploy

When you push a new Docker image, trigger a CodeDeploy deployment:

```bash
# Option 1: Manual trigger
aws deploy create-deployment \
  --application-name api-service \
  --deployment-group-name production \
  --revision 'revisionType=S3,s3Location={bucket=my-bucket,key=appspec.yaml,bundleType=YAML}' \
  --deployment-config-name CodeDeployDefault.ECSLinear10PercentEvery3Minutes

# Option 2: From CI/CD pipeline (GitHub Actions example)
```

GitHub Actions workflow:

```yaml
name: Deploy to ECS

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Assumes AWS credentials are already available to this job
      # (for example via the aws-actions/configure-aws-credentials action)
      - name: Build & push Docker image
        run: |
          aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
          docker build -t api:${{ github.sha }} .
          docker tag api:${{ github.sha }} 123456789012.dkr.ecr.us-east-1.amazonaws.com/api:latest
          docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/api:latest

      - name: Register new task definition revision
        run: |
          # The "..." stands for the remaining container fields from Step 6; fill them in before running
          aws ecs register-task-definition \
            --family api \
            --container-definitions '[{"name":"api","image":"123456789012.dkr.ecr.us-east-1.amazonaws.com/api:latest",...}]'

      - name: Deploy with CodeDeploy
        run: |
          # appspec.yaml in S3 must reference the task definition revision registered above
          aws deploy create-deployment \
            --application-name api-service \
            --deployment-group-name production \
            --revision 'revisionType=S3,s3Location={bucket=my-bucket,key=appspec.yaml,bundleType=YAML}'
```
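
One common way to produce the new revision in CI is to fetch the current task definition, swap the image, and register the result. The sketch below covers only the transformation step. `with_new_image` is a hypothetical helper, and the list of read-only fields reflects what `describe_task_definition` typically returns alongside the registrable fields, so treat it as an assumption to verify against your SDK version.

```python
import copy

# Fields returned by describe_task_definition that register_task_definition does not accept
READ_ONLY_FIELDS = (
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
)


def with_new_image(task_definition: dict, container_name: str, image: str) -> dict:
    """Return a copy of the task definition with one container's image replaced."""
    new_def = copy.deepcopy(task_definition)
    for field in READ_ONLY_FIELDS:
        new_def.pop(field, None)

    matched = False
    for container in new_def.get("containerDefinitions", []):
        if container.get("name") == container_name:
            container["image"] = image
            matched = True
    if not matched:
        raise ValueError(f"No container named {container_name!r} in task definition")
    return new_def


if __name__ == "__main__":
    current = {
        "family": "api",
        "revision": 1,
        "containerDefinitions": [{"name": "api", "image": "example/api:old"}],
    }
    print(with_new_image(current, "api", "example/api:new"))
```

With boto3, the input would be `ecs.describe_task_definition(taskDefinition='api')['taskDefinition']`, and the output could be passed to `ecs.register_task_definition(**new_def)`. Tagging images with the commit SHA rather than `latest` makes each revision point at a distinct image.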

## Step 9: Monitor Deployment with CloudWatch

Track deployment health and traffic shift:

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

# Alarm: High error rate on new version
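cloudwatch.put_metric_alarm(
    AlarmName='ECS-Green-Error-Rate',
    MetricName='HTTPCode_Target_5XX_Count',
    Namespace='AWS/ApplicationELB',
    Statistic='Sum',
    Period=60,
    EvaluationPeriods=1,  # required: number of periods that must breach
    Threshold=10,
    ComparisonOperator='GreaterThanThreshold',
    TreatMissingData='notBreaching',  # no datapoints means no 5xx responses
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:deployment-alerts'],
    Dimensions=[
        {'Name': 'LoadBalancer', 'Value': 'app/api-alb/xxx'},
        {'Name': 'TargetGroup', 'Value': 'targetgroup/api/xxx'}
    ]
)

# Alarm: High latency on new version
cloudwatch.put_metric_alarm(
    AlarmName='ECS-Green-High-Latency',
    MetricName='TargetResponseTime',
    Namespace='AWS/ApplicationELB',
    Statistic='Average',
    Period=300,
    EvaluationPeriods=1,
    Threshold=1.0,  # 1 second
    ComparisonOperator='GreaterThanThreshold',
    Dimensions=[
        {'Name': 'LoadBalancer', 'Value': 'app/api-alb/xxx'},
        {'Name': 'TargetGroup', 'Value': 'targetgroup/api/xxx'}
    ]
)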
```
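
Alarms cover the metrics side; the deployment itself can be followed by polling its status. The sketch below is one way to do that, assuming a boto3 CodeDeploy client is passed in. `wait_for_deployment` is a hypothetical helper, and the poll interval and limit are arbitrary defaults.

```python
import time

# Statuses after which a CodeDeploy deployment no longer changes
TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def wait_for_deployment(codedeploy, deployment_id: str, poll_seconds: int = 15,
                        max_polls: int = 240, sleep=time.sleep) -> str:
    """Poll get_deployment until the deployment reaches a terminal status."""
    for _ in range(max_polls):
        info = codedeploy.get_deployment(deploymentId=deployment_id)["deploymentInfo"]
        status = info["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Deployment {deployment_id} did not finish after {max_polls} polls")
```

A pipeline step could call `wait_for_deployment(boto3.client('codedeploy'), deployment_id)` and fail the job on anything other than `Succeeded`. boto3 also provides a `deployment_successful` waiter on the CodeDeploy client for the simpler case.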

## Step 10: Production Patterns

### Pattern 1: Canary Deployment (Safer Traffic Shift)

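Instead of linear 10% shifts, use a canary configuration: 10% of traffic goes to green for 5 minutes, then the remaining 90% follows. Select it with `--deployment-config-name CodeDeployDefault.ECSCanary10Percent5Minutes`. CloudWatch alarms attached to the deployment group (Pattern 2) guard the canary window; the `AfterAllowTraffic` hook runs once all traffic has shifted.

In appspec.yaml:

```yaml
Hooks:
  - BeforeAllowTraffic: 'validate-deployment'
  - AfterAllowTraffic: 'monitor-canary' # Runs after the shift to green completes
```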

Monitor script:

```python
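from datetime import datetime, timedelta, timezone

import boto3


def monitor_canary(event, context):
    """Check green's recent 5xx count and report the result to CodeDeploy"""
    cloudwatch = boto3.client('cloudwatch')
    codedeploy = boto3.client('codedeploy')
    now = datetime.now(timezone.utc)

    # Get error count of green task set over the last 5 minutes
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/ApplicationELB',
        MetricName='HTTPCode_Target_5XX_Count',
        Dimensions=[...],  # fill in the load balancer and green target group dimensions
        StartTime=now - timedelta(minutes=5),
        EndTime=now,
        Period=60,
        Statistics=['Sum']
    )

    error_count = sum(dp['Sum'] for dp in response['Datapoints'])

    # More than 5 errors fails the hook, which fails the deployment
    status = 'Failed' if error_count > 5 else 'Succeeded'
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=event['DeploymentId'],
        lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
        status=status
    )
    return status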
```

### Pattern 2: Instant Rollback on Error Rate

Use CloudWatch alarms to trigger automatic rollback:

```bash
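# If error rate spikes, automatically roll back (updates the deployment group from Step 3)
aws deploy update-deployment-group \
  --application-name api-service \
  --current-deployment-group-name production \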
  --auto-rollback-configuration '{
    "enabled": true,
    "events": ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"]
  }' \
  --alarm-configuration '{
    "enabled": true,
    "alarms": [
      {"name": "ECS-Green-Error-Rate"},
      {"name": "ECS-Green-High-Latency"}
    ]
  }'
```

### Pattern 3: Gradual Environment Variable Updates

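Environment variables are part of the task definition, so a config change means registering a new revision and running another CodeDeploy deployment. Running tasks are replaced, not updated in place, and `update-service` cannot change the task definition on a service that uses the `CODE_DEPLOY` controller:

```bash
# Register a new revision with the updated environment variables
# (task-definition.json is an assumed file name for the Step 6 JSON)
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Point appspec.yaml at the new revision (api:2), upload it, then deploy
aws deploy create-deployment \
  --application-name api-service \
  --deployment-group-name production \
  --revision 'revisionType=S3,s3Location={bucket=my-bucket,key=appspec.yaml,bundleType=YAML}'
```

This creates green tasks with the new config, validates them, shifts traffic, then terminates blue.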

## Common Mistakes

1. **Not configuring ALB health checks**
   - ALB doesn't detect task failures
   - Green tasks marked as "healthy" but app crashes
   - Better: Configure health check in task definition + ALB target group

2. **Too-short task startup period**
   - `startPeriod: 10` (10 seconds)
   - App takes 30 seconds to start, health check fails
   - Task is marked unhealthy and killed
   - Better: Set `startPeriod` to app startup time (60-120 seconds)

3. **No post-deployment monitoring**
   - Deploy green, traffic shifts, app crashes 10 mins later
   - Blue is already terminated, so recovery means a full redeploy while customers are affected
   - Better: Keep blue running for 5-10 mins after the 100% shift (termination wait time) and auto-rollback on alarms

4. **No validation script**
   - Green tasks pass health checks but app bugs cause errors
   - CodeDeploy can't detect logical errors
   - Better: Create validation Lambda that tests critical APIs

5. **Instant traffic shift (`ECSAllAtOnce` instead of a linear or canary config)**
   - All traffic switches to green immediately
   - If green has issues, 100% of traffic affected
   - Better: Use `CodeDeployDefault.ECSLinear10PercentEvery3Minutes` or `CodeDeployDefault.ECSCanary10Percent5Minutes`
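
For mistake 4, a validation Lambda can go beyond `/health` and exercise a few critical endpoints. The sketch below shows the checking logic only, with the HTTP call injected so it can be tested offline. `run_smoke_checks` is a hypothetical helper, and any paths other than `/health` are placeholders for your own API.

```python
from typing import Callable, Iterable, List, Tuple


def run_smoke_checks(base_url: str,
                     checks: Iterable[Tuple[str, int]],
                     fetch: Callable[[str], int]) -> List[str]:
    """Call each (path, expected_status) pair and return a list of failure messages.

    `fetch` takes a URL and returns the HTTP status code.
    """
    failures = []
    for path, expected in checks:
        url = f"{base_url}{path}"
        try:
            status = fetch(url)
        except Exception as exc:
            # Network errors count as failures rather than aborting the run
            failures.append(f"{url}: {exc}")
            continue
        if status != expected:
            failures.append(f"{url}: expected {expected}, got {status}")
    return failures
```

Inside the hook, `fetch` could wrap `urllib3.PoolManager().request('GET', url, timeout=5).status`, and a non-empty result would be reported to CodeDeploy as `Failed`.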

## Cost Estimation

For 3 ECS tasks (256 CPU units, 512 MB memory) on Fargate, assuming an illustrative rate of $0.04 per task-hour (actual Fargate pricing depends on region and task size):

| Phase                            | Cost                                            |
| -------------------------------- | ----------------------------------------------- |
| Steady state (blue only)         | 3 tasks × $0.04/hour = $0.12/hour               |
| During deployment (blue + green) | 6 tasks × $0.04/hour = $0.24/hour               |
| Extra cost per 15-min deployment | 3 green tasks × $0.04/hour × 0.25 hours = $0.03 |
| **Monthly cost increase**        | 4 deployments × $0.03 = $0.12/month             |

Cost is negligible.
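
The arithmetic can be checked with a few lines of Python. This sketch counts only the additional green tasks that run during a deployment; the function names are hypothetical and the $0.04 per task-hour rate is the illustrative figure used in this section, not a quoted AWS price.

```python
def extra_deployment_cost(tasks: int, hourly_rate_per_task: float,
                          deployment_minutes: float) -> float:
    """Cost of running the extra (green) tasks for the length of one deployment."""
    return tasks * hourly_rate_per_task * deployment_minutes / 60


def monthly_extra_cost(tasks: int, hourly_rate_per_task: float,
                       deployment_minutes: float, deployments_per_month: int) -> float:
    """Extra monthly cost across all deployments."""
    return deployments_per_month * extra_deployment_cost(
        tasks, hourly_rate_per_task, deployment_minutes
    )


if __name__ == "__main__":
    per_deploy = extra_deployment_cost(3, 0.04, 15)
    per_month = monthly_extra_cost(3, 0.04, 15, 4)
    print(f"Per deployment: ${per_deploy:.2f}, per month: ${per_month:.2f}")
```

Longer traffic-shift schedules and termination wait times extend `deployment_minutes`, so the real figure scales with the deployment configuration you choose.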

## Next Steps

1. Create ECS service with CodeDeploy deployment controller (30 mins)
2. Create CodeDeploy application and deployment group (15 mins)
3. Write appspec.yaml (20 mins)
4. Create validation Lambda function (30 mins)
5. Update task definition with health checks (15 mins)
6. Configure ALB target group (10 mins)
7. Test deployment in staging (1 hour)
8. Deploy to production (15 mins)
9. Monitor metrics and adjust traffic shift pace (ongoing)
10. [Talk to FactualMinds](/contact-us/) if you need help setting up zero-downtime deployments or CI/CD automation

## FAQ

### What is the difference between blue/green and canary deployments?
Blue/green: Run full old + new versions in parallel and switch traffic all at once. Example: 100% traffic on blue, health check passes on green, traffic goes 100% to green. If green fails, traffic goes back to blue (instant rollback). Canary: Run old + new, send a small share of traffic (for example 10%) to green first, monitor metrics, then shift the rest. With CodeDeploy on ECS, canary and linear are traffic-shifting configurations of a blue/green deployment, so the choice is made through the deployment config. All-at-once is fastest; canary is safer for risky changes (gradual rollout with monitoring). Use all-at-once for low-risk changes, canary or linear for risky ones.

### How does CodeDeploy know when a deployment succeeded?
CodeDeploy checks: (1) ECS task health (the green task set reaches a steady state with healthy tasks), (2) ALB target group health for green, (3) CloudWatch alarms attached to the deployment group (if configured), (4) lifecycle hook Lambdas (if you write them). A deployment succeeds when green is healthy, every hook reports `Succeeded`, and the traffic shift completes without an alarm firing. If any of these fails and auto-rollback is enabled, CodeDeploy rolls back to blue (old version).

### Can I rollback a deployment automatically?
Yes. CodeDeploy can auto-rollback on: (1) deployment failure, including green tasks that never become healthy, (2) a lifecycle hook that reports `Failed`, (3) a CloudWatch alarm entering the ALARM state (e.g., the 5xx alarm from Step 9). Validation hooks (`BeforeAllowTraffic`, `AfterAllowTraffic`) are declared in appspec.yaml; rollback itself is enabled on the deployment group with `--auto-rollback-configuration` and `--alarm-configuration`. If any check fails, CodeDeploy routes traffic back to blue and stops the green tasks.

### How much does blue/green on ECS cost?
Cost is double during deployment (blue + green running), single during steady state. Example: 3 tasks (0.25 vCPU, 512 MB) on Fargate at an illustrative $0.04 per hour per task. A 30-minute deployment window costs ($0.04 × 3 tasks × 2 versions) × 0.5 hours = $0.12 in total, of which $0.06 is the extra green capacity. Steady state: $0.12/hour. Most deployments take 10-30 mins, so the extra cost is negligible.

### What happens if green tasks fail to start?
CodeDeploy marks the deployment as failed when the green task set cannot reach a steady state and, with auto-rollback enabled, rolls back to blue (old version) without manual intervention. No traffic has shifted at that point, so blue tasks continue handling every request during the failure. Logs are available in CloudWatch to debug the issue. Common causes: misconfigured environment variables, insufficient memory, bad container image.

---

*Source: https://www.factualminds.com/blog/how-to-implement-blue-green-deployments-ecs-codedeploy/*
