---
title: GitHub Actions for AWS: Secure CI/CD Pipeline Patterns That Ship Code Safely
description: Production-grade GitHub Actions patterns for AWS workloads — OIDC authentication, pinned actions, blue-green deployments, build caching, and the security mistakes that leave your pipeline open to supply chain attacks.
url: https://www.factualminds.com/blog/github-actions-aws-cicd-security-best-practices/
datePublished: 2026-03-25T00:00:00.000Z
dateModified: 2026-06-10T00:00:00.000Z
author: palaniappan-p
category: DevOps & CI/CD
tags: github-actions, cicd, devops, aws, security
---

# GitHub Actions for AWS: Secure CI/CD Pipeline Patterns That Ship Code Safely

> Production-grade GitHub Actions patterns for AWS workloads — OIDC authentication, pinned actions, blue-green deployments, build caching, and the security mistakes that leave your pipeline open to supply chain attacks.

In March 2025, the tj-actions/changed-files GitHub Action was compromised. Attackers pushed malicious code that printed CI secrets (including AWS credentials, npm tokens, and GitHub tokens) to workflow logs, which anyone can read on a public repository. Over 23,000 repositories had used this action. Every one of them was potentially exposed.

The pipeline that was supposed to automate safe deployments had become the attack surface.

This is not an edge case. Supply chain attacks targeting CI/CD systems have increased every year since 2020. The combination of broad repository access, stored cloud credentials, and automated execution makes a poorly configured GitHub Actions pipeline one of the most dangerous assets in your infrastructure.

This guide covers the six non-negotiable security principles and the production deployment patterns — OIDC federation, pinned actions, least-privilege permissions, blue-green deployments, build caching, and environment promotion — that ship code safely on AWS.

## The Six Non-Negotiables

Before any implementation detail, these six principles apply to every pipeline, every workflow, every job:

| Principle                                      | What it means                                                                                     |
| ---------------------------------------------- | ------------------------------------------------------------------------------------------------- |
| **Secrets never touch logs — ever**            | No `echo $SECRET`, no debug output, no credential printing under any condition                    |
| **Pin everything**                             | Actions, Docker images, and dependencies are pinned to immutable versions                         |
| **Least privilege always**                     | GITHUB_TOKEN permissions, IAM roles, and cloud credentials are scoped to exactly what's needed    |
| **Rollback faster than deploy**                | Every production deployment has a rollback path that executes faster than the original deployment |
| **Test in staging what you run in production** | CI environment uses identical Docker images and configs to production                             |
| **Every deployment is reversible**             | No forward-only deployments; every release can be unwound                                         |

These are not aspirational guidelines. They are the baseline. Every pattern in this guide is built on top of them.

## OIDC Federation: Eliminate AWS Access Keys From CI/CD

The single most impactful security change you can make to your GitHub Actions pipelines is eliminating stored AWS credentials.

The old approach — storing `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` as GitHub secrets — has a fundamental problem: these are long-lived credentials. If your workflow is compromised, the attacker has keys that remain valid until someone notices and rotates them. In many incidents, that window is days or weeks.

**OIDC federation eliminates this entirely.** GitHub Actions can request a short-lived JWT from GitHub's OIDC provider, and AWS will exchange that token for temporary credentials scoped to a specific IAM role. No stored secrets. No rotation required. Credentials expire automatically when the session ends (one hour by default with `configure-aws-credentials`, shorter if you set `role-duration-seconds`).
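To make the exchange concrete, here is a minimal Python sketch of what the identity token carries. It builds a stand-in token by hand (the claim values and the dummy signature are assumptions for illustration, not a real GitHub token) and decodes the payload the way you might when debugging a trust policy. It does not verify the signature; AWS STS does that during the real exchange.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_claims(jwt: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = jwt.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hand-built stand-in for a GitHub OIDC token (illustrative values only)
claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "aud": "sts.amazonaws.com",
    "sub": "repo:your-org/your-repo:ref:refs/heads/main",
}
token = ".".join([
    b64url(b'{"alg":"RS256","typ":"JWT"}'),
    b64url(json.dumps(claims).encode()),
    "dummy-signature",
])

decoded = decode_claims(token)
print(decoded["aud"])  # the audience the trust policy checks
print(decoded["sub"])  # the subject the trust policy checks
```

The `aud` and `sub` claims are the two values the IAM trust policy conditions are evaluated against.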

### Setting Up OIDC

**Step 1: Create the IAM OIDC Identity Provider in AWS**

```bash
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1
```

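This is a one-time setup per AWS account. For GitHub's endpoint, AWS validates the TLS certificate against its own trusted root CAs, so the thumbprint value is not what secures this connection.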

**Step 2: Create the IAM Role with a Trust Policy**

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:*"
        }
      }
    }
  ]
}
```

The `sub` condition locks this role to your specific repository. An attacker who compromises a different repository cannot assume this role.

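For tighter control, restrict to a specific branch. Note that a job declaring `environment: production` presents `repo:your-org/your-repo:environment:production` as its subject instead of the branch ref, so the condition must match whichever form your deploy job uses: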

```json
"token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:ref:refs/heads/main"
```
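How these `sub` patterns behave can be checked offline. The sketch below approximates IAM's `StringLike` operator (`*` matches any run of characters, `?` matches exactly one) and runs a few subject claims through two patterns. The helper name `iam_string_like` and the sample repositories are assumptions for illustration; the authoritative evaluation happens inside AWS.

```python
import re


def iam_string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' is any run of characters, '?' is one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Subject claims as GitHub formats them for different kinds of jobs
main_push = "repo:your-org/your-repo:ref:refs/heads/main"
feature_push = "repo:your-org/your-repo:ref:refs/heads/feature-x"
pull_request = "repo:your-org/your-repo:pull_request"
prod_env_job = "repo:your-org/your-repo:environment:production"
other_repo = "repo:someone-else/fork:ref:refs/heads/main"

repo_wide = "repo:your-org/your-repo:*"
main_only = "repo:your-org/your-repo:ref:refs/heads/main"

for sub in (main_push, feature_push, pull_request, prod_env_job, other_repo):
    print(
        f"{sub:55} repo-wide={iam_string_like(repo_wide, sub)} "
        f"main-only={iam_string_like(main_only, sub)}"
    )
```

The repo-wide pattern also admits pull request and feature branch runs, which is why a deploy role is usually better served by the narrower condition.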

**Step 3: Use the Role in Your Workflow**

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # Required for OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: us-east-1
```

No `AWS_ACCESS_KEY_ID`. No `AWS_SECRET_ACCESS_KEY`. The `configure-aws-credentials` action handles the OIDC token exchange automatically.

**Result:** Temporary credentials that expire automatically (after one hour by default, or as little as 15 minutes via `role-duration-seconds`) and are scoped to exactly the IAM role you defined. If the workflow is compromised, the attacker gets credentials that expire within the hour and are limited to what that specific deployment role allows.

## Pinning Actions Against Supply Chain Attacks

The tj-actions incident demonstrated what happens when a widely used action is compromised at a mutable reference. The attack vector is simple: an attacker gains write access to an action repository and repoints a tag like `@v1`, or pushes malicious code to a branch like `@main`. Every workflow using that reference gets the malicious version on its next run.

### Understanding the Risk Levels

| Reference            | Example                                                           | Risk                                             |
| -------------------- | ----------------------------------------------------------------- | ------------------------------------------------ |
| `@latest` or `@main` | `uses: actions/checkout@main`                                     | Critical — any push is immediately live          |
| Short tag            | `uses: actions/checkout@v4`                                       | High — tags can be moved to different commits    |
| Full semver          | `uses: actions/checkout@v4.1.1`                                   | Low — but tags remain mutable                    |
| **SHA digest**       | `uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683` | **Zero — immutable, cryptographically verified** |

**Recommendation:**

- For official GitHub Actions (`actions/*`) and AWS official actions (`aws-actions/*`): `@v4` or `@v4.x.x` is acceptable — these organizations have strong security practices and release processes
- For third-party community actions: SHA digest only
- For any action with access to credentials or secrets: SHA digest always

```yaml
# Acceptable — official, well-maintained actions at version tag
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: aws-actions/configure-aws-credentials@v4

# Required for third-party actions — SHA pin
- uses: some-community/action@a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
```

To find the commit SHA for any action version, check the action's release page or run:

```bash
# The commits endpoint resolves a tag (lightweight or annotated) to its commit SHA
gh api repos/actions/checkout/commits/v4 --jq '.sha'
```

**Add a comment with the version for human readability:**

```yaml
# actions/checkout@v4.2.2
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
```

Tools like `Dependabot` and `Renovate` can automate SHA pin updates when new versions are released.
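A pinning policy is easier to keep when something checks it. The following Python sketch scans workflow text for `uses:` references that are not pinned to a full 40-character commit SHA. It is a line-based heuristic rather than a YAML parser, and the function name `unpinned_actions` and the sample workflow are assumptions for illustration.

```python
import re

# A full commit SHA is exactly 40 hexadecimal characters
SHA_REF = re.compile(r"^[0-9a-f]{40}$")
# Matches "uses: owner/repo@ref" with or without a leading list dash
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit SHA."""
    findings = []
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        if not match:
            continue
        reference = match.group(1).strip("'\"")
        if reference.startswith("./") or reference.startswith("docker://"):
            continue  # local actions and images are out of scope for this sketch
        _, _, ref = reference.partition("@")
        if not SHA_REF.match(ref):
            findings.append(reference)
    return findings


sample = """
steps:
  - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
  - uses: actions/setup-node@v4
  - name: Third-party step
    uses: some-community/action@main
"""

print(unpinned_actions(sample))
```

Run against every file under `.github/workflows/`, a check like this could fail a pull request that introduces a mutable reference.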

## Least-Privilege Permission Scoping

By default, `GITHUB_TOKEN` is granted permissions based on your repository settings, and older repositories and organizations often still default to read/write access across most scopes. A compromised workflow with write permissions can push commits, create or alter releases, publish packages, and edit issues and pull requests.

Always declare explicit permissions at the job level:

```yaml
jobs:
  # Read-only PR check — no write access needed
  test:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  # Deploy job needs specific write permissions
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # OIDC token for AWS
      contents: read # Checkout code
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: ./scripts/deploy.sh
```

**Common permission patterns:**

| Workflow Type             | Permissions Needed                  |
| ------------------------- | ----------------------------------- |
| Build and test only       | `contents: read`                    |
| Deploy with OIDC          | `contents: read`, `id-token: write` |
| Create GitHub release     | `contents: write`                   |
| Comment on PR             | `pull-requests: write`              |
| Push Docker image to GHCR | `packages: write`                   |

Set the workflow-level default to the most restrictive option:

```yaml
# At the top of every workflow file
permissions:
  contents: read # Workflow-wide default for every job

jobs:
  # Individual jobs override only what they need
  deploy:
    permissions:
      id-token: write
      contents: read
```

Never use `permissions: write-all`. If a step fails with a permission error, add only the specific permission it needs — do not escalate to write-all as a shortcut.
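These rules can also be linted. The Python sketch below flags a workflow that has no workflow-level `permissions:` block or that uses `write-all`. It is a regex heuristic over the raw text, not a YAML parser, and the function name and sample workflows are assumptions for illustration.

```python
import re


def permission_findings(workflow_text: str) -> list[str]:
    """Flag workflows with no top-level permissions block or with write-all."""
    findings = []
    # A workflow-level `permissions:` key starts at column 0
    if not re.search(r"^permissions:", workflow_text, flags=re.MULTILINE):
        findings.append("missing workflow-level permissions")
    # write-all can appear at workflow level or inside a job
    if re.search(r"^[ \t]*permissions:[ \t]*write-all\b", workflow_text, flags=re.MULTILINE):
        findings.append("write-all in use")
    return findings


locked_down = """\
name: CI
permissions:
  contents: read
jobs:
  test:
    runs-on: ubuntu-latest
"""

too_broad = """\
name: Deploy
jobs:
  deploy:
    permissions: write-all
    runs-on: ubuntu-latest
"""

print(permission_findings(locked_down))
print(permission_findings(too_broad))
```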

## Blue-Green Deployments via GitHub Actions + CodeDeploy

Blue-green deployment is the production deployment pattern that eliminates downtime and enables instant rollback. GitHub Actions handles building and pushing your container image; [AWS CodeDeploy](/blog/aws-codepipeline-cicd-pipeline-patterns-for-production/) handles the traffic shifting and automatic rollback.

**Architecture:**

```
GitHub Push (main)
  → GitHub Actions: build, test, push image to ECR
    → Update ECS task definition with new image SHA
      → CodeDeploy: Create green task set
        → Health checks pass
          → Traffic shift: 10% → green (canary validation)
            → [5 minutes observation]
              → 100% traffic → green
                → Blue task set retained for 1-hour rollback window
```

### Complete Workflow

```yaml
name: Deploy to Production

on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: my-app
  ECS_SERVICE: my-app-service
  ECS_CLUSTER: production
  CONTAINER_NAME: my-app

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    outputs:
      image: ${{ steps.build-image.outputs.image }}

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    environment: production

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Download task definition
        run: |
          aws ecs describe-task-definition \
            --task-definition my-app \
            --query taskDefinition \
            > task-definition.json

      - name: Update task definition with new image
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: ${{ env.CONTAINER_NAME }}
          image: ${{ needs.build-and-push.outputs.image }}

      - name: Deploy to ECS via CodeDeploy
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}
          cluster: ${{ env.ECS_CLUSTER }}
          wait-for-service-stability: true
          codedeploy-appspec: appspec.json
          codedeploy-application: my-app-codedeploy
          codedeploy-deployment-group: production-deployment-group
```

**`appspec.json` — CodeDeploy traffic shifting config:**

```json
{
  "version": 0.0,
  "Resources": [
    {
      "TargetService": {
        "Type": "AWS::ECS::Service",
        "Properties": {
          "TaskDefinition": "<TASK_DEFINITION>",
          "LoadBalancerInfo": {
            "ContainerName": "my-app",
            "ContainerPort": 3000
          }
        }
      }
    }
  ],
  "Hooks": [
    {
      "BeforeAllowTraffic": "arn:aws:lambda:us-east-1:123456789012:function:PreDeployCheck"
    },
    {
      "AfterAllowTraffic": "arn:aws:lambda:us-east-1:123456789012:function:PostDeployValidation"
    }
  ]
}
```

Tag your ECR images with the **git commit SHA** — `${{ github.sha }}`. This creates a direct, traceable link from every running container back to the exact source code commit that produced it. When a production incident occurs at 2 AM, you need to know exactly what code is running.

**Rollback:** If [CloudWatch alarms](/blog/aws-cloudwatch-observability-metrics-logs-alarms-best-practices/) trigger during the canary window, CodeDeploy automatically shifts traffic back to the blue task set. Blue remains available for one hour after deployment; this rollback window is the deployment group's termination wait time, assumed here to be 60 minutes. If a problem surfaces after the full traffic shift, you can manually trigger a rollback to the previous task set within that window.

## Canary Deployments with Automated Rollback

Canary deployments take a more gradual approach than blue-green: a small percentage of traffic is routed to the new version, then incrementally increased while automated monitoring validates the release.

**Traffic progression:**

```
5% → new version, 95% → current    (10-minute observation)
25% → new version, 75% → current   (10-minute observation)
50% → new version, 50% → current   (10-minute observation)
100% → new version                 (complete)
```

At each step, automated checks query your monitoring system. If error rate or latency exceeds acceptable thresholds, the deployment halts and rolls back to 0%.
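The gating logic described above can be sketched in a few lines of Python. This is an illustrative model, not how CodeDeploy is implemented: `read_metrics` is a hypothetical callback standing in for your monitoring query, and the thresholds (0.5% errors, 2 seconds p99) are assumed values.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Metrics:
    error_rate: float     # fraction of failed requests, e.g. 0.002 = 0.2%
    p99_latency_s: float  # 99th percentile latency in seconds


STEPS = [5, 25, 50, 100]  # traffic percentages from the progression above


def run_canary(
    read_metrics: Callable[[int], Metrics],
    max_error_rate: float = 0.005,
    max_p99_latency_s: float = 2.0,
) -> tuple[str, int]:
    """Walk through STEPS; roll back to 0% on the first threshold breach."""
    for percent in STEPS:
        # In a real pipeline: shift traffic, wait out the observation window,
        # then query monitoring. Here the callback returns canned metrics.
        metrics = read_metrics(percent)
        if metrics.error_rate > max_error_rate or metrics.p99_latency_s > max_p99_latency_s:
            return ("rolled_back", 0)
    return ("completed", 100)


def healthy(percent: int) -> Metrics:
    return Metrics(error_rate=0.001, p99_latency_s=0.8)


def degrades_at_25(percent: int) -> Metrics:
    return Metrics(error_rate=0.02 if percent >= 25 else 0.001, p99_latency_s=0.8)


print(run_canary(healthy))
print(run_canary(degrades_at_25))
```

The important property is that a breach at any step returns traffic to the current version rather than pausing at the failing percentage.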

**CodeDeploy deployment configuration for ECS:**

```
CodeDeployDefault.ECSCanary10Percent5Minutes
  → 10% traffic to new version, wait 5 minutes, then 100%

CodeDeployDefault.ECSLinear10PercentEvery3Minutes
  → +10% every 3 minutes until 100%

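Custom (recommended for production):
  → canary, e.g. 5% for 10 minutes, then 100%
  → or linear, e.g. +25% every 10 minutes
  (a single CodeDeploy configuration is either canary or linear; it cannot
   express a mixed 5% → 25% → 100% schedule on its own)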
```

**Connecting CloudWatch alarms to automatic rollback:**

```yaml
# In your CodeDeploy deployment group configuration
DeploymentGroupConfiguration:
  AlarmConfiguration:
    Alarms:
      - Name: HighErrorRate
      - Name: HighP99Latency
    Enabled: true
    IgnorePollAlarmFailure: false
  AutoRollbackConfiguration:
    Enabled: true
    Events:
      - DEPLOYMENT_FAILURE
      - DEPLOYMENT_STOP_ON_ALARM
```

**Alarm thresholds:**

```yaml
HighErrorRate:
  Metric: HTTPCode_Target_5XX_Count
  Threshold: 10 errors per minute
  EvaluationPeriods: 2
  ComparisonOperator: GreaterThanThreshold

HighP99Latency:
  Metric: TargetResponseTime
  Statistic: p99
  Threshold: 2 seconds
  EvaluationPeriods: 2
  ComparisonOperator: GreaterThanThreshold
```

The error rate threshold is deliberately conservative. On a service handling 10,000 requests per second, a deployment that raises the error rate by just 0.01% produces about 3,600 additional failed requests per hour. Catch it at 5% traffic before it affects all users.
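The role of `EvaluationPeriods: 2` is easiest to see in a small model. The Python sketch below is a simplification of alarm evaluation (real CloudWatch alarms also support "M out of N" datapoints and missing-data handling); the function name and the sample counts are assumptions for illustration.

```python
def alarm_state(datapoints: list[float], threshold: float, evaluation_periods: int) -> str:
    """Simplified model: ALARM when the most recent `evaluation_periods`
    datapoints are all greater than `threshold` (GreaterThanThreshold)."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# 5XX counts per one-minute period (made-up numbers)
single_spike = [2, 14, 3]
sustained_breach = [2, 14, 17]

print(alarm_state(single_spike, threshold=10, evaluation_periods=2))
print(alarm_state(sustained_breach, threshold=10, evaluation_periods=2))
```

Requiring two consecutive breaching periods keeps a one-minute blip from aborting a healthy deployment while still reacting within two minutes to a real regression.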

## Build Caching: Cut Build Times 50–80%

Build caching is the highest-leverage optimization for CI cost and developer experience. Dependency installation — `npm install`, `pip install`, `gradle dependencies` — typically accounts for 40–70% of total build time. With caching, dependencies are restored from a cache hit in seconds rather than downloaded fresh every run.

### Dependency Caching with `actions/cache`

```yaml
- name: Cache node modules
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

- name: Install dependencies
  run: npm ci
```

The cache key includes a hash of `package-lock.json`. When dependencies change, the lock file changes, the hash changes, and a fresh cache is created. When nothing changes, the same cache is restored. `npm ci` still runs, but it installs from the local `~/.npm` cache instead of downloading every package, which typically cuts the step to a fraction of its uncached time.
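The key behaviour can be illustrated with a short Python sketch. It mimics the shape of the key (`<os>-node-<hash>`) with a plain SHA-256 of the lock file contents; the exact algorithm behind `hashFiles` differs in detail, and the lock file contents here are made up.

```python
import hashlib


def cache_key(runner_os: str, lockfile_bytes: bytes) -> str:
    """Build a key shaped like `<os>-node-<hash of lock file>`."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{runner_os}-node-{digest}"


before = cache_key("Linux", b'{"lodash": "4.17.20"}')
unchanged = cache_key("Linux", b'{"lodash": "4.17.20"}')
after_upgrade = cache_key("Linux", b'{"lodash": "4.17.21"}')

# Same lock file, same key: exact cache hit
print(before == unchanged)
# Lock file changed, new key: no exact match, a new cache entry is saved
print(before == after_upgrade)
# The new key still shares the prefix used by restore-keys
print(after_upgrade.startswith("Linux-node-"))
```

The shared prefix is what lets `restore-keys` fall back to the most recent older cache, so a dependency bump downloads only what changed.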

**Cache strategies by ecosystem:**

```yaml
# Node.js
path: ~/.npm
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}

# Python
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}

# Java (Gradle)
path: |
  ~/.gradle/caches
  ~/.gradle/wrapper
key: ${{ runner.os }}-gradle-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}

# Java (Maven)
path: ~/.m2
key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}

# Go
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
```

### Docker Layer Caching

Docker builds are expensive when every layer is rebuilt from scratch. Cache layers using the GitHub Actions cache backend:

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ github.sha }}
    cache-from: type=gha
    cache-to: type=gha,mode=max
```

`mode=max` caches all intermediate layers, not just the final image. For Dockerfiles with many dependency installation steps (e.g., `RUN npm ci` before `COPY src/`), this can reduce Docker build time from 4 minutes to 30 seconds on cache hit.

**Structure your Dockerfile for maximum cache effectiveness:**

```dockerfile
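# These layers change rarely, so they stay cached across builds
FROM node:20-alpine AS base
WORKDIR /app
COPY package.json package-lock.json ./
# Full install: the build step below typically needs devDependencies
RUN npm ci

# These layers change on every commit, so only they are rebuilt
FROM base AS production
COPY src/ ./src/
# Build, then drop devDependencies from the final image
RUN npm run build && npm prune --omit=dev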
```

### Monorepo: Build Only What Changed

For monorepos, rebuilding every package on every commit is wasteful. Use affected-build detection:

```yaml
- name: Detect changed packages
  id: affected
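  run: |
    # Using Nx; origin/main must be fetched (actions/checkout with fetch-depth: 0)
    npx nx show projects --affected --base=origin/main > affected.txt
    # paste joins the names with commas and leaves no trailing comma
    echo "packages=$(paste -sd, affected.txt)" >> "$GITHUB_OUTPUT"

- name: Build affected packages only
  if: steps.affected.outputs.packages != ''
  run: npx nx run-many --target=build --projects=${{ steps.affected.outputs.packages }}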
```

In a 20-service monorepo, changing one service rebuilds one service — not all twenty. CI cost and time scale with the change, not with the repository size.

**Cost impact of caching:**

| Build Step                       | Without Cache | With Cache Hit | Savings |
| -------------------------------- | ------------- | -------------- | ------- |
| `npm ci` (medium project)        | ~90 seconds   | ~8 seconds     | ~91%    |
| Docker build (no source changes) | ~180 seconds  | ~15 seconds    | ~92%    |
| Full CI run                      | ~12 minutes   | ~3 minutes     | ~75%    |

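At GitHub Actions pricing ($0.008/minute on Linux runners), a team running 50 builds per day that cuts each build from 12 minutes to 3 saves 450 runner minutes, or about $3.60, per day: roughly $110/month. The larger payoff is usually developer time, since every one of those builds returns feedback 9 minutes sooner.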

## Environment Promotion Workflow

Production deployments should follow a structured promotion path: build once, promote through environments, deploy to production only after human approval.

**Why build once?** If you build separate Docker images for staging and production, you are not testing what you deploy. A build-once model ensures the exact artifact validated in staging is what runs in production.

```
Build (on push to main)
  → Push image to ECR (tagged: commit SHA)
    → Auto-deploy to staging
      → Run integration + smoke tests against staging
        → Manual approval gate (required reviewer)
          → Deploy same image to production
```

### GitHub Environments with Required Reviewers

```yaml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ github.sha }}
    steps:
      - uses: actions/checkout@v4
      - name: Build and push
        # ... build steps ...

  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    environment: staging # Maps to GitHub Environment
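    steps:
      # AWS credential setup (OIDC, as in the earlier workflow) is omitted here.
      # --task-definition takes family[:revision], not an image tag, so render
      # and register a revision that points at the image built for this commit.
      - name: Fetch current task definition
        run: |
          aws ecs describe-task-definition --task-definition my-app \
            --query taskDefinition > task-definition.json
      - name: Render task definition with the new image
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: my-app
          # vars.ECR_REGISTRY is an assumed repository variable (registry hostname)
          image: ${{ vars.ECR_REGISTRY }}/my-app:${{ needs.build.outputs.image-tag }}
      - name: Deploy to staging
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: my-app
          cluster: staging
          wait-for-service-stability: true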

  integration-tests:
    needs: deploy-staging
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run integration tests
        run: npm run test:integration
        env:
          API_URL: https://staging.myapp.com

  deploy-production:
    needs: [build, integration-tests]
    runs-on: ubuntu-latest
    environment: production # Required reviewers block here
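    steps:
      # AWS credential setup (OIDC, as in the earlier workflow) is omitted here.
      # --task-definition takes family[:revision], not an image tag, so render
      # and register a revision that points at the image validated in staging.
      - name: Fetch current task definition
        run: |
          aws ecs describe-task-definition --task-definition my-app \
            --query taskDefinition > task-definition.json
      - name: Render task definition with the same image
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: my-app
          # vars.ECR_REGISTRY is an assumed repository variable (registry hostname)
          image: ${{ vars.ECR_REGISTRY }}/my-app:${{ needs.build.outputs.image-tag }}
      - name: Deploy to production
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: my-app
          cluster: production
          wait-for-service-stability: true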
```

Configure the `production` GitHub Environment with:

- **Required reviewers** — one or two senior engineers who approve production deployments
- **Wait timer** — optional delay after approval before deployment executes
- **Deployment branch rule** — restrict production deployments to the `main` branch only

When the `deploy-production` job is reached, the workflow pauses. Approvers receive a notification, review the change (the PR linked to the commit, integration test results), and approve or reject. Only after approval does the deployment proceed — using the same image SHA that passed staging.

## Reusable Workflows

When the same build and deploy steps appear across multiple repositories, extract them into reusable workflows. DRY pipelines mean a security fix or optimization in the shared workflow propagates to all callers automatically.

**Reusable workflow (`.github/workflows/deploy-ecs.yml` in a central repo):**

```yaml
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      service-name:
        required: true
        type: string
      cluster:
        required: true
        type: string
    secrets:
      AWS_DEPLOY_ROLE_ARN:
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: us-east-1

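      - name: Deploy to ECS
        env:
          # Pass inputs through env vars so they are never interpolated into the script
          CLUSTER: ${{ inputs.cluster }}
          SERVICE: ${{ inputs.service-name }}
        run: |
          aws ecs update-service \
            --cluster "$CLUSTER" \
            --service "$SERVICE" \
            --force-new-deployment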
```

**Caller workflow (in each application repo):**

```yaml
jobs:
  deploy:
    uses: your-org/.github/.github/workflows/deploy-ecs.yml@main
    with:
      environment: production
      service-name: my-app
      cluster: production
    secrets:
      AWS_DEPLOY_ROLE_ARN: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
```

**When to extract into a reusable workflow:**

- The same build steps appear in 3 or more repositories
- Security-sensitive steps (credential setup, vulnerability scanning) should be standardized
- You want enforcement — callers cannot skip steps defined in the reusable workflow

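Reusable workflows can live in any repository; many organizations keep them in a central one such as `.github`. If that repository is private or internal, other repositories in the organization can call its workflows once access is enabled in its Actions settings.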

## Rollback Strategy

Rollback must be faster than the original deployment. If restoring from failure takes longer than the deployment itself, your rollback is a second deployment event — with all the same risks.

**Rollback methods, fastest to slowest:**

| Method                                 | Time to Execute | Best For                                          |
| -------------------------------------- | --------------- | ------------------------------------------------- |
| CodeDeploy re-shift (blue-green)       | ~30 seconds     | ECS blue-green deployments within rollback window |
| ECS task definition revision           | ~2 minutes      | Any ECS deployment                                |
| Previous ECR image tag                 | ~3 minutes      | Container deployments                             |
| CloudFormation stack rollback          | ~5 minutes      | Infrastructure changes                            |
| Full pipeline re-run with prior commit | ~10 minutes     | Last resort                                       |

**Blue-green instant rollback** (within the 1-hour window after deployment):

```bash
aws deploy stop-deployment \
  --deployment-id d-ABC123 \
  --auto-rollback-enabled
```

CodeDeploy shifts traffic back to the blue task set. Running containers are not terminated — traffic routing simply returns to the previous version. Users experience no downtime.

**ECS rollback to previous task definition:**

```bash
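# Find the task definition the service is currently running
CURRENT_ARN=$(aws ecs describe-services \
  --cluster production \
  --services my-app \
  --query 'services[0].taskDefinition' \
  --output text)

# The ARN ends in family:revision; step back one revision.
# Assumes the prior release used the immediately preceding revision.
CURRENT_REVISION=${CURRENT_ARN##*:}
ROLLBACK=$((CURRENT_REVISION - 1))

# Update service to the previous revision (it must still be ACTIVE)
aws ecs update-service \
  --cluster production \
  --service my-app \
  --task-definition "my-app:$ROLLBACK"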
```

**Database migrations and rollback:** The most common reason rollbacks fail is a database migration that is not backward-compatible. Always write migrations that run in two phases:

1. **Phase 1** (deploy with new code): Add the new column as nullable. Both old and new code work.
2. **Phase 2** (after rollback window closes): Make the column required, drop the old column.

Never drop a column in the same deployment that removes the code that reads it. If you roll back the code, the column is gone and the old code crashes.
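The two-phase idea can be demonstrated with an in-memory SQLite database. This Python sketch covers Phase 1 only and shows that both the old and the new release can write to the same schema, which is what makes a code rollback safe. The table and column names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")

# Phase 1: additive, nullable column. Nothing the old code relies on is removed.
db.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def old_code_insert(name: str) -> None:
    # Old release: unaware of display_name
    db.execute("INSERT INTO users (full_name) VALUES (?)", (name,))


def new_code_insert(name: str) -> None:
    # New release: writes both columns during the transition
    db.execute(
        "INSERT INTO users (full_name, display_name) VALUES (?, ?)", (name, name)
    )


new_code_insert("Ada")    # new version is live
old_code_insert("Grace")  # after a rollback, the old version still works

rows = db.execute(
    "SELECT full_name, display_name FROM users ORDER BY id"
).fetchall()
print(rows)
```

Rows written by the old release simply carry a NULL in the new column, which Phase 2 would backfill before making the column required.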

**Every deployment PR should include a rollback runbook:**

```markdown
## Rollback Plan

If this deployment causes issues:

1. Immediate (< 1 hour post-deploy):
   `aws deploy stop-deployment --deployment-id $DEPLOYMENT_ID --auto-rollback-enabled`

2. After rollback window:
   `aws ecs update-service --cluster production --service my-app --task-definition my-app:$PREVIOUS_REVISION`

3. Database: No schema changes in this deployment. Rollback is safe.

Previous task definition: my-app:42
Previous image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:abc123def
```

## Monitoring Pipeline Health

A deployment is not complete when the pipeline finishes. It is complete when production metrics confirm the new version is performing correctly.

**Tag every deployment with traceable metadata:**

```yaml
- name: Tag deployment
  run: |
    echo "Deployment metadata:"
    echo "  Commit: ${{ github.sha }}"
    echo "  Author: ${{ github.actor }}"
    echo "  Workflow: ${{ github.workflow }}"
    echo "  Run: ${{ github.run_id }}"
    echo "  Timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
```

This output is preserved in the GitHub Actions run log. When a production incident occurs, you can identify the exact deployment run, the commit, and the author in seconds.

**[CloudWatch alarms](/blog/aws-cloudwatch-observability-metrics-logs-alarms-best-practices/) to monitor after deployment:**

| Alarm                    | Threshold            | Action                            |
| ------------------------ | -------------------- | --------------------------------- |
| 5xx error rate           | > 0.5% for 2 minutes | Alert on-call                     |
| P99 response time        | > 2s for 2 minutes   | Alert on-call                     |
| ECS task restarts        | > 3 in 5 minutes     | Alert on-call + consider rollback |
| ALB unhealthy host count | > 0                  | Immediate alert                   |

**SNS notifications for pipeline events:**

```yaml
- name: Notify on failure
  if: failure()
  run: |
    aws sns publish \
      --topic-arn "${{ secrets.ALERTS_SNS_TOPIC }}" \
      --subject "Deployment Failure" \
      --message "DEPLOYMENT FAILED: ${{ github.repository }} commit ${{ github.sha }} by ${{ github.actor }}"
```

Add status badges to your README:

```markdown
![Deploy](https://github.com/your-org/your-repo/actions/workflows/deploy.yml/badge.svg)
```

A red badge is immediately visible to every engineer who opens the repository. It creates social pressure to fix broken builds quickly.

## Common Anti-Patterns

| Anti-Pattern                         | What Goes Wrong                                                                                                      | Fix                                                                                           |
| ------------------------------------ | -------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| `uses: action@main`                  | Supply chain attack: attacker pushes malicious code to `main`, your workflow executes it on next run                 | Pin to SHA digest for third-party actions; `@vN` for official actions                         |
| `AWS_ACCESS_KEY_ID` stored as secret | Long-lived credentials exposed if workflow is compromised; rotation requires updating every secret                   | Replace with OIDC federation — no stored credentials                                          |
| `permissions: write-all`             | Compromised workflow has broad repository write access: push commits, alter releases, publish packages, edit PRs    | Explicit `permissions:` block at job level; add only what's needed                            |
| No rollback plan                     | Incident response requires a full re-deployment; recovery takes longer than the original deployment                  | Blue-green with CodeDeploy; always include rollback runbook in PR                             |
| Local environment ≠ CI               | Tests pass locally, fail in CI due to OS, tool version, or dependency differences; debugging is slow and frustrating | Use identical Docker images in local development and CI                                       |
| Emergency patches bypassing pipeline | Changes go directly to production via `kubectl` or console; no audit trail, no tests, no review                      | Build an expedited pipeline track (no staging wait, but same security checks) for emergencies |
| Secrets echoed in debug output       | Credentials printed to workflow logs; accessible to anyone with repository read access                               | Never `echo` secrets; use `::add-mask::` for dynamic values                                   |
| Unpinned Docker base images          | `FROM node:latest` pulls different image on each build; non-deterministic behavior                                   | Pin to specific digest: `FROM node:20.11.0-alpine3.19@sha256:...`                             |
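
As a rough illustration of the first and third fixes in the table, the sketch below denies all token scopes at the workflow level, grants only `contents: read` to the job, and pins the checkout action to the commit SHA quoted in this guide's FAQ. The workflow name, trigger, and `make test` command are placeholders, not part of any real repository.

```yaml
# Hypothetical CI workflow: least-privilege token plus a SHA-pinned action.
name: CI
on:
  pull_request:

# Start from zero permissions; each job opts in to what it needs.
permissions: {}

jobs:
  test:
    runs-on: ubuntu-latest
    permissions:
      contents: read # enough to check out the code, nothing more
    steps:
      # Pinned to a full commit SHA; the trailing comment records the tag it came from.
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      # Placeholder test command; substitute your own.
      - run: make test
```

Starting from `permissions: {}` means a job added later gets no access by default, so forgetting a `permissions:` block fails closed rather than open.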

## Building Pipelines That Last

A well-designed GitHub Actions pipeline is not just a deployment mechanism — it is a safety system. It enforces code review through required checks. It validates every change through automated tests. It controls access to production through environment protection rules. It creates an immutable audit trail of every deployment.

The patterns in this guide — OIDC federation, SHA-pinned actions, least-privilege permissions, blue-green deployments, build caching, and structured environment promotion — are the difference between a pipeline that ships code and a pipeline that ships code _safely_.

If you're building this on AWS, the natural complements to GitHub Actions are [AWS CodeDeploy](/blog/aws-codepipeline-cicd-pipeline-patterns-for-production/) for deployment orchestration and [IAM with least-privilege access](/blog/aws-iam-best-practices-least-privilege-access-control/) for every pipeline role. [Secrets Manager and Parameter Store](/blog/aws-secrets-manager-vs-parameter-store-when-to-use-which/) handle runtime secrets. [CloudWatch](/blog/aws-cloudwatch-observability-metrics-logs-alarms-best-practices/) monitors deployment health. These services fit together into a deployment platform that is auditable, recoverable, and resilient by design.

For hands-on help designing and implementing secure CI/CD pipelines on AWS — including GitHub Actions workflows, CodeDeploy blue-green configurations, and cross-account pipeline architecture — see our [DevOps Pipeline Setup services](/services/devops-pipeline-setup/).

For a shorter **threat-model spine** (pipeline gates plus OWASP basics mapped to AWS controls) that complements this guide, see [CI/CD Threat Models and Web App Security on AWS](/blog/aws-cicd-appsec-pipeline-threat-model/).

**May 2026 — full-repo pass vs PR-time gates:** Pipeline SAST and dependency scans catch changed files on every merge. **[AWS Security Agent full-repository code review](/blog/aws-security-agent-full-repository-code-review/)** adds periodic whole-tree analysis for trust-boundary and data-flow defects CVE databases miss. Run both: fast per-commit gates plus quarterly or release-branch full-repo passes — not one instead of the other.

[Contact us to secure your deployment pipeline →](/contact-us/)

## Related reading

- [The AWS CLI Bug That Broke /dev/null Across Your Entire System](/blog/aws-cli-chmod-dev-null-streaming-bug-2026/)
- [AWS Environment Parity: Why Dev/Staging/Prod Drift Costs More Than It Saves](/blog/aws-environment-parity-dev-staging-production/)
- [What DevOps Guides Don](/blog/devops-exercises-aws-production-reality/)
- [DevOps on AWS: CodePipeline vs GitHub Actions vs Jenkins](/blog/devops-on-aws-codepipeline-vs-github-actions-vs-jenkins/)
- [Two Free LocalStack Alternatives in 2026: MiniStack vs floci](/blog/ministack-free-localstack-alternative-aws-emulator/)
- [The Terraform Command Cheat Sheet for AWS Engineers (2026 Edition)](/blog/terraform-commands-cheat-sheet-aws-2026/)
- [How to Build Ultra-Fast Asset Pipelines with Bun, Vite, and Rust-Based Tooling (2026)](/blog/ultra-fast-asset-pipelines-bun-vite-rust/)

## FAQ

### How do I authenticate GitHub Actions to AWS without storing access keys?
Use OIDC federation. A job with the `id-token: write` permission requests a short-lived JWT from `token.actions.githubusercontent.com`, AWS STS validates it and exchanges it for temporary credentials for a specific IAM role, and the role's trust policy locks the `sub` claim to your repo (and ideally a specific branch or environment). No long-lived `AWS_ACCESS_KEY_ID` ever sits in GitHub secrets, and the temporary credentials expire on their own when the role session ends, typically after an hour unless you configure a different duration.
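
A minimal sketch of the trust policy side, written as a CloudFormation fragment. It assumes the GitHub OIDC provider is already registered in the account; the resource name `GitHubDeployRole`, the `your-org/your-repo` repository, and the `main` branch restriction are illustrative placeholders.

```yaml
# Hypothetical CloudFormation fragment: an IAM role that only GitHub Actions
# runs from one repo and one branch can assume.
Resources:
  GitHubDeployRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              # Assumes the OIDC provider already exists in this account.
              Federated: !Sub "arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com"
            Action: "sts:AssumeRoleWithWebIdentity"
            Condition:
              StringEquals:
                # The audience must be STS, and the subject must match repo + branch exactly.
                "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                "token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:ref:refs/heads/main"
```

Using `StringEquals` on `sub` rather than a wildcard match is the stricter choice: a fork, another branch, or a pull request run presents a different `sub` value and is rejected.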

### Why pin GitHub Actions to a commit SHA instead of a tag?
Tags are mutable — an attacker who compromises an action's repo can re-point a tag at malicious code, as happened in the March 2025 tj-actions/changed-files incident. Commit SHAs are immutable: `uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11` will always run that exact commit. Use Dependabot or Renovate to bump the SHA on a controlled cadence.
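
One way to keep pinned SHAs current is a Dependabot configuration along these lines. This is a sketch; the weekly interval is an arbitrary choice, not a recommendation from the guide.

```yaml
# .github/dependabot.yml
# Opens pull requests that bump the actions referenced in your workflows.
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/" # Dependabot looks in .github/workflows from the repo root
    schedule:
      interval: "weekly"
```

Each bump arrives as a pull request, so the new SHA passes through the same review and required checks as any other change.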

### How do I prevent secret leakage in GitHub Actions logs?
Never `echo` or `cat` secret values. Use `::add-mask::` for any computed value derived from a secret, because GitHub can only mask values it knows about. Leave debug logging (`ACTIONS_STEP_DEBUG`) off in production workflows: it makes logs far more verbose, and transformed or derived secret values that were never masked can end up in them. Set `permissions: read-all` (or tighter) at the workflow level and elevate per job, so that a leaked `GITHUB_TOKEN` does limited damage. Audit log output regularly and enable GitHub's secret scanning push protection on the repo.
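
The masking step might look like the sketch below. The job name, the `./scripts/get-token.sh` helper, and the `API_TOKEN` variable are hypothetical; the point is the order of operations, with the mask registered before the value is used anywhere else.

```yaml
# Hypothetical job: obtain a token at runtime and keep it out of the logs.
jobs:
  integration-test:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Fetch and mask a runtime token
        run: |
          # get-token.sh stands in for whatever produces the value.
          TOKEN="$(./scripts/get-token.sh)"
          # Register the value with the log masker before anything can print it.
          echo "::add-mask::$TOKEN"
          # Hand it to later steps through the environment file, not through stdout.
          echo "API_TOKEN=$TOKEN" >> "$GITHUB_ENV"
```

This sketch assumes a single-line value; multi-line values need each line masked separately.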

### How do I implement blue-green deployments to AWS from GitHub Actions?
For ECS: use AWS CodeDeploy ECS blue/green deployment groups — your workflow registers the new task definition, calls `aws deploy create-deployment`, and CodeDeploy handles traffic shifting and automatic rollback on alarm. For Lambda: weighted aliases with CodeDeploy linear or canary deployment configurations. For ALB-fronted EC2: deploy to the green target group, run health checks, then shift listener weight gradually.
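
For the ECS case, the deployment that the workflow creates points CodeDeploy at an AppSpec file. A minimal sketch follows; the task definition ARN, account ID, container name, and port are placeholders.

```yaml
# appspec.yaml: sketch of an AppSpec for a CodeDeploy ECS blue/green deployment.
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        # Placeholder ARN; use the task definition revision your workflow just registered.
        TaskDefinition: "arn:aws:ecs:us-east-1:111122223333:task-definition/web:42"
        LoadBalancerInfo:
          # Container and port that the green target group should route to.
          ContainerName: "web"
          ContainerPort: 8080
```

The AppSpec only says what to deploy. How traffic shifts and which alarms trigger a rollback are settings on the CodeDeploy deployment group, so they are configured once rather than on every run.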

### What is the safest GitHub Actions pattern for deploying to multiple AWS environments?
Use GitHub Environments with required reviewers and environment-scoped secrets. Each environment (dev, staging, prod) maps to a separate IAM role via OIDC trust policy, so a workflow running for `production` can only assume the production role — no cross-contamination. Branch protection rules plus environment approvals create the audit trail compliance frameworks expect.
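
A sketch of how that promotion could be wired up. It assumes each GitHub Environment defines its own `AWS_ROLE_ARN` variable (a hypothetical name), that each role's trust policy matches a `sub` claim of the form `repo:your-org/your-repo:environment:<name>`, and that `./scripts/deploy.sh` is a placeholder for your deploy step.

```yaml
# Hypothetical promotion workflow: staging first, then production behind approval.
name: Deploy
on:
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging # resolves vars and secrets scoped to "staging"
    permissions:
      id-token: write # needed to request the OIDC token
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - run: ./scripts/deploy.sh staging

  deploy-production:
    needs: deploy-staging # production only runs after staging succeeds
    runs-on: ubuntu-latest
    environment: production # required reviewers gate this job before it starts
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - run: ./scripts/deploy.sh production
```

Because the `environment` name is part of the OIDC `sub` claim, the staging job cannot assume the production role even if someone edits the workflow to point at its ARN.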

---

*Source: https://www.factualminds.com/blog/github-actions-aws-cicd-security-best-practices/*
