The AWS CLI Bug That Broke /dev/null Across Your Entire System

DevOps & CI/CD Palaniappan P 9 min read

Quick summary: A security hardening PR in the AWS CLI applied chmod 0600 to any output path — including /dev/null — silently breaking Lambda invocations, S3 streaming commands, and every other process on affected hosts overnight.


Engineers on r/aws woke up on April 10 to discover their Lambda invocations were silently failing, their S3 streaming commands hanging, and their shell scripts suddenly unable to write to /dev/null. The culprit: a single AWS CLI pull request merged 24 hours earlier that applied restrictive file permissions to every output path the CLI touched — including system device files that should never have been modified. This post breaks down what happened, how to check if you were affected, and the defensive patterns that should prevent this class of incident from happening to you.

What Happened: A Security Hardening PR With Unintended Blast Radius

The Intent Behind PR #10197

On April 9, 2026, the AWS CLI team merged PR #10197 to tighten output file security. The goal was sound: whenever commands like aws s3 cp, aws s3 select, or aws lambda invoke write output to a file path, restrict that file to owner-only read/write permissions (0600) via os.chmod(). This is a legitimate security improvement — output files containing presigned URLs, Lambda response payloads, or S3 object metadata should not be world-readable on a shared system.

The implementation was straightforward: after writing output, call os.chmod(output_path, 0o600). Simple, direct, and — as it turned out — incomplete.

The Bug: chmod Applied to Every Output Path

The os.chmod() call was applied unconditionally to whatever path string the caller provided. No type checking. No validation. If you passed /dev/null, the CLI would dutifully call os.chmod("/dev/null", 0o600).

And that’s where the explosion happened.

/dev/null is a character device, not a regular file. But os.chmod() works perfectly fine on device nodes — it just changes the permission bits on the device’s filesystem entry. Before the bug, /dev/null had permissions crw-rw-rw- (readable and writable by everyone). After the bug, it became crw------- (readable and writable by root only).

Every non-root process on the system that tried to open /dev/null after that point received EACCES (Permission denied). For a device that literally exists to discard data, this was catastrophic.
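To see that chmod really does not care what kind of filesystem object it is handed, here is a minimal sketch that uses a named pipe in a temporary directory as a safe stand-in for a device node. The helper name chmod_and_report is made up for this illustration and is not taken from the AWS CLI source.

```python
import os
import stat
import tempfile


def chmod_and_report(path, mode=0o600):
    """Apply chmod unconditionally, then report the file type and permission bits."""
    os.chmod(path, mode)
    st = os.lstat(path)
    if stat.S_ISREG(st.st_mode):
        kind = "regular file"
    elif stat.S_ISFIFO(st.st_mode):
        kind = "named pipe"
    elif stat.S_ISCHR(st.st_mode):
        kind = "character device"
    else:
        kind = "other"
    return kind, stat.S_IMODE(st.st_mode)


# A FIFO is a non-regular file we can create without root and throw away afterwards.
with tempfile.TemporaryDirectory() as tmp:
    fifo = os.path.join(tmp, "demo.fifo")
    os.mkfifo(fifo, 0o666)
    kind, perms = chmod_and_report(fifo)
    print(kind, oct(perms))  # named pipe 0o600
```

The call succeeds without complaint on the pipe, which is the same behavior that let the permission change land on /dev/null.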


Which Commands Were Affected

aws lambda invoke

This is the most common pattern hitting the bug: discarding Lambda function output in scripts or CI/CD pipelines.

aws lambda invoke \
  --function-name my-function \
  --payload '{"key": "value"}' \
  /dev/null

This command invokes a Lambda function and writes the response JSON to /dev/null to suppress output. Before the bug, this worked fine. After PR #10197 merged, the CLI would write the response and then call os.chmod("/dev/null", 0o600), breaking the device node for every subsequent process on the host.

aws s3 cp and aws s3 select

Less common, but following the same code path:

# Test if an S3 object exists without saving it locally
aws s3 cp s3://bucket/key /dev/null

# Stream query results without persisting to disk
aws s3 select --expression "SELECT * FROM S3Object" /dev/null

Both hit the same bug.

What Was Never Affected: Shell-Level Redirection

Here’s the critical distinction: shell redirection was never vulnerable.

# These patterns are immune — shell controls the file descriptor
aws lambda invoke --function-name fn /dev/stdout > /dev/null
aws s3 cp s3://bucket/key - > /dev/null
aws lambda invoke --function-name fn /dev/stdout 2>&1 | cat > /dev/null

When you use shell redirection, the shell opens the file descriptor and hands it to the CLI process. The CLI never sees /dev/null as a string path — it only writes to the file descriptor. The chmod() call cannot reach it.
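The same distinction can be sketched in Python with subprocess. This is an illustration only: the child process is a hypothetical stand-in for a CLI, and it simply reports which arguments it received.

```python
import subprocess
import sys

# Hypothetical stand-in for a CLI: writes a payload to stdout and
# reports the arguments it was given on stderr.
CHILD = "import sys; print('payload'); print(sys.argv[1:], file=sys.stderr)"


def run_child(extra_args):
    """Run the child with stdout redirected to the null device by the parent."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *extra_args],
        stdout=subprocess.DEVNULL,  # parent opens the device, child inherits the descriptor
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stderr.strip()


# Redirection only: the child never learns the path of the null device.
seen_with_redirect = run_child([])
# Path argument: the string itself reaches the child, which could chmod it.
seen_with_path = run_child(["/dev/null"])

print(seen_with_redirect)  # []
print(seen_with_path)      # ['/dev/null']
```

In the first call there is no path for a chmod to act on; in the second, the path is right there in the argument list.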

This is why the bug went unnoticed in codebases where engineers followed shell best practices. Only codebases passing /dev/null as a direct argument to the CLI were affected.


How to Know If Your System Was Hit

Symptom Pattern

If your infrastructure was running an affected AWS CLI version during the window of April 9–10, 2026, watch for these symptoms:

  • Lambda invocations in CI/CD suddenly fail with permission errors or hang indefinitely
  • Shell scripts that write to /dev/null start returning permission denied errors, even for unrelated commands like curl, git, or systemctl
  • Any process on the host attempting to redirect output to /dev/null fails
  • The failure appears within minutes of an AWS CLI auto-update, making it disorienting to diagnose

The common thread: all these failures happen after the first AWS CLI command that targeted /dev/null as an output path. It cascades outward from that initial breakage.

Check /dev/null Permissions Now

# Check current permissions
$ ls -la /dev/null
crw-rw-rw- 1 root root 1, 3 Apr  9 12:00 /dev/null

# If you see crw------- or any restricted form, it was affected:
$ ls -la /dev/null
crw------- 1 root root 1, 3 Apr  9 14:32 /dev/null

# Restore immediately if broken
$ sudo chmod 0666 /dev/null

# Verify restoration
$ ls -la /dev/null
crw-rw-rw- 1 root root 1, 3 Apr  9 15:00 /dev/null

On Amazon EC2 instances, you can also reboot the instance: on typical Linux distributions /dev is a devtmpfs that the kernel repopulates at boot, so /dev/null comes back with its default permissions. For containerized workloads, restarting the container typically fixes the problem, because container runtimes usually set up a fresh /dev for each container rather than reusing the host's device nodes.
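If you want to run this check across a fleet rather than by eye, a small script can do the same inspection. The following is a minimal sketch; the function name null_device_problems is invented here, and it assumes the expected state is a character device with mode 0666.

```python
import os
import stat


def null_device_problems(path="/dev/null"):
    """Return a list of human-readable problems; an empty list means healthy."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return [f"cannot stat {path}: {exc}"]

    problems = []
    if not stat.S_ISCHR(st.st_mode):
        problems.append(f"{path} is not a character device")
    perms = stat.S_IMODE(st.st_mode)
    if perms != 0o666:
        problems.append(f"{path} has mode {perms:04o}, expected 0666")
    return problems


for line in null_device_problems() or ["/dev/null looks healthy"]:
    print(line)
```

The script only reports; restoring permissions still requires the sudo chmod shown above.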

Trace It to an AWS CLI Version

# Check your current version
$ aws --version
aws-cli/2.x.x Python/3.x.x Linux/5.x.x ...

Compare the version number against the official AWS CLI GitHub releases at github.com/aws/aws-cli/releases. Any v2 build from April 9, 2026 forward (until the fix on April 10) is in the danger zone. AWS CLI v1 was never affected.


The Fix: Before and After

The Broken Code Path (PR #10197)

# PR #10197 (BROKEN): chmod applied to any path without type checking
import os

def _set_output_file_permissions(output_path):
    # No stat check: chmod succeeds on device files, sockets, named pipes, etc.
    os.chmod(output_path, 0o600)

This is the problem in its entirety. No defensive coding. No awareness of what filesystem object the caller actually provided.

The Corrected Code Path (PR #10210)

# PR #10210 (FIXED): stat check guards against non-regular files
import os
import stat

def _set_output_file_permissions(output_path):
    # os.stat() follows symlinks, so this inspects the link's target
    file_stat = os.stat(output_path)

    # Skip anything that is not a regular file
    if not stat.S_ISREG(file_stat.st_mode):
        return  # device file, socket, named pipe, directory, etc.

    # Only chmod regular files
    os.chmod(output_path, 0o600)

The fix uses os.stat() to inspect the file's type before calling os.chmod(). The stat.S_ISREG() check returns True only for regular files, not for device nodes, named pipes, or sockets. Because os.stat() follows symbolic links, a symlink is judged by its target: a link to /dev/null is skipped, while a link to a regular file is still chmod-ed. This is the defensive pattern that should be standard for any tool that applies filesystem permissions to caller-provided paths.
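If you want a stricter guard in your own tooling, one option is to use os.lstat() so that symlinks are skipped as well. This is a variant written for illustration, not the code from PR #10210, and the function name restrict_if_regular_file is made up.

```python
import os
import stat


def restrict_if_regular_file(output_path):
    """chmod to 0600 only when the path itself is a regular file.

    Returns True if permissions were changed, False if the path was skipped.
    """
    try:
        st = os.lstat(output_path)  # lstat inspects the link itself, not its target
    except FileNotFoundError:
        return False

    if not stat.S_ISREG(st.st_mode):
        return False  # symlink, device node, named pipe, socket, directory

    os.chmod(output_path, 0o600)
    return True
```

Note that any check-then-chmod on a path leaves a small window in which the path could be swapped out. If that matters in your environment, operating on an already-open file descriptor with os.fstat() and os.fchmod() closes the gap.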

Upgrade and Remediation Steps

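# Step 1: Check your AWS CLI version
$ aws --version

# Step 2: Upgrade v2 with the official installer (Linux x86_64 shown).
# Note: pip install awscli only provides AWS CLI v1, which was never affected.
$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
$ unzip awscliv2.zip
$ sudo ./aws/install --update

# Step 3: Verify the upgrade (the release must include the PR #10210 fix)
$ aws --version

# Step 4: If /dev/null was already affected, restore permissions
$ sudo chmod 0666 /dev/null

# Step 5: Verify restoration
$ ls -la /dev/null
crw-rw-rw- 1 root root 1, 3 ...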

Lessons for DevOps and CI/CD Teams

Never Pass Special Files as CLI Output Arguments

The fundamental issue: scripts sometimes use /dev/null as a convenient “discard this output” pattern. But when any tool — not just the AWS CLI — calls chmod() or chown() on user-supplied paths without type-checking, you create this vulnerability.

The correct approach is to let the shell handle the redirection:

# AFFECTED PATTERN — /dev/null passed as file argument
aws lambda invoke --function-name fn /dev/null

# SAFE PATTERN — shell redirection at the terminal level
aws lambda invoke --function-name fn /dev/stdout > /dev/null

# ALSO SAFE — temp file with cleanup
TMPFILE=$(mktemp)
trap 'rm -f "$TMPFILE"' EXIT
aws lambda invoke --function-name fn "$TMPFILE"

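Both safe patterns sidestep this class of chmod() bug: with redirection the CLI never sees /dev/null as a string path, and with a temp file the only thing it can chmod is a regular file you own.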
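For pipelines driven from Python rather than shell, the temp-file pattern might look like the sketch below. The helper name run_discarding_outfile is hypothetical; it assumes the wrapped command takes its output file as the final positional argument, as aws lambda invoke does.

```python
import contextlib
import os
import subprocess
import tempfile


def run_discarding_outfile(argv_prefix):
    """Run a command that expects a trailing output-file argument, then delete that file.

    Returns the command's exit code.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".out")
    os.close(fd)  # the command reopens the file by path
    try:
        completed = subprocess.run([*argv_prefix, tmp_path], check=False)
        return completed.returncode
    finally:
        # The command may have removed the file itself; ignore that case.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)


# Example call (not executed here):
# run_discarding_outfile(["aws", "lambda", "invoke", "--function-name", "fn"])
```

Whatever the tool does to the output path, the damage is confined to a throwaway regular file.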

Pin Your CLI Version in Production Pipelines

This is the same principle as pinning GitHub Actions versions or Docker image tags: never install “latest” of a critical tool in production automation.

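# Linux installer: pin to an exact v2 version (pip only ships AWS CLI v1)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-2.15.30.zip" -o awscliv2.zip
unzip -q awscliv2.zip
sudo ./aws/install

# Docker: specify an exact tag of the official image
FROM amazon/aws-cli:2.15.30

# GitHub Actions: pin the CLI in setup steps
- name: Install AWS CLI (pinned)
  run: |
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-2.15.30.zip" -o awscliv2.zip
    unzip -q awscliv2.zip
    sudo ./aws/install --update

# AWS CodeBuild: override the managed image's CLI version in buildspec.yml
phases:
  install:
    commands:
      - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-2.15.30.zip" -o awscliv2.zip
      - unzip -q awscliv2.zip
      - ./aws/install --update
      - aws --version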

Pinning lets you decide when to update, instead of absorbing an auto-upgrade in the middle of a production run.
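A pin is only useful if something verifies it. As a rough sketch, a pipeline step could parse the output of aws --version and fail fast on a mismatch. The function names and the sample version string below are illustrative assumptions; 2.15.30 is just the example pin used in this post.

```python
import re

PINNED_VERSION = (2, 15, 30)  # example pin from this article, not a recommendation


def parse_cli_version(version_output):
    """Extract (major, minor, patch) from `aws --version` output."""
    match = re.match(r"aws-cli/(\d+)\.(\d+)\.(\d+)", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return tuple(int(part) for part in match.groups())


def matches_pin(version_output, pinned=PINNED_VERSION):
    """True when the installed CLI is exactly the pinned version."""
    return parse_cli_version(version_output) == pinned


# Made-up sample in the format that `aws --version` prints.
sample = "aws-cli/2.15.30 Python/3.11.8 Linux/5.15.0 exe/x86_64"
print(matches_pin(sample))  # True
```

In a real pipeline you would feed it the captured output of aws --version and exit non-zero when matches_pin returns False.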

Validate Toolchain Updates Before Rolling Out

Treat AWS CLI updates like dependency upgrades. When a new CLI version is released:

  1. Test in staging first. Run your Lambda invocations, S3 operations, and other CLI commands in a staging CI/CD pipeline for 24 hours.
  2. Use a canary runner. Update the CLI on one CI runner, monitor it for failures, then roll out to the full fleet if it is stable.
  3. Check the changelog. AWS publishes release notes on the GitHub releases page. Permission changes, behavior changes, and deprecations are flagged there.
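On a canary runner, one cheap signal is whether a toolchain run changed the permissions of anything it had no business touching. The sketch below snapshots permission bits before and after a run. The watched paths and function names are illustrative choices, not an established tool.

```python
import os
import stat

WATCHED_PATHS = ["/dev/null", "/dev/zero", "/dev/tty"]  # illustrative selection


def snapshot_modes(paths):
    """Map each path to its permission bits, or None if it cannot be stat-ed."""
    modes = {}
    for path in paths:
        try:
            modes[path] = stat.S_IMODE(os.stat(path).st_mode)
        except OSError:
            modes[path] = None
    return modes


def changed_paths(before, after):
    """Return the sorted list of paths whose permission bits differ between snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))


before = snapshot_modes(WATCHED_PATHS)
# ... run the candidate CLI version's smoke tests here ...
after = snapshot_modes(WATCHED_PATHS)
print(changed_paths(before, after))
```

An empty list means nothing in the watch list changed. Any entry is a reason to hold the rollout and investigate.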

This is especially important for tools that interact with production infrastructure. The 24-hour window between PR #10197 and PR #10210 is short, but it was long enough to break hundreds of pipelines.


Broader Pattern: Hardening PRs That Break Infrastructure

Security improvements are necessary. But applying permissions to user-supplied paths without type-checking is a textbook example of a hardening PR that breaks more than it fixes.

The AWS CLI team responded quickly — the fix was merged less than 24 hours after the bug was reported — and the response was targeted: check the file type before chmod-ing it. One function call, three lines of code.

This is the defensive pattern that should be standard in any tool that:

  • Calls os.chmod() on paths supplied by the user
  • Calls os.chown() on paths supplied by the user
  • Applies any filesystem-level security controls to caller-provided paths

The cost of the defensive check is negligible. The cost of not having it is system-level breakage.

For your own scripts, Lambda functions, and CDK constructs: if you are writing permissions to caller-provided paths, add the os.stat() check first. It is the difference between a security improvement and a gotcha that silently breaks production at 2 AM.




Get Help Auditing Your AWS Toolchain

If this incident surfaced questions about your CI/CD pipeline’s resilience to upstream tool changes, this is exactly the pattern we audit and harden for engineering teams.

FactualMinds works with startups and enterprise DevOps teams to review pipeline security, AWS DevOps dependency management, CLI version pinning, and blast radius from toolchain failures. A one-hour pipeline audit can catch this class of issue before your next 2 AM incident.

Contact us to audit your AWS toolchain →
