Kubernetes on AWS (EKS)

Managed Kubernetes on AWS with Auto Mode, Hybrid Nodes, Karpenter 1.0, and Graviton-first node pools.

Last updated: April 29, 2026
Author: FactualMinds Cloud Integration Team
Reviewed by: FactualMinds AWS-certified architects (Solutions Architect – Professional)

Amazon EKS overview

Amazon EKS is AWS-managed Kubernetes. The control plane (API server, scheduler, etcd) is operated by AWS, patched automatically, and deployed across at least three availability zones. You own the data plane unless you run EKS Auto Mode (GA December 2024), in which case you delegate the data plane to AWS as well and consume Kubernetes as an almost-serverless service.

FactualMinds deploys EKS for teams that need Kubernetes portability (multi-cloud, on-prem via EKS Hybrid Nodes, or open-source ecosystem alignment) and for mid-market AWS-only teams that have outgrown ECS or plain Fargate. We default new 2026 clusters to Auto Mode on Kubernetes 1.32 with Graviton-first node pools unless a specific workload says otherwise.

What’s new on EKS in 2026

Why EKS

Kubernetes standard

AWS integration

Managed control plane

EKS Architecture

Control plane (AWS managed)

Data plane (your choice)

Networking

EKS Auto Mode in practice

Use Auto Mode when

Prefer managed node groups when

EKS Hybrid Nodes

Pod Identity vs IRSA

Pod Identity (2026 default)

aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace production \
  --service-account my-app \
  --role-arn arn:aws:iam::123456789012:role/my-app-role

Pods using the my-app ServiceAccount in the production namespace automatically receive temporary credentials via the Pod Identity Agent. No annotation, no OIDC provider, no trust-policy StringEquals dance.
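
On the Kubernetes side, the workload only has to run under the named ServiceAccount. The following is a minimal sketch, assuming the association for my-app in production already exists; the Deployment name, labels, and image URI are hypothetical placeholders.

```yaml
# ServiceAccount referenced by the Pod Identity association.
# Note the absence of an eks.amazonaws.com/role-arn annotation:
# that annotation belongs to IRSA, not Pod Identity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                   # hypothetical
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: my-app # ties these pods to the association
      containers:
        - name: my-app
          # hypothetical image URI
          image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-app:1.0.0
```

Pods that were started before the association was created will most likely need a restart before they receive credentials, since the credential settings are injected when the pod is created.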

IRSA (legacy / niche)

Karpenter 1.0 patterns we deploy

Observability stack

Graviton cost savings

Reference architecture (2026 default)

                    ┌──────────────────────────────────────────────┐
                    │  AWS-managed control plane (multi-AZ)        │
                    │  api / scheduler / controller-mgr / etcd     │
                    │  audit + authenticator + scheduler logs      │
                    └─────────────────┬────────────────────────────┘
                                      │ (private endpoint via PrivateLink)

   ┌──────────────────────────────────┼──────────────────────────────────┐
   │ Data plane (Auto Mode)           │                                  │
   │  ├── managed Karpenter NodePool  │ ── Pod Identity Agent (per node) │
   │  ├── Graviton-first c8g/m8g/r8g  │ ── VPC CNI (prefix delegation)   │
   │  ├── consolidation policy        │ ── EBS CSI (gp3 default)         │
   │  └── disruption budgets          │ ── AWS LB Controller (ALB+NLB)   │
   └──────────────────────────────────┴──────────────────────────────────┘

   Workloads ── ServiceAccount → PodIdentityAssociation → IAM Role
   Ingress  ── ALB (alb.ingress.kubernetes.io/scheme: internet-facing)
   Storage  ── EBS gp3 PVCs / EFS for shared / S3 for objects
   Secrets  ── Secrets Store CSI / HashiCorp VSO → Vault / Secrets Manager
   Images   ── ECR (enhanced scanning, image signing) ← CI attestation
   Telemetry ─ CloudWatch Container Insights + ADOT → Datadog / AMP+AMG
   Audit    ── CloudWatch Logs (90d) + S3 Object Lock (compliance archive)

Failure modes & resilience

1. Karpenter consolidation evicting under-budgeted pods. The default consolidationPolicy (WhenEmptyOrUnderutilized in the Karpenter v1 API, WhenUnderutilized in the older v1beta1 API) moves pods aggressively. For long-running stateful workloads, set WhenEmpty on the NodePool and define a PodDisruptionBudget (minAvailable) so consolidation cannot violate availability. Disruption budgets at the NodePool level cap how many nodes can be voluntarily disrupted at the same time.
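
As a sketch of the settings described in item 1, the manifests below pair a conservative NodePool with a PodDisruptionBudget. The resource names, labels, and numeric values are hypothetical; the nodeClassRef assumes Auto Mode's built-in NodeClass, whereas self-managed Karpenter would reference an EC2NodeClass in the karpenter.k8s.aws group.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateful                 # hypothetical
spec:
  template:
    spec:
      nodeClassRef:              # assumption: Auto Mode default NodeClass
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  disruption:
    consolidationPolicy: WhenEmpty   # only remove nodes with no workload pods
    consolidateAfter: 10m            # wait before acting on an empty node
    budgets:
      - nodes: "1"                   # at most one node disrupted at a time
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: db-pdb                   # hypothetical
  namespace: production
spec:
  minAvailable: 2                # evictions may never drop below 2 ready pods
  selector:
    matchLabels:
      app: db
```

The NodePool budget limits node-level churn, while the PDB protects the specific workload; the two are complementary rather than interchangeable.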

2. Pod Identity Agent crash-loop. Symptom: pods using the ServiceAccount get 403 AccessDenied from STS. Causes: agent DaemonSet pod CrashLoopBackOff (check kubectl logs -n kube-system -l app=eks-pod-identity-agent), Pod Identity Association pointing at a non-existent IAM role, trust policy missing pods.eks.amazonaws.com principal, or IMDS hop limit too low on the node. Auto Mode handles the agent; on managed node groups confirm the agent add-on is healthy.

3. NodePool pinned to a single AZ. A zonal disruption (control-plane outage in one AZ, ELB endpoint flap) takes the workload with it. Always include topology.kubernetes.io/zone In [a, b, c] in NodePool requirements; combine with topologySpreadConstraints on Deployments.
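
A minimal sketch of the multi-AZ pairing from item 3, assuming a cluster in eu-west-1 with subnets in three zones. All names, the zone list, and the image URI are hypothetical, and the nodeClassRef again assumes Auto Mode's built-in NodeClass.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: multi-az                 # hypothetical
spec:
  template:
    spec:
      nodeClassRef:              # assumption: Auto Mode default NodeClass
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                           # zones may differ by at most 1 pod
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule     # keep pods Pending rather than pile into one zone
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-app:1.0.0
```

ScheduleAnyway is the softer alternative to DoNotSchedule if staying schedulable matters more than an even spread.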

4. gp3 volume detach during node replacement. Auto Mode replaces nodes rather than patching them, so StatefulSets with volumeClaimTemplates should use a StorageClass that sets reclaimPolicy: Retain (which appears on each provisioned PersistentVolume as persistentVolumeReclaimPolicy: Retain) and volumeBindingMode: WaitForFirstConsumer. Otherwise an in-flight reschedule can race with detach and the pod stays ContainerCreating for several minutes.
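
The storage settings from item 4 might look like the sketch below. The StorageClass name, StatefulSet, image, and sizes are hypothetical; the provisioner shown is the one Auto Mode's built-in EBS support uses, while the self-managed EBS CSI driver uses ebs.csi.aws.com.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain               # hypothetical
provisioner: ebs.csi.eks.amazonaws.com   # assumption: Auto Mode
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Retain                    # keep the EBS volume if the PVC is deleted
volumeBindingMode: WaitForFirstConsumer  # create the volume in the pod's zone
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                       # hypothetical
  namespace: production
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:16     # placeholder workload
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp3-retain
        resources:
          requests:
            storage: 20Gi
```

With Retain, volumes outlive their claims, so orphaned EBS volumes need a separate cleanup process.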

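5. --max-unavailable vs PDB collisions. A strict PDB (minAvailable: 100%) allows zero voluntary evictions, so node drains during upgrades or consolidation stall indefinitely, and the stall compounds with any pods a Deployment’s RollingUpdate strategy has already taken out of service. Always set PDB minAvailable such that replicas - minAvailable >= maxUnavailable.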
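
A worked instance of the inequality in item 5, as a sketch with hypothetical names and numbers: 4 replicas, maxUnavailable 1, and minAvailable 3 give 4 - 3 = 1 >= 1, so the condition holds.

```yaml
# replicas = 4, maxUnavailable = 1, minAvailable = 3
# check: replicas - minAvailable = 4 - 3 = 1 >= maxUnavailable (1)  -> OK
# counter-example: minAvailable = 4 (or 100%) gives 4 - 4 = 0 >= 1  -> fails
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                      # hypothetical
  namespace: production
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/api:1.0.0
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb                  # hypothetical
  namespace: production
spec:
  minAvailable: 3                # leaves room for exactly one disruption
  selector:
    matchLabels:
      app: api
```

If replicas is managed by an autoscaler, expressing the budget as maxUnavailable on the PDB instead of minAvailable avoids having to recompute the number as the replica count changes.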

6. Cluster Autoscaler vs Karpenter coexistence. Running both in the same cluster causes thrash. Pick one. Karpenter for new clusters; Cluster Autoscaler only if a vendor product hard-requires it.

7. EKS minor-version upgrade window. AWS supports current + 3 prior minors (~14 months). Letting a cluster slip to N-4 forces emergency upgrade across multiple breaking changes. Schedule quarterly minor upgrades; test in a staging cluster first.

Observability runbook

Enable control-plane logs on an existing cluster (at creation time, pass the same --logging value to aws eks create-cluster):

aws eks update-cluster-config \
  --region eu-west-1 \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'

Alarms we ship:

Alarm | First action
cluster_failed_request_count > 0 (control plane) | Check audit logs for Forbidden / Unauthorized patterns; review IAM Identity mappings
node_status_condition Ready=false on any node | kubectl describe node; check kubelet, CNI, and SSM agent health
Karpenter nodeclaim_disruption_total spike | Inspect NodePool consolidation events; verify PDBs are honored
pod_pending_count > 0 for > 5 min | kubectl describe pod → events; NodePool requirements vs pod tolerations / arch mismatch
ECR image-pull error rate | VPC endpoint health for com.amazonaws.<region>.ecr.dkr; IAM role ecr:GetAuthorizationToken
ADOT Collector otelcol_exporter_send_failed_metric_points | Backend (AMP / Datadog) reachability; collector resource limits

Debug path: “Pod stuck Pending”:

  1. kubectl describe pod <name> → Events. Most common: 0/N nodes are available: insufficient memory or node(s) didn't match Pod's node affinity.
  2. If insufficient resources: confirm Karpenter is provisioning (kubectl get nodeclaims); check NodePool requirements allow the pod’s architecture and instance family.
  3. If affinity mismatch: check NodePool labels match pod’s nodeSelector / affinity.
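  4. Pod Identity is not a scheduling cause: a missing PodIdentityAssociation leaves the pod Running but failing its AWS API calls. If you see that symptom instead, confirm a PodIdentityAssociation exists for (cluster, namespace, serviceAccount).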

Debug path: “Node not ready”:

  1. kubectl describe node <node> → Conditions section. MemoryPressure, DiskPressure, PIDPressure are first signals.
  2. CloudWatch Container Insights → node detail → kubelet logs.
  3. VPC CNI: kubectl logs -n kube-system -l k8s-app=aws-node for IP exhaustion or ENI attach failures.
  4. If on Auto Mode, the node will be replaced automatically — confirm replacement is in progress before manual intervention.

When EKS is NOT the right call

EKS best practices

Resource management

Auto-scaling

Security

Reliability

$0.10/hr: EKS control plane list price (per cluster)
30-40%: Typical cost savings moving x86 node groups to Graviton3/4
1.32: Target Kubernetes minor version on EKS in 2026

Frequently Asked Questions

What is EKS Auto Mode and when should I use it?
EKS Auto Mode (GA December 2024) is a fully managed EKS tier where AWS operates the node pool, networking add-ons, load balancing, and storage controllers for you. It uses a managed Karpenter for fast, cost-aware scaling and patches nodes automatically via an ephemeral-node model (replace, not patch in place). Use Auto Mode when your team wants Kubernetes without node operations; keep managed node groups when you need very specific AMI control, custom kernel modules, or a regulated baseline AMI required by your security team.
How does EKS Pod Identity differ from IRSA, and which should I use in 2026?
Pod Identity (GA 2023, matured through 2025) is simpler and the better choice for most new clusters. IRSA required you to (a) create an IAM OIDC identity provider per cluster, (b) write a trust policy with StringEquals on the cluster OIDC issuer and ServiceAccount name, and (c) annotate the ServiceAccount with the role ARN. Pod Identity replaces all of that with a single create-pod-identity-association call and a Pod Identity Agent that runs on each node. IRSA is still required for (1) Kubernetes clusters outside EKS, such as self-managed clusters on EC2, (2) clusters running Kubernetes <1.24, and (3) the handful of controllers that only accept token-file auth.
What is Karpenter 1.0 and how does it change node scaling?
Karpenter 1.0 (GA 2024) stabilised the NodeClass/NodePool CRDs and added disruption budgets, consolidation policies, and a proper upgrade path for in-place updates. Compared with Cluster Autoscaler, Karpenter picks the cheapest instance type that fits pending pods (across On-Demand, Spot, x86, Graviton, and various sizes), launches fresh nodes for those pods, typically in under a minute, and consolidates underutilised nodes automatically. On EKS Auto Mode, Karpenter is the built-in node autoscaler (pod placement is still done by the Kubernetes scheduler), and you do not manage it directly.
When should I use EKS Hybrid Nodes versus EKS Anywhere?
EKS Hybrid Nodes (GA December 2024) lets you register on-prem or edge Linux hosts as worker nodes to an EKS control plane running in AWS. The control plane stays in AWS; the workers run anywhere. Use Hybrid Nodes when (a) you want one Kubernetes control plane governing cloud and on-prem workloads, (b) data gravity or latency forces compute close to data, or (c) your on-prem workloads are small enough that running a full EKS Anywhere cluster on-site is overkill. Use EKS Anywhere when on-prem needs full isolation: its own control plane, air-gapped operation, or no dependency on AWS connectivity. For most mid-market hybrid customers in 2026 we default to EKS Hybrid Nodes.
How do I secure container images pulled to EKS?
Three layers. (1) Push only to ECR with enhanced scanning enabled, so that Amazon Inspector v2 scans the image and its OS and language-package dependencies for CVEs and attaches EPSS (Exploit Prediction Scoring System) scores to findings. (2) Enforce image signing with Amazon ECR container image signing (AWS Signer), verified on the cluster via a policy engine like Kyverno or Gatekeeper. (3) Pair with Artifact Attestations from your CI (GitHub Actions) so the deploy step verifies SLSA-aligned provenance before calling kubectl apply. For regulated workloads, enable AWS PrivateLink endpoints for ECR so image pulls never transit the public internet.
What is the 2026 best practice for logging and observability on EKS?
The default we deploy: CloudWatch Container Insights enhanced observability for AWS-service-native metrics and cluster control-plane metrics, plus AWS Distro for OpenTelemetry (ADOT) Collector as a DaemonSet forwarding application traces and custom metrics to either Amazon Managed Grafana + Managed Prometheus (AWS-native) or Datadog / New Relic / Honeycomb (third party). EKS audit logs go to a CloudWatch log group with 90-day minimum retention and an S3 archive behind Object Lock for compliance. For Bedrock-heavy workloads, layer Datadog LLM Observability on top.
How does Graviton affect EKS cost and what are the gotchas?
Graviton3 and the newer Graviton4 (m8g/c8g/r8g families) deliver 30-40% better price-performance than comparable x86 for most stateless workloads and essentially all typical microservices. The main gotchas: (1) your container images must be multi-arch (linux/amd64 + linux/arm64), which Docker buildx in CI can produce; (2) some proprietary sidecars (older versions of some APM agents) still lack ARM support; (3) JVM workloads need a JDK with ARM64 support (every modern LTS has it). Karpenter on Auto Mode will bin-pack across x86 (amd64) and Graviton (arm64) automatically if both architectures are allowed in the NodePool.
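
As a sketch of the mixed-architecture setup from the Graviton answer, the NodePool below allows both architectures and a single-arch workload pins itself to x86. Names and the image URI are hypothetical, and the nodeClassRef assumes Auto Mode's built-in NodeClass.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general                  # hypothetical
spec:
  template:
    spec:
      nodeClassRef:              # assumption: Auto Mode default NodeClass
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64", "amd64"]   # Graviton and x86 both allowed
---
# A workload whose image is amd64-only opts out of Graviton explicitly.
apiVersion: v1
kind: Pod
metadata:
  name: legacy-agent             # hypothetical
  namespace: production
spec:
  nodeSelector:
    kubernetes.io/arch: amd64
  containers:
    - name: agent
      image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/legacy-agent:1.0.0
```

Multi-arch images need no nodeSelector at all; the pin is only for the images that have not been rebuilt for arm64 yet.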
