---
title: Amazon Macie + Detective on AWS: Data Security Posture Management and Forensic Investigation in Production
description: Two AWS-native services that close the gap between "we have S3 buckets and security findings" and "we know where regulated data lives and how a threat moved through our environment." This guide covers production deployment of Macie for data-security posture management and Detective for forensic graph investigation, when each is worth the cost, and how to run them as a paired data-discovery + investigation pipeline.
url: https://www.factualminds.com/blog/aws-macie-detective-data-security-investigation/
datePublished: 2026-04-28T00:00:00.000Z
dateModified: 2026-04-29T00:00:00.000Z
author: Palaniappan P
category: Security & Compliance
tags: macie, detective, dspm, data-security, threat-detection, forensics, s3-security, aws
---

# Amazon Macie + Detective on AWS: Data Security Posture Management and Forensic Investigation in Production

> Two AWS-native services that close the gap between "we have S3 buckets and security findings" and "we know where regulated data lives and how a threat moved through our environment." This guide covers production deployment of Macie for data-security posture management and Detective for forensic graph investigation, when each is worth the cost, and how to run them as a paired data-discovery + investigation pipeline.

Two AWS-native services close the gap between "we have S3 buckets, GuardDuty findings, and a security backlog" and "we know where regulated data lives, and we can reconstruct any incident in minutes": **Amazon Macie** for data-security posture management (DSPM) and **Amazon Detective** for forensic graph investigation. Most regulated AWS workloads need both — Macie tells you the data and posture story, Detective tells you the incident story.

This guide is for security architects, SOC engineers, compliance leads, and platform owners running regulated workloads on AWS. It covers how to deploy both services in production, when each is worth the cost, the IAM and KMS configuration that determines what they can see, and the operational pattern of running them as a paired pipeline.

> **Need help running Macie + Detective in production?** FactualMinds runs managed SOC engagements with Macie + Detective + Security Hub Essentials at the centre of the stack. [See our managed SOC service](/services/aws-managed-soc-mdr/) or [talk to our team](/contact-us/).

## Why pair Macie and Detective

Most AWS security tooling answers one of two questions: **what is happening?** (GuardDuty, Security Hub) or **what data is at risk?** (S3 Block Public Access, IAM Access Analyzer). The gap is between them — the **investigation question** ("how did this finding actually unfold across our estate?") and the **data-discovery question** ("does the affected bucket contain regulated data?").

- **Macie** fills the data-discovery gap. It continuously classifies S3 contents, surfaces unencrypted or publicly accessible buckets that contain sensitive data, monitors policy changes that could expose data, and produces findings that Security Hub aggregates alongside GuardDuty.
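- **Detective** fills the investigation gap. It ingests CloudTrail management events, VPC Flow Logs, and GuardDuty findings, plus optional EKS audit logs and Security Hub findings, into a pre-built behavior graph that lets an analyst click from a finding to the activity linked to it. Detective retains up to a year of aggregated data, although most investigations focus on the days or weeks around a finding.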

Together they give a regulated workload the standard "where is the data + what happened to it" story that auditors, regulators, and post-incident reviewers expect.

## Part 1: Amazon Macie in production

### Step 1: Enable Macie at the Organization level

Designate a delegated administrator account (usually your security/audit account) by running the following from the Organizations management account:

```bash
aws macie2 enable-organization-admin-account \
    --admin-account-id 111122223333
```

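Then enable Macie across member accounts from the delegated administrator account. The Macie console offers click-through enablement, and `aws macie2 update-organization-configuration --auto-enable` turns Macie on automatically for accounts that join the organization later. For IaC, use the Terraform `aws_macie2_account` and `aws_macie2_member` resources or the L1 constructs in the AWS CDK `aws_macie` module.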

### Step 2: Configure the discovery scope

Macie has two modes:

- **Automated sensitive-data discovery** — continuously samples a small percentage of objects across all buckets and surfaces statistical findings ("this bucket contains 12% objects with PII identifiers"). Low cost, broad coverage.
- **Sensitive data discovery jobs** — full scan of specified buckets/prefixes on a schedule or one-off. Per-GB pricing; precise findings.

Production pattern: enable automated discovery at the org level for broad visibility, plus targeted jobs against known-sensitive prefixes (production application data, backups, data-lake landing zones, log archives, ML training data) on a weekly or monthly cadence.
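
The targeted-job half of that pattern can be sketched in Python. This is a minimal illustration, not a reference implementation: the bucket name, prefixes, job name, and the helper `build_scoped_job_request` are hypothetical, and the dictionary follows the request shape of the Macie `CreateClassificationJob` API as we understand it, so check the field names against the current API reference before relying on them.

```python
import uuid


def build_scoped_job_request(account_id, bucket, prefixes, day_of_week="SUNDAY"):
    """Build keyword arguments for a weekly Macie job limited to key prefixes.

    Hypothetical helper: it only assembles the request and calls no AWS API.
    """
    if not prefixes:
        # An empty scope would fall back to scanning the whole bucket.
        raise ValueError("at least one prefix is required")
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token for the request
        "name": f"weekly-sensitive-scan-{bucket}",
        "jobType": "SCHEDULED",
        "scheduleFrequency": {"weeklySchedule": {"dayOfWeek": day_of_week}},
        "managedDataIdentifierSelector": "RECOMMENDED",
        "s3JobDefinition": {
            "bucketDefinitions": [{"accountId": account_id, "buckets": [bucket]}],
            "scoping": {
                "includes": {
                    "and": [
                        {
                            "simpleScopeTerm": {
                                "key": "OBJECT_KEY",
                                "comparator": "STARTS_WITH",
                                "values": list(prefixes),
                            }
                        }
                    ]
                }
            },
        },
    }


request = build_scoped_job_request(
    "111122223333", "example-data-lake", ["landing/", "backups/"]
)
```

With boto3 installed and credentials for the Macie administrator account, the request could then be submitted with `boto3.client("macie2").create_classification_job(**request)`. Keeping the builder separate from the API call makes the scope easy to review in code review before any scan cost is incurred.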

### Step 3: Define managed and custom data identifiers

Macie ships ~140 managed identifiers covering common PII, PHI, PCI, financial, credentials, and identifiers (national IDs, passport numbers, IBANs, etc.). For workload-specific identifiers, define custom data identifiers using regex with optional keyword and proximity rules:

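```yaml
# Custom data identifier: internal customer ID format
# (property names follow AWS::Macie::CustomDataIdentifier)
Name: internal-customer-id
Regex: 'CUST-[0-9]{8}-[A-Z]{2}'
Keywords: ['customer_id', 'cust_id']
MaximumMatchDistance: 50
```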

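Custom data identifiers attach to discovery jobs by ID (`customDataIdentifierIds`), while the job's `managedDataIdentifierSelector` setting controls which managed identifiers run (for example `RECOMMENDED`, `ALL`, or an explicit `INCLUDE` or `EXCLUDE` list).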
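
Before registering a custom identifier it helps to test the pattern and keyword rules locally against sample records. The sketch below is a rough local approximation, not Macie's matching engine: it assumes the match distance is measured from the end of a keyword to the end of the regex match, and the function names are hypothetical.

```python
import re

PATTERN = re.compile(r"CUST-[0-9]{8}-[A-Z]{2}")
KEYWORDS = ("customer_id", "cust_id")
MAX_MATCH_DISTANCE = 50  # characters, mirroring the identifier definition


def keyword_end_positions(lowered_text):
    """Return the end offset of every keyword occurrence in the text."""
    ends = []
    for keyword in KEYWORDS:
        start = lowered_text.find(keyword)
        while start != -1:
            ends.append(start + len(keyword))
            start = lowered_text.find(keyword, start + 1)
    return ends


def find_matches(text):
    """Return pattern matches that have a keyword ending at most
    MAX_MATCH_DISTANCE characters before the end of the match."""
    ends = keyword_end_positions(text.lower())
    return [
        match.group()
        for match in PATTERN.finditer(text)
        if any(0 <= match.end() - end <= MAX_MATCH_DISTANCE for end in ends)
    ]


sample = '{"customer_id": "CUST-12345678-AB"}'
matches = find_matches(sample)
```

Running a handful of true and false samples through a check like this catches over-broad patterns early, which matters because every false positive in production becomes a finding someone has to triage.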

### Step 4: KMS key policy for encrypted buckets

For SSE-KMS-encrypted buckets to be scannable, the key policy of the customer managed KMS key must allow the Macie service-linked role in the scanning account to decrypt:

```json
{
  "Sid": "AllowMacieDecrypt",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::111122223333:role/aws-service-role/macie.amazonaws.com/AWSServiceRoleForAmazonMacie"
  },
  "Action": ["kms:Decrypt", "kms:DescribeKey"],
  "Resource": "*"
}
```

Without this entry Macie cannot decrypt the objects: it reports them as unclassifiable in its coverage and job results instead of analysing them. For highly regulated workloads where Macie must never decrypt, withhold the grant or use SSE-C, and accept the loss of automated discovery.
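
A quick way to catch a missing grant before a scan fails is to lint key policies in CI. The following is a simplified sketch under stated assumptions: it looks only for an `Allow` statement that names the Macie service-linked role and includes `kms:Decrypt`, and it ignores `Deny` statements, conditions, and access delegated through IAM policies. The function names are hypothetical.

```python
MACIE_ROLE_SUFFIX = (
    ":role/aws-service-role/macie.amazonaws.com/AWSServiceRoleForAmazonMacie"
)


def _as_list(value):
    """Policy fields may be a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allows_macie_decrypt(key_policy):
    """Return True if some Allow statement grants kms:Decrypt to the Macie role."""
    for statement in _as_list(key_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        if not any(action in ("kms:Decrypt", "kms:*") for action in actions):
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # for example the wildcard principal "*"
        if any(
            str(arn).endswith(MACIE_ROLE_SUFFIX)
            for arn in _as_list(principal.get("AWS"))
        ):
            return True
    return False
```

The policy document itself would come from `kms get-key-policy` (or the equivalent SDK call) and be parsed with `json.loads` before being passed in.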

### Step 5: Route findings to Security Hub and downstream

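Macie policy findings flow automatically to Security Hub when both services are enabled; sensitive data findings are published only after you turn them on in Macie's publication settings. From Security Hub, route critical findings to: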

- **EventBridge** for automated containment (block bucket access, page on-call, file ticket).
- **Amazon Detective** for investigation (when sensitive data exposure correlates with suspicious access patterns).
- **AWS Audit Manager / Config conformance pack evidence** for compliance documentation.

A common containment pattern: an EventBridge rule on Macie finding type `Policy:IAMUser/S3BucketPublic` triggers a Lambda that adds a deny statement to the bucket policy, snapshots the bucket inventory, and notifies the bucket owner.
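
The core of that containment Lambda can be sketched as two pure functions plus the EventBridge pattern that triggers it. This is an illustrative sketch only: the statement ID, the helper names, and the choice to deny every principal outside the owning account are assumptions, and the inventory snapshot and owner notification steps are left out.

```python
import copy

# EventBridge pattern for the Macie policy finding named above.
PUBLIC_BUCKET_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"type": ["Policy:IAMUser/S3BucketPublic"]},
}

CONTAINMENT_SID = "ContainmentDenyExternalAccess"  # hypothetical statement ID


def bucket_from_event(event):
    """Extract the affected bucket name from a Macie finding event."""
    return event["detail"]["resourcesAffected"]["s3Bucket"]["name"]


def with_deny_statement(policy, bucket, owner_account_id):
    """Return a copy of the bucket policy with one containment Deny added.

    Re-running is idempotent: an earlier containment statement is replaced.
    """
    updated = copy.deepcopy(policy) if policy else {"Version": "2012-10-17"}
    existing = updated.get("Statement", [])
    if isinstance(existing, dict):  # a policy may hold a single statement
        existing = [existing]
    statements = [s for s in existing if s.get("Sid") != CONTAINMENT_SID]
    statements.append(
        {
            "Sid": CONTAINMENT_SID,
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {
                "StringNotEquals": {"aws:PrincipalAccount": owner_account_id}
            },
        }
    )
    updated["Statement"] = statements
    return updated
```

In a real handler the returned policy would be serialised with `json.dumps` and written back with the S3 `put_bucket_policy` call. Note that a blanket deny on `aws:PrincipalAccount` also blocks AWS service principals such as log delivery, so test the statement against each bucket's legitimate writers before automating it.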

### Cost management

Macie pricing has three components: bucket monitoring (a flat charge per bucket per month for inventory and policy evaluation), automated discovery (charged on the objects monitored and the data inspected), and job-based discovery (per GB of S3 data scanned). Cost-control patterns:

- Scope discovery jobs to known-sensitive prefixes, not full buckets.
- Use object criteria (file extension, last-modified date, size, tags, key prefix) to skip ephemeral data.
- Run full scans monthly or quarterly; rely on automated discovery between full scans.
- For multi-PB data lakes, scope to landing/raw and ingestion-buffer prefixes; downstream curated layers usually do not need re-scanning.

## Part 2: Amazon Detective in production

### Step 1: Enable Detective with delegated administration

Detective also supports Organizations delegated administration. The pattern matches Macie; from the Organizations management account, run:

```bash
aws detective enable-organization-admin-account \
    --account-id 111122223333
```

Then enable Detective in the administrator account and add member accounts. Detective will start ingesting CloudTrail management events, VPC Flow Logs, and GuardDuty findings from each enabled member, plus any optional source packages you turn on.

### Step 2: Configure data sources

Detective ingests a core package automatically:

- CloudTrail management events.
- VPC Flow Logs.
- GuardDuty findings (every detector type).

Two optional source packages are enabled per behavior graph:

- EKS audit logs.
- AWS security findings forwarded from Security Hub.

You do not wire up each log separately: Detective reads CloudTrail and VPC Flow Logs from the AWS-side log infrastructure through its own streams, so it needs neither a trail nor flow logs configured in your accounts. The remaining configuration is to enable GuardDuty in each member account so that findings exist to ingest, and to switch on the optional packages where you want them.

### Step 3: Investigate from a finding

The standard workflow:

1. A GuardDuty finding fires (e.g. `UnauthorizedAccess:IAMUser/MaliciousIPCaller`).
2. Click "Investigate in Detective" from the finding details.
3. Detective opens the entity graph centered on the principal that triggered the finding.
4. Look at: the principal's behavior during the scope time around the finding compared with its historical baseline, every API call (volume, type, timing), every resource accessed, every linked finding, and every IP the principal connected from.
5. Pivot to related entities (the IP address, the affected S3 bucket, the EKS pod) and see the same graph from their perspective.

The value compared to writing CloudTrail Lake queries: investigation that took 4 hours of SQL takes 15 minutes of clicking. Multi-entity correlation (this principal accessed this bucket from this IP at this time) is built in.
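
When the same triage is scripted, for example to pre-fill a ticket before the analyst opens Detective, the pivot entities can be pulled straight from the GuardDuty finding. The sketch below assumes the camelCase field layout of GuardDuty findings as delivered through EventBridge; the function name and the returned keys are hypothetical.

```python
def pivot_entities(finding):
    """Collect the principal, remote IP, and S3 buckets named in a finding.

    Missing sections yield None or an empty list instead of raising.
    """
    resource = finding.get("resource", {})
    action = finding.get("service", {}).get("action", {})
    key_details = resource.get("accessKeyDetails", {})
    remote_ip = (
        action.get("awsApiCallAction", {})
        .get("remoteIpDetails", {})
        .get("ipAddressV4")
    )
    buckets = [
        bucket["name"]
        for bucket in resource.get("s3BucketDetails", [])
        if "name" in bucket
    ]
    return {
        "principal": key_details.get("userName"),
        "principal_type": key_details.get("userType"),
        "remote_ip": remote_ip,
        "buckets": buckets,
    }
```

Each value in the result corresponds to an entity an analyst would pivot to in step 5: the principal, the source IP, and any affected bucket.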

### Step 4: Use Detective findings groups

Findings groups bundle related findings into a single investigation thread — useful when a single attack chain produces 5-15 GuardDuty findings across multiple resources. The group view shows the chain as a timeline rather than a flat list and lets the analyst document the full investigation in one place.

### Step 5: Integrate with the SOC tooling

Detective integrations:

- **AWS Security Hub** — Security Hub findings from supported AWS sources such as GuardDuty link directly into Detective for investigation.
- **SIEM / SOAR** — Detective integrates with Security Lake so analysts can query the raw logs behind an investigation; Splunk, Sentinel, Chronicle, or Elastic ingest the source findings from Security Hub or Security Lake (OCSF format).
- **Ticketing** — EventBridge rules on the underlying GuardDuty or Security Hub findings file Jira/ServiceNow tickets that include the Detective investigation URL.

### Cost management

Detective pricing is based on the volume of data ingested from each source, metered in tiers per account per region per month. Cost-control patterns:

- Scope which accounts are enabled — production and shared-services first; sandbox accounts last.
- Enable the optional source packages (EKS audit logs, Security Hub findings) only where your investigations use them, since each adds ingestion volume.
- Disable in regions you do not deploy to.

## Operational pattern: paired pipeline

The mature pattern runs Macie and Detective as a paired pipeline:

1. **Continuous discovery** — Macie automated discovery surfaces new sensitive data. Findings flow to Security Hub.
2. **Posture monitoring** — Macie policy findings (public bucket, unencrypted bucket containing PII) trigger immediate containment via EventBridge.
3. **Threat detection** — GuardDuty produces behavioural findings; Security Hub aggregates them with Macie posture findings.
4. **Correlation** — when a GuardDuty finding involves an S3 bucket Macie has flagged as containing sensitive data, custom automation (for example an EventBridge rule plus a Lambda function) elevates the severity and opens an investigation ticket that links to Detective.
5. **Investigation** — analyst opens Detective from the elevated finding, reconstructs the chain, documents the timeline.
6. **Containment** — pre-built runbooks (EventBridge → Step Functions → Lambda) execute the standard containment steps; the analyst overrides only when novel.
7. **Documentation** — investigation notes, Detective screenshots, and Macie scan results form the audit evidence; export to S3 with Object Lock for the regulator-required retention period.
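
The correlation step (step 4) is custom logic rather than a built-in feature, and its decision rule is small enough to sketch. The example below is a minimal sketch: the set of sensitive buckets is assumed to be maintained elsewhere from Macie findings, the elevated severity of 8.0 is a placeholder, and the function name and output keys are hypothetical.

```python
def correlate(finding, sensitive_buckets, elevated_severity=8.0):
    """Decide whether a GuardDuty finding touches a Macie-flagged bucket.

    Returns the matched buckets, the (possibly raised) severity, and whether
    an investigation should be opened.
    """
    touched = {
        bucket["name"]
        for bucket in finding.get("resource", {}).get("s3BucketDetails", [])
        if "name" in bucket
    }
    hits = sorted(touched & set(sensitive_buckets))
    severity = finding.get("severity", 0)
    return {
        "finding_id": finding.get("id"),
        "sensitive_buckets": hits,
        "severity": max(severity, elevated_severity) if hits else severity,
        "open_investigation": bool(hits),
    }
```

A Lambda behind the EventBridge rule could call this with the finding from the event and then update the finding or ticket accordingly. Keeping the rule as a pure function makes it straightforward to unit test before it starts paging anyone.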

## Common pitfalls

1. **Over-scoping Macie.** Running full-bucket scans across petabyte data lakes is expensive and produces low-signal findings. Scope to known-sensitive prefixes and rely on automated discovery for the rest.
2. **Forgetting KMS key policy entries.** Macie cannot scan SSE-KMS buckets without decrypt permission on the customer-managed CMK. Add the policy entry when designing the encryption scheme.
3. **Enabling Detective in dev/sandbox accounts.** The cost-vs-value ratio is poor for ephemeral environments. Enable in security-relevant accounts first.
4. **Treating Macie findings as alerts.** Macie produces both posture findings (which are actionable alerts) and discovery findings (which are inventory data). Build different EventBridge routing for each.
5. **Skipping the integration step.** A Macie finding sitting in the Macie console alongside a GuardDuty finding sitting in the GuardDuty console delivers only part of the value. Route both to Security Hub Essentials and pivot to Detective from there.
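
For pitfall 4, the split between posture and discovery findings can be made on the finding-type prefix. The sketch below assumes Macie's two finding families (`Policy:` and `SensitiveData:`); the route names and the function are hypothetical, and the two patterns use EventBridge prefix matching.

```python
# EventBridge patterns, one rule per route.
POSTURE_PATTERN = {
    "source": ["aws.macie"],
    "detail": {"type": [{"prefix": "Policy:"}]},
}
DISCOVERY_PATTERN = {
    "source": ["aws.macie"],
    "detail": {"type": [{"prefix": "SensitiveData:"}]},
}


def route_macie_finding(finding_type):
    """Map a Macie finding type to a handling route.

    'alert' pages or contains, 'inventory' updates the data catalogue,
    and anything unrecognised goes to manual 'review'.
    """
    category = finding_type.split(":", 1)[0]
    if category == "Policy":
        return "alert"
    if category == "SensitiveData":
        return "inventory"
    return "review"
```

The same rule can live either in the two EventBridge patterns or in a single Lambda that calls `route_macie_finding`; the patterns avoid invoking code for findings that only need to be recorded.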

## Where to go next

- Read the **[Amazon Macie User Guide](https://docs.aws.amazon.com/macie/)** and **[Amazon Detective User Guide](https://docs.aws.amazon.com/detective/)**.
- Pair with **[GuardDuty in production](/blog/aws-guardduty-threat-detection-production-guide/)** and **[Security Hub compliance monitoring](/blog/how-to-set-up-aws-security-hub-compliance-monitoring/)**.
- Browse the **[AWS Security & Compliance hub](/security-compliance/)**, the **[Data Security subtopic](/security-compliance/data-security/)**, and the **[Threat Detection & Response subtopic](/security-compliance/threat-detection/)**.

Macie and Detective are not the most-deployed AWS security services — most teams discover them after a near-miss incident or a regulated audit asks where their PII lives. Deploying them before that conversation, scoped intelligently, is the difference between a one-day investigation and a one-month audit finding.

## FAQ

### When is Amazon Macie worth the cost vs S3 Block Public Access alone?
Block Public Access prevents accidental exposure; Macie tells you what would have been exposed if a bucket had been opened. The two answer different questions and most regulated workloads need both. Macie continuously discovers and classifies sensitive data across your S3 estate (managed identifiers for ~140 PII, PHI, PCI, and financial categories, plus custom regex and keyword identifiers), surfaces unencrypted or publicly accessible buckets containing sensitive data, monitors policy and ACL changes, and feeds findings to Security Hub. The cost (bucket monitoring per bucket per month, automated discovery charges, and per-GB job-based scans) is meaningful at petabyte scale, so most teams scope Macie jobs to known-sensitive prefixes (production application data, backups, data-lake landing zones, log archives that may contain PII spillover) rather than the full account. For HIPAA/PCI/GDPR/DORA scope, Macie is the cheapest credible "where is the regulated data?" answer.

### When should we add Amazon Detective on top of GuardDuty?
Add Detective when finding triage takes more than five minutes per investigation. Detective ingests CloudTrail management events, VPC Flow Logs, and GuardDuty findings, plus optional EKS audit logs and Security Hub findings, and pre-builds a behavior graph (IPs, principals, resources, finding chains) with up to a year of aggregated history. You click into a GuardDuty finding and see the linked behaviour (API calls, network connections, related findings) without writing CloudWatch Logs Insights queries. Most teams hit the cost-vs-time threshold at ~50 GuardDuty findings/week or once a security analyst is dedicated to AWS investigations. Below that, jumping straight to CloudTrail Lake or Athena on Security Lake is acceptable. Detective is also useful for HIPAA / SOC 2 / DORA evidence: investigators can show the auditor the exact timeline of an incident with two clicks rather than three days of log work.

### Macie vs third-party DSPM tools (Wiz, Cyera, Varonis) — when does each win?
Macie wins for AWS-only S3 estates with regulated data: it integrates natively, sends findings to Security Hub, and the per-GB job-scan pricing is reasonable for petabyte-scale data lakes. Macie does not cover non-S3 data stores — RDS, DynamoDB, Aurora, OpenSearch, third-party SaaS — so its scope is "S3 plus what you can pipeline into S3 for scanning." A third-party DSPM tool earns its line item when you span multi-cloud (Azure Storage, GCP GCS), need to discover sensitive data in non-S3 stores (RDS table columns, DynamoDB attributes), need attack-path graphs ("if this bucket is breached, which other resources are at risk?"), or need shadow-data discovery (data outside known-sensitive prefixes). Many regulated AWS-only teams run both: Macie as the continuous evidence engine for compliance, and a third-party for the deeper attack-path and shadow-data analysis.

### Can Macie scan inside encrypted S3 objects?
Macie scans encrypted S3 objects when it can read them, which means the Macie service-linked role must be allowed to decrypt with the relevant KMS keys. For SSE-S3 (Amazon S3 managed keys), Macie reads transparently. For SSE-KMS with AWS managed keys, Macie can typically decrypt without extra configuration. For SSE-KMS with customer managed keys, you must add a key policy entry granting `kms:Decrypt` to the Macie service-linked role; without that, Macie cannot analyse those objects and reports them as unclassifiable. SSE-C (customer-provided keys) cannot be scanned because Macie does not have access to the customer key. Plan key policy for Macie compatibility when designing the encryption scheme; for highly regulated workloads where Macie must never decrypt, SSE-C is acceptable but you lose the data-discovery capability.

### How does Detective handle multi-account AWS Organizations setups?
Detective integrates natively with AWS Organizations through delegated administrator. You designate one account (typically your security/audit account) as the Detective administrator; that administrator can enable Detective across the organization, ingest findings and logs from all member accounts, and run cross-account investigations from a single console. The behavior graph spans every member account so an incident chain that crosses account boundaries (initial access in account A, lateral movement to account B, data staging in account C) is visible as one investigation rather than three disconnected timelines. The cost model is based on the volume of data ingested per account per region per month; for large multi-account deployments, scope which member accounts are enabled (production-critical first, sandbox and ephemeral last) to control spend.

---

*Source: https://www.factualminds.com/blog/aws-macie-detective-data-security-investigation/*
