The average detection coverage rate sits at 24% — and the primary cause is not tool failure but organizational separation. Red teams operate in isolation, blue teams review findings weeks later in a PDF, and the gap between what an adversary actually does and what defenders actually detect remains unmeasured and unmanaged. Purple teaming closes that gap by fusing offense and defense into a single operational cycle where every attack step produces an immediate defensive outcome.

This is part 5 in a series on threat-informed defense. Start with part 1.

The Siloed Team Problem

Part 1 identified siloed purple teams as one of five recurring failure modes in threat-informed defense programs. The pattern is consistent across organizations: a red team executes an adversary emulation, documents findings in a slide deck, delivers it to the blue team in a debrief two weeks later, and moves on. The blue team then tries to reconstruct the attack path from memory and screenshots, writes detection rules based on incomplete context, and tests those rules in isolation. The cycle repeats quarterly — or annually.

Three structural problems emerge from this separation:

| Problem | Mechanism | Impact |
| --- | --- | --- |
| Context decay | Red team context degrades between execution and debrief; blue team lacks real-time visibility into adversary tooling, timing, and variance | Detection rules target the artifact (filename, hash) instead of the behavior (technique, sub-technique) |
| Feedback latency | Weeks elapse between attack execution and detection validation; no opportunity to iterate on a technique in real time | A detection gap exposed in Q1 remains open until Q2 or Q3; mean time to remediate a detection gap averages 87 days (Picus Blue Report 2025) |
| Measurement loss | No per-step observation data; only aggregate pass/fail results recorded | MTTD and MTTR cannot be measured for individual techniques; coverage maps remain estimates |

Purple teaming is not a team structure — it is an operating model that eliminates all three problems by embedding offensive and defensive operators in the same execution loop.

Purple Teaming Defined

A purple team exercise is a structured, time-boxed engagement where offensive and defensive participants execute and observe adversary techniques in real time. The purple team is not a third team. It is the collaboration layer between red and blue — sometimes a dedicated facilitator, sometimes a shared protocol, always a real-time feedback channel.

Three attributes distinguish purple teaming from other security testing:

  • Technique-level granularity. The unit of work is a single ATT&CK technique or sub-technique — not a full kill chain. Each step is executed, observed, measured, and iterated before proceeding.
  • Immediate feedback. The red operator declares the technique before execution. The blue operator confirms detection (or non-detection) within minutes, not weeks. If detection fails, both sides collaborate on root cause while the telemetry is still in the SIEM.
  • Coverage as the output. The deliverable is not a PDF report — it is an updated coverage map with per-technique detection status, gap classification, and a prioritized remediation backlog.

The Purple Team Exercise Lifecycle

A purple team exercise follows a six-phase lifecycle that maps directly to the five-phase threat-informed cycle introduced in Part 1 (Profile → Map → Assess → Emulate → Iterate):

Phase 1: Scope and Threat Profile

Define the threat group or technique set for the exercise. The selection criteria come from the coverage map: prioritize techniques that are (a) high-prevalence in the relevant threat landscape, (b) currently at gap status in the coverage map, or (c) recently added or modified in ATT&CK updates.

Example scope for a financial services organization targeting APT28 (Fancy Bear):

  • T1566.002 — Spearphishing Link
  • T1195 — Supply Chain Compromise
  • T1078 — Valid Accounts
  • T1059.001 — PowerShell
  • T1053.005 — Scheduled Task
  • T1087.001 — Account Discovery: Local Account
  • T1046 — Network Service Discovery
  • T1070.004 — File Deletion

This scope is narrow enough for a single two-day exercise but broad enough to exercise a full attack path, from initial access and execution through persistence, discovery, and defense evasion.

Phase 2: Pre-Exercise Telemetry Verification

Before executing a single technique, verify that the required telemetry sources are active and flowing. Part 4 established the gap classification triad: telemetry gap, detection gap, tuning gap. Running an emulation against a telemetry gap wastes time — the blue team cannot detect what they cannot see.

The pre-exercise checklist confirms:

| Telemetry Source | Required For | Verification |
| --- | --- | --- |
| Sysmon (EID 1, 7, 10, 11) | T1055 Process Injection, T1059.001 PowerShell | Confirm Sysmon service running; verify EID 1 events arriving in SIEM with CommandLine field populated |
| PowerShell ScriptBlock Logging (EID 4104) | T1059.001 obfuscated commands | Execute test ScriptBlock; confirm 4104 events arrive with full script text |
| Windows Security Event Log (EID 4624, 4625, 4672) | T1078 Valid Accounts, T1087 Account Discovery | Generate test logon events; confirm arrival and field mapping |
| Azure AD / Entra ID Sign-in Logs | T1078.004 Cloud Accounts | Verify log export connector or diagnostic settings forwarding to SIEM |
| EDR telemetry | All endpoint techniques | Confirm agent health and event forwarding with a test process creation |

If a telemetry source is missing, pause the exercise for that technique. Document the gap, record it in the backlog, and proceed to techniques where telemetry is present. This discipline separates purple teaming from ad hoc red teaming — every observation is grounded in confirmed data availability.
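
The pre-exercise check is worth scripting so it runs identically before every exercise. The sketch below is a minimal illustration, not a real product API: `run_query` is a hypothetical callable that executes a SIEM search and returns matching events as dicts, and the `TelemetryCheck` type and classification strings are illustrative.

```python
from dataclasses import dataclass

@dataclass
class TelemetryCheck:
    source: str                  # e.g. "Sysmon EID 1"
    query: str                   # SIEM search that should return recent events
    required_fields: list[str]   # fields that must be populated for detection

def verify_telemetry(checks: list[TelemetryCheck], run_query) -> dict[str, str]:
    """Return per-source status before any technique is executed.

    run_query is a hypothetical callable: run_query(query) -> list[dict].
    A source is verified only if events arrive AND required fields are populated.
    """
    results = {}
    for check in checks:
        events = run_query(check.query)
        if not events:
            results[check.source] = "No Telemetry"
        elif any(not e.get(f) for e in events for f in check.required_fields):
            results[check.source] = "Telemetry Present - Fields Missing"
        else:
            results[check.source] = "Verified"
    return results
```

Any source that does not come back "Verified" gets its techniques deferred, exactly as the discipline above prescribes.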

Phase 3: Execute and Observe (The Core Loop)

The core of a purple team exercise is the execute-observe-classify loop, run once per technique:

  1. Declare. Red operator announces the technique ID and expected observable artifacts (e.g., "Executing T1059.001 via Invoke-Expression with base64-encoded payload; expect Sysmon EID 1 with CommandLine containing -EncodedCommand").
  2. Execute. Red operator runs the technique. Timing is recorded.
  3. Observe. Blue operator searches SIEM/EDR for expected signals. Timer starts at execution time.
  4. Classify. Blue operator reports one of four outcomes:
    • Detected — alert fired, correct technique mapped
    • Detected — No Alert — telemetry present, detection logic exists but threshold or context filter suppressed the alert
    • Telemetry Present — No Detection — data is in the SIEM but no rule covers this technique variant
    • No Telemetry — required data source not flowing (should have been caught in Phase 2)
  5. Record MTTD. If detected, measure the time from execution to alert. If not detected, mark MTTD as gap.
  6. Iterate (optional). If the detection failed and time permits, blue operator drafts a detection rule on the spot. Red operator re-executes the technique to validate. This in-exercise iteration is the highest-value activity in purple teaming — it turns a gap into a closed detection within hours instead of months.

A single technique loop takes 10–30 minutes depending on complexity. A two-day exercise with eight-hour execution windows covers 16–48 technique executions, including re-runs for validation.

Phase 4: Gap Triage and Sprint Planning

After the exercise, every technique has a classification. Translate these into the backlog using the Part 4 gap triage framework:

| Classification | Gap Type | Typical Resolution | Priority Signal |
| --- | --- | --- | --- |
| No Telemetry | Telemetry gap | Deploy missing data source (Sysmon, ScriptBlock Logging, CloudTrail) | Always P1 — detection is impossible without data |
| Telemetry Present — No Detection | Detection gap | Author new Sigma rule or SIEM-native detection | P1 if technique is top-10 prevalence; P2 otherwise |
| Detected — No Alert | Tuning gap | Adjust threshold, add context filter, or fix allowlisting error | P2 — tuning is faster than authoring but risks false positives if rushed |
| Detected | None (validated) | N/A — update coverage map to validated status | N/A |

The sprint plan follows the Part 4 emulation-to-sprint loop: receive the exercise report, classify gaps, prioritize by technique prevalence, execute a two-week detection sprint, and re-validate in the next purple team exercise.
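
The triage table is simple enough to encode directly, which keeps prioritization consistent from one exercise to the next. A sketch with illustrative names; the top-10 prevalence threshold and the classification strings mirror the table above.

```python
def triage(outcome: str, prevalence_rank: int) -> tuple[str, str]:
    """Map an exercise classification to (gap type, priority).

    Telemetry gaps are always P1; detection gaps are P1 only for
    high-prevalence techniques; tuning gaps are P2. prevalence_rank
    is the technique's rank in the org's threat landscape (1 = most
    prevalent); top-10 counts as high prevalence.
    """
    if outcome == "No Telemetry":
        return ("telemetry gap", "P1")
    if outcome == "Telemetry Present - No Detection":
        return ("detection gap", "P1" if prevalence_rank <= 10 else "P2")
    if outcome == "Detected - No Alert":
        return ("tuning gap", "P2")
    return ("none", "validated")
```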

Phase 5: Coverage Map Update

Every technique exercised gets its status updated in the organization's coverage map. The coverage map — introduced in Part 1 — tracks per-technique status across three states: detected, mitigated, gap. Purple team exercises add a fourth dimension: validated. A technique marked detected based on rule existence but never exercised is an assumption. A technique marked validated has been exercised against real adversary behavior and confirmed to produce an alert.

The maturity progression from Part 4 maps directly:

  • Level 1 — Ad-Hoc: Coverage map is aspirational; no exercises conducted
  • Level 2 — Mapped: Coverage map exists; rules written but untested against adversary execution
  • Level 3 — Validated: Purple team exercises have confirmed detection for exercised techniques; gaps are documented and prioritized
  • Level 4 — Continuously Validated: Purple team exercises run on a cadence (quarterly or per-ATT&CK-update); new techniques are validated within 30 days of mapping

Most organizations sit at Level 2. Moving to Level 3 requires one discipline: running the execute-observe-classify loop on a recurring basis.
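
One way to keep the validated dimension honest is to make coverage map updates a pure function of exercise outcomes. A hypothetical sketch: each entry tracks a status plus a validated flag, and only a live "Detected" result sets the flag (the mitigated status would come from control mapping, outside this function).

```python
def update_coverage(coverage: dict[str, dict], technique_id: str, outcome: str) -> None:
    """Apply an exercise result to the coverage map in place.

    Each entry tracks {"status": detected|mitigated|gap, "validated": bool}.
    Only a live Detected result promotes a technique to validated; any
    other outcome records a gap and clears the validated flag, so rule
    decay is never hidden behind a stale "detected" entry.
    """
    entry = coverage.setdefault(technique_id, {"status": "gap", "validated": False})
    if outcome == "Detected":
        entry["status"] = "detected"
        entry["validated"] = True
    else:
        entry["status"] = "gap"
        entry["validated"] = False
```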

Phase 6: Iterate

The fifth phase of the threat-informed cycle is iterate — and it is where purple teaming becomes a continuous practice rather than a periodic event. Three iteration triggers restart the lifecycle:

  • ATT&CK update — MITRE releases major ATT&CK updates twice per year (typically April and October). New techniques and sub-techniques invalidate parts of the coverage map. Each update is a trigger for a scoped exercise.
  • Threat landscape shift — A new threat group profile relevant to the organization's vertical (e.g., Volt Typhoon for critical infrastructure, Lazarus for financial services) demands a targeted exercise against that group's technique set.
  • Detection sprint completion — When a two-week sprint closes gaps from the previous exercise, the next exercise validates those closures. This creates a validate-remediate-revalidate cadence.

Running Purple Team Exercises: The Operational Playbook

Beyond the lifecycle, three operational considerations determine whether purple teaming produces lasting value or becomes another shelfware exercise.

Facilitation and Roles

Every purple team exercise needs a facilitator — someone who is neither executing techniques nor writing detections. The facilitator enforces the loop protocol, records observations and timing, and prevents the two most common exercise failures:

  • Scope creep — Part 3 identified this explicitly. A red operator discovers a new attack vector mid-exercise and chases it. The facilitator records the discovery for a future exercise but redirects back to the scoped technique list.
  • Debug drift — A detection fails and both operators spend 90 minutes troubleshooting the SIEM query grammar. The facilitator caps debugging at 15 minutes per technique; unresolved failures go to the backlog.

Role mapping:

| Role | Responsibility | Typical Assignment |
| --- | --- | --- |
| Facilitator | Declare-execute-observe protocol enforcement, timer, scope guard, observation logging | Security architect or detection engineering lead (neutral party) |
| Red Operator | Technique execution, artifact declaration, variant testing | Pen test team member or adversary emulation specialist |
| Blue Operator | SIEM/EDR observation, detection classification, on-the-spot rule authoring | Detection engineer or SOC analyst |
| Observer | Shadow and learn; ask questions during debrief | SOC analysts, incident responders, threat intelligence analysts |

Communication Protocol

The declare-execute-observe loop requires a structured communication channel. Two formats work in practice:

  • Co-located: A shared screen (SIEM dashboard) and verbal declarations. Red operator announces technique, blue operator queries in real time. Fastest iteration — 5–10 minutes per technique.
  • Remote: A dedicated chat channel (Slack, Teams) with a templated message format: [TECHNIQUE] T1059.001 | [VARIANT] EncodedCommand | [OBSERVABLES] Sysmon EID 1, CommandLine contains -EncCommand | [STATUS] Executing. Blue operator replies with classification. Slightly slower but works across time zones and creates an automatic log.

Avoid email-based coordination — the latency destroys the real-time loop that makes purple teaming effective.

Tooling Alignment

Offensive and defensive tooling must be aligned before the exercise. Mismatches waste hours:

| Alignment Check | Failure Mode | Resolution |
| --- | --- | --- |
| SIEM field names match expected detection rules | Sigma rule references CommandLine but SIEM maps it as command_line; detection misfires | Run sigma-cli conversion with correct pipeline before exercise |
| EDR agent supports required telemetry | Sysmon EID 10 (ProcessAccess) not forwarded; T1055 detection impossible | Deploy missing Sysmon configuration or EDR sensor update in Phase 2 |
| Red team tooling matches exercise fidelity tier | Atomic Red Team test creates a benign signal (notepad.exe spawning from PowerShell); CALDERA or C2 framework creates authentic adversary tooling; mismatch on fidelity expectations causes confusion | Agree on fidelity tier per technique before exercise: unit test (Atomic), integration test (CALDERA), or end-to-end (C2 + CTID plan) |
| Network architecture reflects production | Exercise runs in flat lab network; production has segmentation, proxy, Zero Trust policies; detection rules validated in lab fail in production | Use production-adjacent environment or canary testing with limited-scope production execution |
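
The field-name check in the first row can be automated as pre-work. A hypothetical sketch: `FIELD_MAP` stands in for whatever mapping the conversion pipeline applies, and any `UNMAPPED` result means a converted rule would silently never match, which is exactly the failure mode in the table.

```python
# Hypothetical mapping for a SIEM that snake_cases Sysmon field names.
FIELD_MAP = {
    "CommandLine": "command_line",
    "ParentImage": "parent_image",
    "Image": "image",
}

def translate_fields(sigma_fields: list[str], siem_schema: set[str]) -> dict[str, str]:
    """Resolve each Sigma field name to its SIEM column, flagging misses.

    Returns {sigma_field: siem_field or "UNMAPPED"}; an UNMAPPED entry
    must be fixed before the exercise, or the rule never fires.
    """
    out = {}
    for f in sigma_fields:
        mapped = FIELD_MAP.get(f, f)
        out[f] = mapped if mapped in siem_schema else "UNMAPPED"
    return out
```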

Measuring Purple Team Outcomes

Five metrics track the effectiveness of a purple team program over time:

| Metric | Definition | Benchmark (2025) | Target |
| --- | --- | --- | --- |
| Detection Coverage Rate | Percentage of exercised techniques that produced an alert | 24% industry average (Picus Blue Report 2025) | >50% after first exercise cycle; >70% after three cycles |
| Mean Time to Detect (MTTD) | Average time from technique execution to alert generation | CrowdStrike 2026: breakout time is 29 minutes (down from 48 min) | MTTD < breakout time for all high-prevalence techniques |
| Gap Closure Rate | Percentage of identified gaps closed within one sprint cycle (2 weeks) | No industry benchmark; informal estimates suggest 30–40% | >60% for P1 gaps; >40% for P2 gaps |
| Revalidation Pass Rate | Percentage of previously validated techniques that still produce alerts in subsequent exercises | No industry benchmark | >90% (regression below this signals rule decay or infrastructure change) |
| Exercise Velocity | Number of technique executions per exercise day | 10–15 techniques/day (co-located with experienced team) | Stable or increasing; declining velocity signals process friction |

The most revealing metric is the revalidation pass rate. A detection rule that passed in March but fails in June is a regression — typically caused by SIEM schema changes, log source outages, or adversary technique variance. Continuous purple teaming catches regression before a real adversary does.
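
Two of these metrics fall directly out of the per-technique records captured in Phase 3. A minimal sketch, assuming illustrative record fields (`outcome`, `revalidation`) rather than any particular tool's schema.

```python
def exercise_metrics(runs: list[dict]) -> dict[str, float]:
    """Compute detection coverage rate and revalidation pass rate.

    Each run record is {"technique": str, "outcome": str,
    "revalidation": bool}; a run flagged revalidation=True re-tests a
    technique that was validated in a previous exercise, so any such
    run that is not Detected counts as a regression.
    """
    total = len(runs)
    detected = sum(1 for r in runs if r["outcome"] == "Detected")
    revals = [r for r in runs if r["revalidation"]]
    reval_pass = sum(1 for r in revals if r["outcome"] == "Detected")
    return {
        "detection_coverage_rate": detected / total if total else 0.0,
        "revalidation_pass_rate": reval_pass / len(revals) if revals else 1.0,
    }
```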

Exceptions and Limits

Purple teaming is powerful within its domain but carries boundaries worth stating explicitly:

  • LotL techniques resist structured loops. Living-off-the-land techniques (T1059.001 PowerShell, T1078 Valid Accounts, T1046 Network Service Discovery) use legitimate admin tools. The declare-execute-observe loop still works, but detection requires behavioral baselining and contextual correlation — not a single-event Sigma rule. Blue operators must assess whether the observed behavior is within baseline or anomalous, which takes longer and introduces subjectivity. Budget 30–45 minutes per LotL technique instead of 10–15.
  • Cloud and identity layers have different observation models. Endpoint telemetry (Sysmon, EDR) provides rich per-event data. Cloud audit logs (CloudTrail, Azure Activity, Entra ID sign-in logs) are coarser — they record API calls, not process trees. A purple team exercise targeting T1078.004 Cloud Accounts or T1098 Account Manipulation requires the blue operator to work in a log analytics query language (such as KQL against Log Analytics) rather than a SIEM correlation engine. The loop structure is the same; the tooling and query patterns differ.
  • Zero-day techniques are outside scope by definition. Purple teaming validates detection against known techniques. Novel techniques (those not yet in ATT&CK) require a different discipline — threat hunting informed by anomaly detection and threat intelligence. Part 6 in this series covers AI-augmented approaches to this problem.
  • Organizational readiness is a prerequisite, not a result. A team at detection maturity Level 1 (Ad-Hoc) cannot run a productive purple team exercise — they lack the telemetry, detection rules, and coverage map to classify observations. The progression is sequential: map controls to ATT&CK (Level 2), then validate via purple exercises (Level 3). Running purple exercises prematurely produces frustration and no useful output.
  • Exercise fatigue is real. Quarterly exercises covering the same technique set produce diminishing returns once coverage exceeds 70%. At that point, the exercise scope should shift to newly added ATT&CK techniques, threat group profiles not yet emulated, or cloud/identity layers not yet exercised. Stagnation kills program momentum.

Honest Assessment

| Dimension | Siloed Red + Blue (Traditional) | Purple Team Operations |
| --- | --- | --- |
| Feedback latency | Weeks to months (report delivery cycle) | Minutes (real-time declare-observe loop) |
| Detection coverage visibility | Estimated from rule counts; unvalidated | Measured per technique; validated against adversary execution |
| Gap classification accuracy | Inferred from post-exercise reconstruction | Classified live with telemetry confirmed in Phase 2 |
| Mean time to close a detection gap | 87 days average (Picus 2025) | Hours for in-exercise iteration; 2 weeks for sprint-triaged gaps |
| Regression detection | None — rules are written and forgotten | Revalidation pass rate catches rule decay proactively |
| Organizational cost | Low per event; high cumulative (repeat testing without progress) | Higher per exercise (dedicated facilitation, 2-day block); lower cumulative (each exercise moves the coverage map forward) |

The trade-off is upfront investment versus compounding returns. The first purple team exercise costs more in coordination and facilitation than a traditional red team engagement. The third exercise costs less than a red team retest — because the coverage map has progressed, the scope narrows to net-new techniques, and the operating rhythm is established.

Actionable Takeaways

  • Run the declare-execute-observe loop before building anything new. The quickest win in purple teaming is confirming which existing detections actually fire against adversary execution. The coverage map correction from assumption to validation often reveals that 30–40% of "detected" techniques are actually tuning gaps.
  • Appoint a facilitator and protect the scope. Scope creep and debug drift destroy more purple team exercises than any other factor. A neutral facilitator with a timer and a technique list is the single highest-leverage role in the exercise.
  • Verify telemetry before executing techniques. Phase 2 is not optional. Running an exercise against missing telemetry generates "No Telemetry" classifications that should have been caught in pre-work. Every "No Telemetry" result is a wasted technique slot.
  • Track revalidation pass rate as the leading health indicator. Coverage at a single point in time is a snapshot. Coverage that stays stable across exercise cycles is a program. If revalidation pass rate drops below 90%, investigate immediately — rule decay is often the first sign of a SIEM or EDR configuration change that will affect real incident detection.
  • Transition from periodic events to a continuous cadence. One purple team exercise per year is a project. One per quarter is a practice. One per ATT&CK update cycle is a program. The iteration trigger model (ATT&CK update + threat landscape shift + sprint completion) provides a natural cadence that is event-driven rather than calendar-driven.

Part 6 will cover AI-augmented threat profiling — using large language models and graph analysis to extend threat-informed defense beyond the known technique catalog.