Does Your Defense Actually Work? Here Is How to Find Out

Annual penetration test

✗ Point-in-time snapshot of a dynamic environment

✗ Does not confirm whether detections fire

✗ Does not test analyst execution under pressure

✗ Generic scope, not adversary-specific

✗ Outdated before the report is finalized

✗ 11-month gap between assessments

Continuous validation program

✓ Ongoing detection efficacy testing against specific TTPs

✓ Purple team: offense teaches defense in real time

✓ Summiting the Pyramid scores detection robustness

✓ BAS provides automated continuous validation

✓ Adversary-specific scope driven by threat intelligence

✓ Synchronized with current threat landscape

The military does not maintain unit readiness with an annual evaluation and a twelve-month gap. It runs a continuous training cycle: individual skills, small unit exercises, large-scale training rotations against realistic Opposing Forces, and after-action reviews that feed the next cycle. The National Training Center exists specifically to give units the experience of operating against a realistic, capable adversary before they encounter one in actual combat. Gaps are found in the exercise — not in the operation.

Purple team — the NTC for security operations

Purple teaming is the security equivalent of an NTC rotation, and the distinction from a standard red team engagement is important. This is not adversarial. It is collaborative. The red team executes specific, documented adversary TTPs against the real environment. The blue team attempts to detect and respond. Both teams work through the exercise together — the red team explicitly teaching the blue team what they are missing in real time. The output is not a vulnerability list. It is a detection efficacy map: here is where our visibility begins, here is exactly where it fails, here is the specific detection logic we need to build against this technique.
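One way to picture the detection efficacy map is as a simple grouping of exercise results. The sketch below is a minimal illustration, not a prescribed format: the `ExerciseResult` record, its field names, and the sample results are all hypothetical. The technique IDs are real ATT&CK identifiers, but the outcomes attached to them are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExerciseResult:
    """One red-team action and what the blue team saw (hypothetical schema)."""
    technique_id: str     # ATT&CK technique the red team executed
    telemetry_seen: bool  # some log source recorded the activity
    alert_fired: bool     # a detection rule raised an alert


def classify(result: ExerciseResult) -> str:
    """Place a result in one of three visibility buckets."""
    if result.alert_fired:
        return "detected"
    if result.telemetry_seen:
        return "visible_no_alert"  # data exists, detection logic is missing
    return "blind"                 # no visibility at all


def efficacy_map(results: list[ExerciseResult]) -> dict[str, list[str]]:
    """Group technique IDs by how far visibility extended."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for result in results:
        grouped[classify(result)].append(result.technique_id)
    return dict(grouped)


# Invented outcomes for illustration only.
RESULTS = [
    ExerciseResult("T1566.001", telemetry_seen=True, alert_fired=True),
    ExerciseResult("T1059.001", telemetry_seen=True, alert_fired=False),
    ExerciseResult("T1003.001", telemetry_seen=False, alert_fired=False),
    ExerciseResult("T1021.002", telemetry_seen=True, alert_fired=False),
]

if __name__ == "__main__":
    for bucket, techniques in efficacy_map(RESULTS).items():
        print(f"{bucket}: {', '.join(techniques)}")
```

The middle bucket is the one a vulnerability list never shows: techniques where the telemetry already exists and only the detection logic has to be written.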

Summiting the Pyramid (MITRE Center for Threat-Informed Defense · December 2024): Applied to validation, this methodology scores each detection rule by its behavioral dependencies — what specific artifacts or behaviors does this rule require to fire, and how resistant is it to adversary evasion? Purple team findings mapped through Summiting the Pyramid tell you not just what you missed, but how structurally robust your detections are against an adversary who adapts. A detection that fires on a specific hash will be bypassed tomorrow. A detection scored high on the Summiting the Pyramid scale catches the same adversary technique regardless of which tool they use to execute it.
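The scoring idea can be sketched in a few lines. This is a deliberately simplified stand-in and not the official Summiting the Pyramid scoring model: the observable categories and the 1 to 5 values in `ROBUSTNESS` are assumptions chosen for illustration. The one principle it tries to capture is that a rule requiring several observables together is only as robust as the easiest one for the adversary to change.

```python
# Hypothetical robustness levels: higher means harder for an adversary to
# change without abandoning the technique. Categories and values are
# illustrative assumptions, not the published scoring levels.
ROBUSTNESS = {
    "file_hash": 1,               # changes with a one-byte edit
    "file_name": 1,               # trivially renamed
    "tool_specific_artifact": 2,  # tied to one adversary tool
    "os_native_behavior": 4,      # behavior shared by many implementations
    "technique_invariant": 5,     # any implementation must exhibit it
}


def rule_score(required_observables: list[str]) -> int:
    """Score a rule whose conditions must ALL match (logical AND).

    Evading the weakest required observable evades the whole rule,
    so the score is the minimum level across the observables.
    """
    if not required_observables:
        raise ValueError("a rule must depend on at least one observable")
    return min(ROBUSTNESS[name] for name in required_observables)


if __name__ == "__main__":
    print(rule_score(["file_hash"]))                       # hash-only rule
    print(rule_score(["technique_invariant"]))             # behavior-only rule
    print(rule_score(["technique_invariant", "file_name"]))  # mixed rule
```

The third call is the case worth noticing: adding a file name condition to an otherwise robust behavioral rule drags the score down to the level of the file name, because renaming the file is enough to evade it.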

Breach and Attack Simulation — continuous, not periodic

Breach and Attack Simulation extends purple teaming to continuous automated validation. Not quarterly exercises but ongoing testing of whether your detection rules fire against specific techniques as your environment changes and as new adversary techniques are documented. When a new technique is attributed to a threat actor targeting your sector, you should be able to test your detection coverage against it within days — not at next year's assessment. BAS is not a replacement for human-led purple teaming. It is the continuous layer that keeps your detection posture synchronized with the actual threat landscape between exercises.
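The "within days" requirement amounts to a freshness check on validation results. The sketch below assumes a hypothetical record of when each technique was last validated and a seven-day window. It does not use any real BAS product's API, and the dates are invented.

```python
from datetime import date, timedelta


def techniques_due(
    threat_techniques: list[str],
    last_validated: dict[str, date],
    today: date,
    max_age_days: int = 7,  # assumed freshness window
) -> list[str]:
    """Return techniques that were never validated or whose last
    validation is older than the freshness window."""
    cutoff = today - timedelta(days=max_age_days)
    due = []
    for technique_id in threat_techniques:
        tested_on = last_validated.get(technique_id)
        if tested_on is None or tested_on < cutoff:
            due.append(technique_id)
    return due


if __name__ == "__main__":
    # Techniques attributed to the adversary of interest (illustrative).
    threat_techniques = ["T1059.001", "T1003.001", "T1055"]
    # Invented validation history; T1055 is newly attributed and untested.
    last_validated = {
        "T1059.001": date(2025, 1, 14),
        "T1003.001": date(2024, 12, 1),
    }
    print(techniques_due(threat_techniques, last_validated, date(2025, 1, 15)))
```

A newly attributed technique has no validation date, so it lands in the queue immediately instead of waiting for the next scheduled exercise.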

The adversary is also using automated evasion testing: probing detection environments before executing operations and modifying their approach when something triggers an alert. If your detection rules are based on historical signatures, they can be mapped and bypassed before the operation begins. Behavioral detection built at the top of David Bianco's Pyramid of Pain resists this, because an adversary cannot change their fundamental operational methodology without rebuilding from scratch. Continuous validation confirms you are building at the right level.

The distinction that matters: A red team finds vulnerabilities. A purple team finds gaps in your ability to detect and respond to the specific adversary most likely to target you. Both are valuable. They answer fundamentally different questions. Organizations that substitute one for the other are answering the wrong question about their security posture.

MITRE ENGAGE — adversary engagement and deception

MITRE ENGAGE maps the adversary engagement layer that most organizations have not yet built into their validation program: denial, deception, and adversary manipulation. Where ATT&CK maps what the adversary does and D3FEND maps defensive countermeasures, ENGAGE maps how defenders can actively shape adversary behavior, using deception to slow, misdirect, and gather intelligence on adversary operations in the defender's own environment. This is the active side of validation: not just confirming that defenses catch adversary techniques, but confirming that the environment can engage the adversary when they arrive.

The commander's estimate — the deliverable that matters

The output a mature validation program produces is not a vulnerability list. It is a commander's estimate for the security program: here is our detection coverage against our primary adversary's most likely course of action, here is where we are exposed, here is the prioritized investment to close those gaps, and here is what that investment buys us in terms of reduced specific risk. That is a defensible, prioritized security posture — not a compliance score. It is the answer to the question every board should be asking: if our primary adversary executed their most likely course of action against us today, where would we see them, and where would we be blind?
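The estimate can be reduced to two numbers a board can read: how much of the adversary's most likely course of action is covered, and which gaps to close first. The sketch below is one possible framing under stated assumptions. The integer likelihood weights, the function name, and the sample course of action are all hypothetical.

```python
def commanders_estimate(
    course_of_action: list[tuple[str, int]],
    validated_detections: set[str],
) -> dict[str, object]:
    """Summarize coverage against an adversary's most likely course of action.

    course_of_action: (technique_id, likelihood weight) pairs, where the
        weight is an analyst-assigned integer (an assumption of this sketch).
    validated_detections: technique IDs with a detection confirmed to fire.
    """
    total = sum(weight for _, weight in course_of_action)
    if total == 0:
        raise ValueError("course of action has no weighted techniques")
    covered = sum(
        weight for technique, weight in course_of_action
        if technique in validated_detections
    )
    gaps = sorted(
        (item for item in course_of_action if item[0] not in validated_detections),
        key=lambda item: item[1],
        reverse=True,  # most likely undetected technique first
    )
    return {
        "weighted_coverage": round(covered / total, 2),
        "prioritized_gaps": [technique for technique, _ in gaps],
    }


if __name__ == "__main__":
    # Invented weights for illustration only.
    coa = [("T1566.001", 9), ("T1059.001", 8), ("T1003.001", 6), ("T1021.002", 5)]
    print(commanders_estimate(coa, validated_detections={"T1566.001"}))
```

Weighting by likelihood rather than counting techniques keeps the estimate tied to the specific adversary: a gap in a technique they rarely use ranks below a gap in one they use on every operation.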

Validation is not an event. It is the continuous practice of finding out whether your defenses actually work, before your adversary finds out for you. The military learned this lesson and built the NTC to act on it. Enterprise security is still treating the annual pentest as the answer to a question it was never designed to answer.
