Great Britain's cycling
team went from one of the worst-performing teams in the world to dominating the
Gold Medal count in the Beijing Olympics. Their breakthrough theory? Do all the
little things 1% better.
As we look towards making
wholesale improvements, maybe the key is to home in on the seemingly subtle
things. When measuring a Security Operations Center's (SOC's) effectiveness,
two primary metrics come to mind: Mean Time to Detect (MTTD) and Mean Time to
Respond (MTTR).
Here's the question: Can
making small improvements to these two foundational metrics really have a
significant impact on SOC team performance? If the history of British Cycling
is to be believed, then yes.
How to Measure SOC Team Performance
Every practitioner has a
general sense of how their SOC is performing, but to look at things
analytically, there is a formula that can be employed. Work out your "SOC
Capacity," work out your "Expected Work," and then spot the gaps, if
any.
As Grant Oviatt, Head of
Security Operations at Prophet Security, notes, "SOC Capacity measures how much total
available time your team has to disposition security alerts," while expected
work is "the total amount of alert management work you expect in a given
month." Subtract one from the other, and you see how well your SOC is doing - or
isn't.
Calculate SOC Capacity
To figure out how much
time your SOC has to attend to business, you can generally assume that 70% of
each analyst's work time will actually go towards security tasks, the rest being
given over to breaks, meetings, and perhaps even distraction. Now, you
multiply those hours (reduced to 70%) by how many analysts you have, and
voilà! You've got the total number of hours your team can "do work."
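As a rough sketch of that arithmetic, assuming a hypothetical team of five
analysts who are each scheduled for 160 hours a month (both figures are
illustrative, and the function name is made up for this example):

```python
def soc_capacity_hours(analysts, hours_per_analyst, utilization=0.70):
    """Hours per month the team can actually spend on security work.

    utilization is the share of scheduled time that goes to security
    tasks; 0.70 is the article's rule of thumb.
    """
    if analysts < 0 or hours_per_analyst < 0:
        raise ValueError("analysts and hours_per_analyst must be non-negative")
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be between 0 and 1")
    return analysts * hours_per_analyst * utilization


# Hypothetical team: 5 analysts, 160 scheduled hours each per month.
capacity = soc_capacity_hours(analysts=5, hours_per_analyst=160)
print(f"SOC Capacity: {capacity:.0f} hours/month")
```

With these assumed inputs the team has about 560 usable hours a month; swap
in your own headcount and schedule.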
Calculate Expected Work
Now, you have to figure
out just how much work there is to do. Multiply the Mean Time to Respond (how
much time you're spending on each alert) by the average number of alerts you
receive in a month, and you have that number, too.
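The same multiplication can be sketched in a few lines. The alert volume and
the per-alert handling time below are assumed values for illustration only,
and the function name is hypothetical:

```python
def expected_work_hours(alerts_per_month, mean_minutes_per_alert):
    """Monthly alert workload in hours: alert count x average handling time."""
    if alerts_per_month < 0 or mean_minutes_per_alert < 0:
        raise ValueError("inputs must be non-negative")
    return alerts_per_month * mean_minutes_per_alert / 60


# Hypothetical month: 1,500 alerts at an average of 20 minutes each.
expected_work = expected_work_hours(alerts_per_month=1500, mean_minutes_per_alert=20)
print(f"Expected Work: {expected_work:.0f} hours/month")
```

Keeping the handling time in minutes and converting once at the end avoids
mixing units when the result is later compared with capacity in hours.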
Are you on track? Or at a deficit?
At this point, you're
ready to see how your SOC's capacity stacks up against the work it has to do.
Subtract Expected Work from SOC Capacity (or just compare the two) and see what
you're left with.
In a perfect world, your
SOC would have 15% more time than its Expected Workload. This means that even
in times of alert spikes, your team will be able to handle them. Unfortunately,
many teams will find themselves in a deficit, with Expected Work far exceeding
their capacity to do it. For times like these, turn to managed
security providers, automation, and AI-driven assistance.
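One way to sketch the comparison, assuming capacity and expected work are
both already expressed in hours per month. The 15% buffer comes from the
paragraph above; the status labels ("healthy", "tight", "deficit") and the
sample numbers are assumptions made for this example:

```python
def capacity_report(capacity_hours, expected_work_hours, buffer=0.15):
    """Compare capacity with expected work plus a safety buffer."""
    surplus = capacity_hours - expected_work_hours
    target = expected_work_hours * (1 + buffer)  # capacity needed for the buffer
    if capacity_hours >= target:
        status = "healthy"   # room to absorb alert spikes
    elif surplus >= 0:
        status = "tight"     # work fits, but the buffer is missing
    else:
        status = "deficit"   # more work than available time
    return {"surplus_hours": surplus, "target_hours": target, "status": status}


# Hypothetical inputs: 560 hours of capacity against 500 hours of work.
report = capacity_report(capacity_hours=560, expected_work_hours=500)
print(report["status"], report["surplus_hours"])
```

In this made-up case the team covers its workload with 60 hours to spare but
falls short of the roughly 575 hours the 15% buffer would call for.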
Take a good, hard look at
your metrics and see where you can cut back, improve, and tighten.
Analyzing MTTD (Mean Time to Detect)
Mean Time to Detect is
"the average time it takes to discover a security threat or incident." You can
figure out what yours is by subtracting the "Activity Started At" time from the
"Alerted At" time (for the earliest detection in each incident) and averaging
the results. This should obviously be as close to "instantaneous" as possible,
but we all know that's a tough climb. However, a good rule of thumb is to stay
between 0 and 4 hours.
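A minimal sketch of that calculation, assuming each incident is recorded as a
pair of timestamps for when the activity started and when the earliest alert
fired. The timestamps are invented for illustration, and the variable names
are not taken from any particular tool:

```python
from datetime import datetime
from statistics import mean

# (activity_started_at, alerted_at) pairs; all values are made up.
incidents = [
    (datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 9, 0)),
    (datetime(2024, 5, 3, 22, 0), datetime(2024, 5, 4, 0, 30)),
    (datetime(2024, 5, 10, 13, 0), datetime(2024, 5, 10, 15, 30)),
]


def mttd_hours(pairs):
    """Average of (alerted_at - activity_started_at) across incidents, in hours."""
    delays = [
        (alerted - started).total_seconds() / 3600
        for started, alerted in pairs
    ]
    return mean(delays)


mttd = mttd_hours(incidents)
within_rule_of_thumb = 0 <= mttd <= 4  # the 0 to 4 hour guideline
print(f"MTTD: {mttd:.2f} hours, within guideline: {within_rule_of_thumb}")
```

Because the average can hide a single very slow detection, it may also be
worth looking at the individual delays rather than the mean alone.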
While a four-hour ceiling seems
reasonable, when we're dealing with a cyber talent shortage, barrages of new
AI-generated malware, swamps of alerts, and complex technology that often ends
up as shelfware, these numbers can seem
like unreachable unicorns.
Keep reaching. There are
ways to bring these numbers down, and the first step is information. Find out
what is making your MTTD lag; the answer isn't always "more resources."
Sometimes, you can clean up and do more with less.
First, make sure what
you're doing is working. Are your detection investments catching a lot of
incidents? If so, keep investing there. If not, make some changes. Also, for a
more granular look, spend some time investigating a few samples after you successfully
chase down detected threats. What was the first indication of that threat? Is
there anything you can do to double down on detecting that behavior in the
future?
Enhancing MTTR (Mean Time to Respond)
Mean Time to Respond is
the average time it takes to remediate a threat and "get things under control."
This takes in:
- The time it takes to receive security
intelligence (telemetry)
- The time it takes to see and act on the
alert
- The time it takes to triage and investigate
- The time it takes to contain the threat (at
least initial containment)
All these factors (and
times) combined make up how speedy your SOC is at responding to
threats, on average. Give yourself between 1 and 8 hours to pin these down: less
for critical incidents, more for less-severe ones. However,
nothing, not even low-profile incidents, should exceed a full work day.
Again, the time should be
as close to zero as possible, and, again, that can seem like an impossible
dream.
But it doesn't have to
be. First, look for inefficiencies in the way you're doing things, and you may
be surprised. Are certain alert types generating faster responses? Have your
SOC lean into those (and find out why). Then, look into the individual steps
outlined above and see if any are taking an inordinate amount of time. What on
the outside may appear as "one big number" (and one you may not like) is
made up of a lot of smaller metrics that can be identified, isolated, and
improved upon.
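To illustrate breaking the "one big number" apart, here is a small sketch that
averages each of the four stages listed earlier and points at the slowest one.
The stage names and the hour values are hypothetical, chosen only to show the
idea:

```python
from statistics import mean

# Hypothetical stage names mirroring the four steps listed earlier.
STAGES = ("telemetry", "acknowledge", "investigate", "contain")

# Hours spent per stage for each incident; all values are made up.
incidents = [
    {"telemetry": 0.25, "acknowledge": 0.5, "investigate": 2.0, "contain": 1.0},
    {"telemetry": 0.25, "acknowledge": 1.5, "investigate": 4.0, "contain": 0.5},
]


def mttr_breakdown(records):
    """Return (per-stage mean hours, overall MTTR, slowest stage name)."""
    per_stage = {s: mean([r[s] for r in records]) for s in STAGES}
    mttr = sum(per_stage.values())
    slowest = max(per_stage, key=per_stage.get)
    return per_stage, mttr, slowest


per_stage, mttr, slowest = mttr_breakdown(incidents)
print(f"MTTR: {mttr:.2f} hours, slowest stage: {slowest}")
```

In this invented data set, investigation accounts for most of the five-hour
average, which suggests where to look first.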
Conclusion
Mixed with AI-based
automation and technology, auditing your Mean Time to Detect and Mean Time to
Respond will bring some insight into your security strategy that you likely
didn't have before. Even if the eventual plan is to implement these additional
technologies (which would be wise, given the threat landscape), knowing your
MTTD and MTTR - their weak spots, strengths, and capacities - can help you
apply those technologies in the most efficient manner.
As cybercriminals double
down on offense, teams can find success by taking a fine-toothed comb to
defense. With an eye towards getting just 1% better, they can find hidden
opportunities to improve, notice slack that can be pulled in, and uncover ways
they can lower MTTD and MTTR times to optimize SOC team performance.
ABOUT THE AUTHOR
An ardent believer in personal data
privacy and the technology behind it, Katrina Thompson is a freelance writer
leaning into encryption, data privacy legislation, and the intersection of
information technology and human rights. She has written for Bora, Venafi, Tripwire, and
many other sites.