By Howard Ting, CEO, Cyberhaven
Janice just saved an unreleased earnings
report from the company's CFO folder to her team's OneDrive. It is then
downloaded by John, who emails it to Rebecca, who then shares it again and
saves it to her personal cloud drive. Organizations are largely defined by how
well they can leverage their most important data, and they are tasked with
balancing ease of use and collaboration against security. Suppose you want to find where all
the copies and derivatives of this data are in your enterprise. Or, you want to
ensure that no data from the CFO folder is being uploaded to risky or
unapproved locations (such as Rebecca's personal cloud). Traditional Data Loss
Prevention (DLP) tools fall short here. Tracing data flows on a graph that
connects all events for each piece of data, by contrast, makes it possible to
follow Janice, John and Rebecca as they move a key piece of information around
the company, and eventually outside the company.
The Evolution of Data Protection
Traditional DLP tools have relied on two main strategies
for identifying and controlling data: content-based signatures and content
tagging. Many DLP products try to apply signatures to identify sensitive
content, but unlike an antivirus signature that is designed to detect a very
specific threat, organizations cannot write a signature for every single
document that they want to protect. Instead, they need signatures to apply to
an entire class of documents, which may differ enough that it is almost
impossible to define signatures that will apply to every file. If signatures
are too specific, they will fail to detect sensitive information (false
negatives), and if they are too broad, they will incorrectly flag non-sensitive
content (false positives). Both outcomes are unacceptable.
This problem is compounded as organizations try to apply data protections to
more diverse types of data and grows even more complicated as users edit and
modify content, making it harder for signatures to keep up. Ultimately, this
means signatures are only able to protect a small minority of the most
predictable, structured enterprise data, such as credit card numbers, while
everything else remains unprotected.
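The trade-off between overly specific and overly broad signatures can be illustrated with a small sketch. The documents, their sensitivity labels, and both regular expressions below are hypothetical, chosen only to show the two failure modes; real DLP signatures are considerably more elaborate.

```python
import re

# Hypothetical sample documents: name -> (text, is_actually_sensitive).
DOCUMENTS = {
    "q3_earnings_draft.txt": ("Q3 earnings report (unreleased): revenue up 12%", True),
    "q3_results_notes.txt": ("Notes on third-quarter results before announcement", True),
    "newsletter.txt": ("Read our public report on industry earnings trends", False),
}

# Too specific: matches only one exact phrasing of the sensitive content.
NARROW = re.compile(r"Q3 earnings report \(unreleased\)")
# Too broad: matches any mention of earnings or results.
BROAD = re.compile(r"earnings|results", re.IGNORECASE)


def evaluate(signature):
    """Return (false_negatives, false_positives) as sorted lists of file names."""
    false_negatives, false_positives = [], []
    for name, (text, sensitive) in DOCUMENTS.items():
        flagged = signature.search(text) is not None
        if sensitive and not flagged:
            false_negatives.append(name)   # sensitive content slipped through
        elif flagged and not sensitive:
            false_positives.append(name)   # harmless content was flagged
    return sorted(false_negatives), sorted(false_positives)


if __name__ == "__main__":
    print("narrow:", evaluate(NARROW))
    print("broad: ", evaluate(BROAD))
```

In this toy data set the narrow pattern misses the differently worded notes file, while the broad pattern flags the public newsletter.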
Tagging allows security teams to add specific
identifiers or tags to sensitive assets. Security staff or the end-users
themselves can apply tags to content to track and control where it goes, but
tagging only protects data that the security team knows about and has tagged in
advance. Other copies of the same data may not be tagged and controlled.
Additionally, end-users may make mistakes when applying tags or forget to apply
them altogether. Users may inadvertently or intentionally remove tags, which
prevents the data from being tracked. Data can also be converted into file
formats that don't support tags, such as CSV.
A Better Approach to Enterprise Data
Applying graph analysis to this problem
changes all of this. It can be thought of as an omniscient system of cameras
that tracks and correlates every movement of data within an organization.
Unlike the traditional signature and tagging-based approaches that have dominated
DLP for years, graph analysis enables organizations to see and
control their data and risk in a new light. Instead of defining data only by
its bytes, graph analysis of data flows lets us understand the value of data in
a business context.
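As a rough illustration of tracing data flows on a graph, the sketch below replays the Janice, John and Rebecca scenario from the introduction. The event tuples, the location naming scheme, and the list of unapproved destinations are all assumptions made for this example and do not describe any particular product.

```python
from collections import defaultdict, deque

# Hypothetical event log: (actor, action, source location, destination location).
EVENTS = [
    ("Janice", "save", "sharepoint:/CFO/earnings.xlsx", "onedrive:/team/earnings.xlsx"),
    ("John", "download", "onedrive:/team/earnings.xlsx", "laptop:john/earnings.xlsx"),
    ("John", "email", "laptop:john/earnings.xlsx", "mailbox:rebecca/earnings.xlsx"),
    ("Rebecca", "upload", "mailbox:rebecca/earnings.xlsx", "personal-cloud:rebecca/earnings.xlsx"),
    # Unrelated activity that should not appear in the earnings report's lineage.
    ("Rebecca", "save", "sharepoint:/HR/handbook.pdf", "laptop:rebecca/handbook.pdf"),
]

# Assumed policy: any location with one of these prefixes is unapproved.
UNAPPROVED_PREFIXES = ("personal-cloud:",)


def build_graph(events):
    """Build a directed graph: source location -> list of (destination, actor, action)."""
    graph = defaultdict(list)
    for actor, action, src, dst in events:
        graph[src].append((dst, actor, action))
    return graph


def trace(graph, origin):
    """Breadth-first search from origin; return every derived location in discovery order."""
    seen = {origin}
    queue = deque([origin])
    order = []
    while queue:
        node = queue.popleft()
        for dst, _actor, _action in graph.get(node, []):
            if dst not in seen:
                seen.add(dst)
                order.append(dst)
                queue.append(dst)
    return order


copies = trace(build_graph(EVENTS), "sharepoint:/CFO/earnings.xlsx")
risky = [loc for loc in copies if loc.startswith(UNAPPROVED_PREFIXES)]

if __name__ == "__main__":
    print("copies and derivatives:", copies)
    print("unapproved destinations:", risky)
```

The traversal answers both questions raised earlier: where every copy of the file went, and whether any copy reached an unapproved location.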
With this sort of omniscience, security teams
can see the full story behind every piece of data and all its
derivatives in the enterprise without having to do any upfront tagging or
signature development. The graph builds the context automatically. Graph
analysis can be applied passively to any data, even without inspecting
the content itself. Likewise, since graph analysis knows where data is
coming from, we can selectively inspect data from a corporate source while
ignoring data coming from an employee's personal online banking portal. This
can be extremely powerful in that it allows organizations to apply strong
controls without having to expose sensitive content for analysis or
run into potential data privacy issues.
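The idea of deciding what to inspect from where data came from, rather than from its content, can be sketched as follows. The provenance map, the location names, and the list of corporate source prefixes are hypothetical and stand in for whatever lineage a real system would record.

```python
# Hypothetical provenance map: each location points to the location it was copied from.
PARENT = {
    "laptop:janice/report.xlsx": "sharepoint:/CFO/earnings.xlsx",
    "usb:janice/report.xlsx": "laptop:janice/report.xlsx",
    "laptop:janice/statement.pdf": "web:personal-bank.example/statement.pdf",
}

# Assumed policy: data originating from these prefixes counts as corporate.
CORPORATE_PREFIXES = ("sharepoint:", "onedrive:")


def origin_of(location):
    """Walk parent links back to the earliest known source of the data."""
    seen = set()
    while location in PARENT and location not in seen:
        seen.add(location)          # guard against accidental cycles in the map
        location = PARENT[location]
    return location


def should_inspect(location):
    """Decide from provenance alone; the file content is never read."""
    return origin_of(location).startswith(CORPORATE_PREFIXES)


if __name__ == "__main__":
    for loc in ("usb:janice/report.xlsx", "laptop:janice/statement.pdf"):
        print(loc, "-> inspect" if should_inspect(loc) else "-> ignore")
```

Because the decision depends only on lineage, the personal banking statement is never opened, which is the privacy benefit described above.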
Graph analysis is uniquely powerful for data
protection because instead of looking at a piece of data at a single point in
time, it can show how data is created, shared, transformed, and
consumed throughout its entire history. For these reasons, graph analysis has the
potential to transform enterprise data protection.
##
ABOUT THE AUTHOR
Howard Ting joined Cyberhaven as CEO in June 2020. In the past decade, Howard has played a critical role in scaling Palo Alto Networks and Nutanix from initial sales to over $1B in revenue, generating massive value for customers, employees, and shareholders. Howard has also served in GTM and product roles at Redis Labs, Zscaler, Microsoft, and RSA Security.