Virtualization Technology News and Information
The Power of Graph Analytics to Protect Your Data

By Howard Ting, CEO, Cyberhaven

Janice just saved an unreleased earnings report from the company's CFO folder to her team's OneDrive. John then downloads it and emails it to Rebecca, who shares it again and saves it to her personal cloud drive. Organizations are largely defined by how well they leverage their most important data, and they must balance ease of use and collaboration with security. Suppose you want to find every copy and derivative of this data in your enterprise. Or suppose you want to ensure that no data from the CFO folder is uploaded to risky or unapproved locations (such as Rebecca's personal cloud). Traditional Data Loss Prevention (DLP) tools fall short here. Tracing data flows on a graph that connects all events for each piece of data, however, makes it possible to follow the report as Janice, John and Rebecca move it around the company and eventually outside it.
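As a rough illustration of the scenario above, the following minimal sketch models each event as a directed edge between two locations and walks the graph from the CFO folder to list every downstream copy. The event records, location names, and the "unapproved" prefix list are all hypothetical and chosen only for illustration; this is not how any particular product stores or traces data.

```python
from collections import defaultdict, deque

# Hypothetical event log: (source location, destination location, actor, action).
EVENTS = [
    ("cfo_folder/earnings.docx", "onedrive/team/earnings.docx", "Janice", "save"),
    ("onedrive/team/earnings.docx", "laptop/john/earnings.docx", "John", "download"),
    ("laptop/john/earnings.docx", "email/rebecca/earnings.docx", "John", "email"),
    ("email/rebecca/earnings.docx", "personal_cloud/rebecca/earnings.docx", "Rebecca", "upload"),
    ("hr_folder/handbook.pdf", "onedrive/team/handbook.pdf", "Janice", "save"),
]

# Assumed policy: any location under these prefixes is unapproved.
UNAPPROVED_PREFIXES = ("personal_cloud/",)


def build_graph(events):
    """Map each source location to the events that copied data out of it."""
    graph = defaultdict(list)
    for src, dst, actor, action in events:
        graph[src].append((dst, actor, action))
    return graph


def trace_copies(graph, origin):
    """Breadth-first walk returning every location derived from `origin`."""
    seen = {origin}
    copies = []
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for dst, _actor, _action in graph.get(node, []):
            if dst not in seen:
                seen.add(dst)
                copies.append(dst)
                queue.append(dst)
    return copies


def risky_copies(copies, unapproved_prefixes=UNAPPROVED_PREFIXES):
    """Keep only the copies that landed in an unapproved location."""
    return [c for c in copies if c.startswith(tuple(unapproved_prefixes))]


graph = build_graph(EVENTS)
copies = trace_copies(graph, "cfo_folder/earnings.docx")
flagged = risky_copies(copies)
```

Here `copies` holds the four downstream locations of the earnings report, and `flagged` holds only Rebecca's personal cloud copy. The unrelated handbook never appears because no edge connects it to the CFO folder.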

The Evolution of Data Protection

Traditional DLP tools had two main strategies for identifying and controlling data: content-based signatures and content tagging. Many DLP products try to apply signatures to identify sensitive content, but unlike an antivirus signature, which is designed to detect a very specific threat, a DLP signature cannot be written for every single document an organization wants to protect. Instead, each signature has to cover an entire class of documents, and those documents may differ so much that it is almost impossible to define signatures that apply to every file. If signatures are too specific, they will fail to detect sensitive information (false negatives), and if they are too broad, they will incorrectly flag non-sensitive content (false positives). Neither outcome is acceptable.
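The trade-off described above can be sketched with two toy signatures. The patterns and sample documents below are invented for illustration and are far simpler than real DLP signatures, but they show how a narrow pattern misses an edited copy while a broad one flags harmless content.

```python
import re

# Hypothetical signatures: one very specific, one very broad.
NARROW = re.compile(r"Q2 2022 Earnings Report - CONFIDENTIAL")
BROAD = re.compile(r"earnings", re.IGNORECASE)

# Hypothetical documents: name -> (text, is_sensitive).
DOCS = {
    "original": ("Q2 2022 Earnings Report - CONFIDENTIAL. Revenue figures follow.", True),
    "edited_copy": ("Draft: second-quarter earnings summary (do not share).", True),
    "public_blog": ("How to read a company's earnings call transcript", False),
}


def evaluate(signature, docs):
    """Return (false_negatives, false_positives) for one signature."""
    false_negatives, false_positives = [], []
    for name, (text, sensitive) in docs.items():
        hit = bool(signature.search(text))
        if sensitive and not hit:
            false_negatives.append(name)  # sensitive content slipped through
        elif hit and not sensitive:
            false_positives.append(name)  # harmless content was flagged
    return false_negatives, false_positives


narrow_result = evaluate(NARROW, DOCS)
broad_result = evaluate(BROAD, DOCS)
```

With these samples, the narrow signature misses `edited_copy`, and the broad one catches it but also flags `public_blog`.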

This problem is compounded as organizations try to apply data protections to more diverse types of data and grows even more complicated as users edit and modify content, making it harder for signatures to keep up. Ultimately, this means signatures are only able to protect a small minority of the most predictable, structured enterprise data, such as credit card numbers, while everything else remains unprotected.

Tagging allows security teams to add specific identifiers, or tags, to sensitive assets. Security staff or the end-users themselves can apply tags to content to track and control where it goes, but tagging only protects data that the security team already knows about and has tagged in advance. Other copies of the same data may not be tagged and controlled. Additionally, end-users may make mistakes when applying tags or forget to apply them altogether. Users may inadvertently or intentionally remove tags, which prevents the data from being tracked. Data can also be converted into file formats, such as CSV, that don't support tags.

A Better Approach to Enterprise Data

Applying graph analysis to this problem changes all of this. It can be thought of as an omniscient system of cameras that tracks and correlates every movement of data within an organization. Unlike the signature-based and tagging-based approaches that have dominated DLP for years, graph analysis enables organizations to see and control their data and risk in a new light. Instead of defining data only by its bytes, graph analysis of data flows lets us understand the value of data in a business context.


With this sort of omniscience, security teams can see the full story behind every piece of data and all of its derivatives in the enterprise without any upfront tagging or signature development. The graph builds the context automatically. Graph analysis can be applied passively to any data, even without inspecting the content itself. Likewise, because graph analysis knows where data comes from, we can selectively inspect data from a corporate source while ignoring data that comes from an employee's personal online banking portal. This is extremely powerful because it allows organizations to apply strong controls without exposing sensitive content for analysis or running into potential data privacy issues.
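The idea of inspecting data selectively based on where it came from can be sketched as follows. This is a minimal sketch under stated assumptions: the provenance records, location names, and list of corporate sources are hypothetical. The function follows a file back to its origin and decides whether to inspect it without ever reading the content.

```python
# Hypothetical provenance records: location -> the location it was derived from.
PARENT = {
    "laptop/downloads/forecast.xlsx": "sharepoint/finance/forecast.xlsx",
    "laptop/desktop/forecast_v2.csv": "laptop/downloads/forecast.xlsx",
    "laptop/downloads/statement.pdf": "bank.example.com/statements/statement.pdf",
}

# Assumed policy: only data originating under these prefixes is inspected.
CORPORATE_SOURCES = ("sharepoint/", "cfo_folder/")


def origin_of(location, parent=PARENT):
    """Follow parent links back to the earliest known location."""
    seen = set()
    while location in parent and location not in seen:
        seen.add(location)  # guard against cycles in malformed records
        location = parent[location]
    return location


def should_inspect(location, parent=PARENT, corporate=CORPORATE_SOURCES):
    """Decide from provenance alone, without opening the file."""
    return origin_of(location, parent).startswith(corporate)


inspect_csv = should_inspect("laptop/desktop/forecast_v2.csv")
inspect_statement = should_inspect("laptop/downloads/statement.pdf")
```

The CSV export is selected for inspection because it traces back to a corporate SharePoint source, even though its format and name have changed. The bank statement is skipped because its origin is the employee's personal banking portal.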

Graph analysis is uniquely powerful for data protection because instead of looking at a piece of data at a single point in time, it can show the flow of how data is created, shared, transformed, and consumed through its entire history. For these reasons, graph analysis has the potential to transform enterprise data protection.




Howard Ting joined Cyberhaven as CEO in June 2020. In the past decade, Howard has played a critical role in scaling Palo Alto Networks and Nutanix from initial sales to over $1B in revenue, generating massive value for customers, employees, and shareholders. Howard has also served in GTM and product roles at Redis Labs, Zscaler, Microsoft, and RSA Security.
Published Thursday, June 30, 2022 7:32 AM by David Marshall