Unstructured Data Management Trends - VMblog Expert Interview with Krishna Subramanian of Komprise

VMblog recently reached out to and spoke with Komprise co-founder and president, Krishna Subramanian, to learn more about some of the trends taking shape in unstructured data management.

VMblog:  The tech world is always changing, from quantum computing to sustainability technology to generative AI. How are the latest developments affecting the jobs of enterprise data storage managers and how IT teams manage storage?

Krishna Subramanian:  The trends towards IT-as-a-Service, coupled with increased interest in AI, are causing enterprise storage teams to look for ways to manage data across storage vendors and deliver new and improved data services to business users. Most IT leaders (85%) in the Komprise 2023 State of Unstructured Data Management survey say that non-IT users should have a role in managing their own data, and 62% have already attained some level of user self-service for unstructured data management.

Data storage professionals will need to focus on tighter collaboration with departments, such as through showback reporting, to cut costs by finding and tiering cold data and eliminating unnecessary duplicates. End users should be able to quickly search for the types of files they need and inform IT about their intentions so that IT can set policies for data movement - such as to a cloud AI service. Storage experts may need more business acumen to deliver upon high-stakes enterprise needs for data. 

VMblog:  GenAI is getting all the headlines right now. What new challenges and strategies does this create for storage and data managers? 

Subramanian:  If you can't see all your data, and understand key characteristics about it, it's impossible to make optimal decisions about its access, mobility, protection and storage. The advent of generative AI makes this even more imperative because GenAI introduces new ways to analyze and enrich data, but it also introduces new risks for data privacy, security, ethics, leakage and accuracy. Lowering these risks requires strong data governance programs, and data storage teams have a role to play here. The right tools can help data storage managers track data usage in AI programs, prevent leakage of unauthorized data types into AI and bring insights to the table in concert with their security, privacy and legal counterparts. There is also a need for employee education: workers must understand the requirements and risks of using generative AI and how to reap benefits from it without creating liabilities for their employer and customers.

VMblog:  How hard is it today to put guardrails around the use of GenAI? What strategies do you recommend today?

Subramanian:  Companies and public sector organizations are already doing this to some extent; research, including the Komprise survey, shows that organizations are largely allowing employee use of GenAI and most have outlined some restrictions on data or applications. Yet there are limitations on guardrails due to the early, amorphous nature of the technology and a lack of understanding of how the tools work behind the scenes and what vendors are doing (or not) to protect organizations and their data. It's hard to fully control employee use, as with shadow IT. It's also hard to know which are the safest and most accurate tools to use, how to minimize risks, and how to consistently monitor and audit corporate data use. The best place to start is to create and enforce a comprehensive data governance framework that manages the Security, Privacy, Lineage, Ownership, and Governance (SPLOG) of data interactions with AI.

VMblog:  What is missing in terms of new tools or features to help IT enable GenAI without bringing the house down from a data breach, lawsuit, sensitive data leak, etc.?

Subramanian:  Given the multifarious threats from generative AI, it's hard to imagine a single governance solution that will fit the bill. Instead, there will be layers of AI security tools, starting at the network layer to prevent an AI tool from accessing blocked data or to prevent users from sending corporate data to unauthorized AI services. There would be another level of protection at the data layer, which audits which data was moved, where, when and by whom, and alerts if PII or sensitive data is being shared. Finally, there could be a security mechanism at the user layer that warns users when they are engineering prompts with corporate or sensitive data or provides feedback when prompts may be giving away too much corporate context.

VMblog:  In your recent survey on unstructured data management, cloud cost optimization came up this year as a higher priority than cloud migration. Can you explain that and how your customers are doing it? 

Subramanian:  We entered 2023 with enterprise customers seriously reevaluating their cloud spend and the big cloud service providers (CSPs) reporting declining or flattening revenue streams. After years of spending aggressively in the cloud, many enterprise IT organizations were reeling from huge, unexpected bills. Common tactics to avoid cloud waste include leveraging cost savings plans and other pricing promotions offered by the cloud vendors, using commercial spend monitoring tools, deleting duplicate and orphaned data and reducing cloud sprawl through automated discovery and corporate policies. An independent unstructured data management solution also helps by giving storage and IT managers a means to view and analyze data assets across all storage and establish automated tiering or migrating of data to the most cost-effective storage solution for current needs. This avoids data sitting endlessly on high-priced storage when it's no longer active.  

VMblog:  There's been a lot of hype about cloud file storage. How has this evolved in the last two years and what role does Komprise play? 

Subramanian:  The pandemic accelerated cloud infrastructure spending as a lifeline to rapidly resume normal business operations and rejigger product and service delivery to customers using online channels. Whether from waste, overprovisioning, high egress fees, lack of demonstrable ROI and/or not selecting the optimal cloud storage tier, high costs have resulted in organizations pulling back on cloud spending in the last year. In 2023, there's been a sharper focus on cloud cost optimization strategies. Komprise Intelligent Data Management helps by delivering a variety of metrics and reports to understand data growth, data usage and overall costs. IT can use those metrics to forecast the savings of switching to different types of storage and Komprise can automate policies to move data as it ages from top-tier file storage to lower cost, archival cloud object storage. 

VMblog:  Moving data without disruption was identified as a top challenge in the survey. What exactly is a "disruptive" experience in data management, and how does it affect both users and IT?

Subramanian:  Traditionally, moving data meant changing user access or disrupting users and applications. A common problem with traditional archiving is that when data becomes cold and IT moves it (such as to an archive), the data becomes inaccessible from the original location. So, when a user goes back to find it and it's no longer in its original location, they must hunt around for it, put in a help request to IT, and meanwhile complain to their boss. The same scenario applies to applications that store file data: the app can break if the file is no longer accessible from the original location. Another issue is when applications like instruments cannot continue to write data to a file server while you are doing a migration cutover. While this downtime is often no more than a few hours, if it occurs at a critical time, it could negatively affect customers, operations and even safety, such as in patient care. Komprise helps resolve these issues with our patented Transparent Move Technology, which moves data with no changes to user and application access. We also announced warm cutover capabilities in our latest release, which eliminate migration downtime in situations that require it.

VMblog:  What is the nirvana for self-service data management, another growing trend identified in your survey? 

Subramanian:  Self-service data management is about letting users understand their data, find what they need and contribute to data management decisions.  Komprise Smart Data Workflows is an example of how policy-driven automation can support many different needs: discovering, tagging and moving sensitive data to secure storage, finding and copying data for an audit or legal investigation, merging or deleting data assets after an acquisition, and finding the right data across storage silos to send to a cloud data lake for analysis. Self-service data management benefits IT and departmental users alike: storage managers can more readily meet goals for cost savings and compliance without conflicts and department managers achieve more say in how data is managed to meet business objectives. 

VMblog:  In your latest release you introduced Storage Insights to unify data and storage management. Why is this important now? 

Subramanian:  For storage managers, there's not been a single console to see detailed usage and capacity data on both storage and data assets. And it's not just being able to drill down into different directories and storage vendors but the ability to execute plans from the console. This is important now because unstructured data growth has exploded in recent years, creating massive strain on IT budgets and complexity plus increased security and compliance risks. In addition, storage managers are increasingly procuring storage from many different vendors. That's making it difficult to see trends to save money or manage capacity, performance and security more effectively for end users. Storage Insights is something our customers have been asking for so that they can work more effectively and productively. But don't take it from us: industry analyst Steve McDowell remarked recently in Forbes: "Storage Insights is unique in the market in providing a holistic view of an enterprise's unstructured data across cloud boundaries, including data stored on-prem on nearly every storage vendor's solution. That's powerful."

VMblog:  What industries have been the biggest adopters of unstructured data management solutions and why? How has this changed? 

Subramanian:  Most industries these days have huge volumes of unstructured data, yet some of the most relevant sectors today include healthcare, life sciences, state and local governments, higher education, oil & gas and manufacturing. These organizations are storing petabytes of data which is now growing exponentially while retention periods remain long. Unstructured data management solutions are invaluable for data-heavy organizations to regain visibility and control of their data assets that are now distributed across many data centers at headquarters, satellite offices and in the cloud. An independent unstructured data management solution can help customers know what data they have and what it is costing them no matter where the data lives. It can help optimize storage spending, right-place data into the most appropriate storage, avoid vendor lock-in, and provide insight on data that can help IT better serve their constituents - be they researchers, data scientists, executives, citizens, product developers or marketers. 

Published Friday, October 27, 2023 7:31 AM by David Marshall