Industry executives and experts share their predictions for 2021. Read them in this 13th annual VMblog.com series exclusive.
Data Management Challenges for 2021
By Noemi Greyzdorf, Director of Product Marketing, Quantum
In 2020, data grew at an
unprecedented rate, and this will only continue as we move into 2021. Everything
we do has become more digitized, and managing, storing and protecting the huge
volumes of valuable digital content created every day will be a big priority. For
example, if you look at many segments within life sciences and some segments of
IoT, multiple terabytes of data are being created and ingested every day. The
problem companies face in 2021 is a lack of insight into, and understanding of,
what their data represents to their business and what value that data has today
and in the future.
Managing human error
Human behavior is one of the
biggest problems facing data management. We move data where it doesn't belong
and then have to search through millions or billions of files to locate it. As
human behavior is almost impossible to change, we need technology innovations
that will help us not only behave better, but also provide safety nets to
protect data when we misbehave.
Industries managing large
data sets require software to manage their assets. These asset management
platforms are very common in industries like media and entertainment that have
been managing large numbers of files for decades, and this is now an emerging
need in many digitally driven organizations.
At one level, we need systems
that routinely scan every file-based storage system, gathering all the
information and putting it in a central location so it's easily accessible. This
means compiling the metadata, structuring it so it's searchable, and providing a
view of what's in your environment.
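As a rough sketch of that first level, the few lines of Python below walk a
file tree and compile basic metadata into a central catalog. The use of SQLite
as the catalog is an assumption for illustration; production tools use
purpose-built indexes.

    # Minimal metadata-harvesting sketch: walk a tree, catalog every file.
    # SQLite stands in for the central, searchable catalog (an assumption).
    import os
    import sqlite3

    def scan_to_catalog(root, db_path="catalog.db"):
        con = sqlite3.connect(db_path)
        con.execute("""CREATE TABLE IF NOT EXISTS files
                       (path TEXT PRIMARY KEY, size INTEGER, mtime REAL)""")
        for dirpath, _dirs, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue  # unreadable or vanished mid-scan; skip it
                con.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                            (path, st.st_size, st.st_mtime))
        con.commit()
        con.close()

Once compiled, the catalog answers questions the file system alone cannot,
such as which files over a gigabyte have not been touched in a year.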
The second level is applying the
knowledge gained through metadata to make your storage more efficient and
create operational gains. Assigning business tags or extensible attributes to
the data, and keeping them in a structured format tied to the system that
manages the storage resources in real time, helps keep data where it needs to
be. Indexing and cataloging this data using
AI presents a new opportunity to enrich this valuable file data with additional
metadata, making it more searchable, more accessible and more reusable.
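As a small illustration of business tagging, the sketch below stores a tag as
a file's extended attribute. It assumes a Linux file system (os.setxattr is
platform-specific), and the file name and tag values are hypothetical.

    # Attach a business tag to a file as a Linux extended attribute.
    import os

    def tag_file(path, key, value):
        # the "user." namespace is required for unprivileged xattrs on Linux
        os.setxattr(path, f"user.{key}", value.encode("utf-8"))

    def read_tag(path, key):
        return os.getxattr(path, f"user.{key}").decode("utf-8")

    # hypothetical file and tag, for illustration only
    tag_file("frame_0001.exr", "project", "spring-campaign")
    print(read_tag("frame_0001.exr", "project"))

Keeping the tag with the file in a structured, machine-readable form lets both
indexing tools and the storage system itself act on it.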
Automating data management
While humans interacting with
data presents a challenge, most data today is created and consumed by
applications.
Automating data management is
a necessary step and can be done by establishing standards by which the
application can tag data based on what process it's in, where it is in the
workflow, and what it needs to do with that data in the next step, or the next
hour, three days, five days, etc. The applications can use these data
identifiers, whether they're tags or a set of metadata variables, to make
decisions and control the data and storage resources more directly.
For example, the application
can send a call to the storage system based on those descriptors or tags to say
"I'm going to use this data at 10:05, put it into Flash because that's where
I'm going to need to consume it from," and the storage system then automatically
puts it into Flash. The application consumes it and tells the storage system to
put it back.
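A hypothetical version of that exchange is sketched below. The StorageClient
class and its promote and demote calls are illustrative stand-ins, not a real
product API; actual systems expose similar hooks through vendor SDKs or REST
endpoints.

    # Illustrative application-driven tiering call (hypothetical API).
    from datetime import datetime, timedelta

    class StorageClient:
        def promote(self, object_id, tier, needed_at):
            # a real client would schedule a move to the named tier
            print(f"schedule {object_id} -> {tier} before {needed_at:%H:%M}")

        def demote(self, object_id):
            print(f"return {object_id} to the capacity tier")

    storage = StorageClient()

    # "I'm going to use this data at 10:05, put it into flash"
    storage.promote("dataset-42", tier="flash",
                    needed_at=datetime.now() + timedelta(minutes=5))
    # ... the application consumes the data, then releases the fast tier ...
    storage.demote("dataset-42")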
Optimizing resources
Organizations must ensure
their expensive resources are being optimized and that they aren't putting data
that doesn't require performance on expensive media. This means they need to
move data across storage tiers seamlessly, transparently, automatically and in
real time. This will enable
optimal utilization of their resources, which, in turn, will reduce the overall
cost of delivering storage services.
If data needs to be delivered
to the application quickly and at low latency, organizations need to use the
most expensive tier. To avoid waste, they must make sure that only the data
that needs to be there for the application to do its processing or analytics is
placed there, and that it's removed as soon as it's no longer needed.
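As a simple illustration of that cleanup, the sweep below flags anything on
the fast tier that has not been read for a day. The mount point and idle
threshold are assumptions for the sketch, and it relies on access times being
recorded (many systems mount with noatime).

    # Find fast-tier files that are idle and ready to demote.
    import os
    import time

    FAST_TIER = "/mnt/flash"      # hypothetical fast-tier mount point
    IDLE_SECONDS = 24 * 3600      # illustrative threshold: one day idle

    def demotion_candidates(root=FAST_TIER, idle=IDLE_SECONDS):
        now = time.time()
        for dirpath, _dirs, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                if now - os.stat(path).st_atime > idle:
                    yield path    # hand off to the data mover

    for path in demotion_candidates():
        print("demote:", path)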
Preparing for the long term
Any company planning to
remain relevant needs to recognize the role archive data will play in their
long-term success and how data archiving strategies will evolve. Businesses
spanning a range of industries increasingly hold data that may be retained in
some digital format for 100 years or more, making long-term retention
necessary. These "100-year archives," combining the capabilities of intelligent
data management software and high-availability, scale-out hardware, will be
required to cope with potentially exabytes of archive data.
The most cost-effective
solutions today for archive data use high-capacity tape robotic libraries in
local, cloud and remote locations. The fastest-growing type of data center is
the hyperscale data center (HSDC), which arguably represents the pinnacle of
modern archiving strategy. HSDCs consume an estimated 2% of the world's
electricity today, a figure projected to reach 8% by 2030. Addressing the
unprecedented HSDC storage challenges of the future will require advanced,
easily scalable air-gapped tape architectures that can support erasure coding,
geo-spreading with exascale capacities, extreme reliability, and ironclad
cybersecurity protection.
100-year archives will require intelligent active archive software
incorporating a data or asset catalog, smart data movers, data classification
and metadata capabilities, highly scalable tape libraries, erasure coding, and
geo-spreading of data across zones in different locations for higher fault
tolerance, redundancy and availability.
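To see the principle at its simplest, the sketch below uses single-parity XOR
coding, the most basic form of erasure coding, to spread three shards across
three illustrative zones and survive the loss of any one of them. Production
archives use Reed-Solomon codes that tolerate multiple simultaneous failures.

    # Single-parity erasure coding: two data shards plus one XOR parity.
    ZONES = ["us-east", "eu-west", "ap-south"]   # illustrative zone names

    def encode(data: bytes):
        half = -(-len(data) // 2)                # ceiling division
        a = data[:half]
        b = data[half:].ljust(half, b"\0")       # pad to equal length
        parity = bytes(x ^ y for x, y in zip(a, b))
        return [a, b, parity]

    def recover(shards):
        # rebuild the one missing shard (None) by XOR of the survivors
        s, t = [x for x in shards if x is not None]
        rebuilt = bytes(x ^ y for x, y in zip(s, t))
        return [x if x is not None else rebuilt for x in shards]

    shards = encode(b"archive block")
    placement = dict(zip(ZONES, shards))         # one shard per zone
    shards[1] = None                             # simulate losing a zone
    print(recover(shards))

Any one zone can be rebuilt from the other two, which is the fault tolerance
and geo-redundancy described above, in miniature.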
This is the future of data
management in 2021 and beyond. Gaining an understanding of the value of your
data to your organization, using AI to index and add metadata to optimize
resources, and preparing for long-term archives will take data management to a
much better place.
##
About the Author
Noemi Greyzdorf is a 20-year veteran of the storage industry. She has worked to bring innovative technologies to market from product development, presales and sales, and marketing perspectives. Noemi has also provided guidance to large and small companies as an IDC analyst focusing on storage for unstructured data and virtualized infrastructures. Her analysis and go-to-market support were key in helping companies position, package and message disruptive technologies.