Industry executives and experts share their predictions for 2021. Read them in this 13th annual VMblog.com series exclusive.
Cloud Native, Hybrid, Unified File and Object Data
By Paul Speciale, Chief Product Officer, Scality
From the inflection point of containerization
and the unification of file and object storage to the adoption of service mesh
and the advent of lower-cost, higher-density flash media, Paul Speciale, Chief
Product Officer at Scality, shares his predictions for the data storage
industry in 2021.
Cloud-native applications and storage infrastructure will dominate enterprise IT
According to IDC, by 2023 over 500 million digital apps and
services will be developed and deployed using cloud-native approaches - that's the
same number of apps developed in the last 40 years. Most of these apps will be
targeted at industry-specific digital transformation use cases. This explosion
of new digital apps and services will define the new minimum competitive
requirements in every industry. The world of IT is rapidly turning to
cloud-native and containers as the new principle and model for application
development and underlying cloud infrastructure services.
For the storage industry, the container trend
is a major inflection point that will radically transform solution
architectures and deployments. Its impact will be similar to that of server
virtualization nearly twenty years ago and the beginning of cloud computing ten
years ago. In 2021 storage vendors will adapt to this change in application and
cloud infrastructure models by creating solutions to address the increasing
scale and agility demands of container-based services.
Containerized applications will require a
variety of storage classes to meet a range of requirements: boot volumes
and logs, transactional databases, application data accessed over traditional
file and new object APIs, and backup and long-term archives. New
container-centric storage solutions will emerge that converge Container Storage
Interface (CSI)-type persistent volumes for traditional data-centric
applications with object storage and backups. This will drastically reduce the
complexity of large-scale Kubernetes deployments, since these workloads
typically represent 80% of storage needs as measured by capacity. This new
generation of data storage will help address the new and unique challenges of
cloud-native applications.
Object storage will become a de facto storage model for data lakes
Research and Markets estimates that data lakes
will grow into a $20.1 billion market by 2025. We see this growth happening now
within our own customers' infrastructures - armed with insights from data
lakes, insurance companies are optimizing premiums, financial services institutions
are fighting fraud and bio-pharma companies are sequencing genes. To fully take
advantage of these insights, organizations require a foundational storage layer
that makes data accessible and useful.
Object storage will become the dominant storage
interface for analytics applications such as Elastic, Cloudera, Spark, Splunk,
Vertica, Weka and many others. Here's why:
- These apps have the ability to
consume data directly using Hadoop-compatible protocols such as S3A, thereby
leveraging the popular AWS S3 API.
- Semi-structured and unstructured
data sets that include image, audio, PDF, design/CAD files, etc. fit
naturally into an object store but would be quite unwieldy in a database.
- The flat (non-hierarchical) and
unlimited object namespace makes it easy for analytics applications to manage
billions of objects to effectively unbounded levels, without concern for limits
to the number of directories or files per directory.
- Object storage solutions decouple
the storage tier from the application compute tier hosting the analytics
application. This enables performance and capacity resources to scale
independently, unlike legacy systems where the two are tightly coupled and
must be scaled in lockstep.
- Features such as object lock and
versioning are essential measures to protect the valuable information stored in
data lakes against the ever-increasing number of cybersecurity attacks.
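To make the flat-namespace point above concrete, here is a minimal sketch of how a listing by prefix and delimiter can present "folders" over a flat set of object keys, in the spirit of the Prefix and Delimiter parameters of S3-style listings. It is an in-memory stand-in, not a client for any real object store; the `list_prefix` function and the sample keys are hypothetical names chosen for illustration.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Emulate a prefix/delimiter listing over a flat set of object keys.

    Returns (objects, common_prefixes): keys directly "under" the prefix,
    and the grouped pseudo-folders one level below it.
    """
    objects = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Group everything up to the next delimiter into one pseudo-folder.
            folder = rest.split(delimiter, 1)[0]
            common_prefixes.add(prefix + folder + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical object keys: there are no real directories, only key strings.
keys = {
    "logs/2021/01/app.log",
    "logs/2021/02/app.log",
    "images/scan-001.dcm",
    "README.txt",
}

top_objects, top_folders = list_prefix(keys)
month_objects, month_folders = list_prefix(keys, prefix="logs/2021/")
```

Because the hierarchy exists only in how keys are interpreted at listing time, there is no directory structure whose size limits would have to be managed as the key count grows.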
Hybrid cloud data management will be broadly embraced
The promise of the hybrid cloud model is that
it provides enterprises with the best of public clouds (on-demand scalability,
rich services and agility) and the advantages of on-premises infrastructure
(privacy, security, performance and control). For data-centric applications,
there are tremendous benefits to be gained in accelerated time-to-market,
increased data access and lower Capex and Opex. Use cases for data span from
cloud data archiving to data bursting to business continuity (disaster recovery
for data, specifically).
Disaster Recovery (DR) across two physical
data centers will no longer be required in 2021. With hybrid cloud DR solutions
managing synchronized copies of critical data on-premises and in the public
cloud, IT leaders will save thousands, if not millions, of dollars by avoiding
the costs required to maintain and service two remote locations for DR.
This forecast is a shift from our thinking in
2020 when many of us predicted that hybrid cloud would primarily be deployed to
burst compute resources across on-premises infrastructure to public clouds. The
truth is that without a uniform cloud framework, managing compute instances and
applications across hybrid environments is very complex.
Increasing adoption of flash for high-capacity storage
The new year will see a new generation of
high-density flash storage that will deliver an optimal combination of high
performance and lower prices, making it suitable for scale-out high-capacity
file and object storage.
Flash storage is now embraced for smaller-capacity
applications such as mobile and new edge computing uses, as well as for
latency-sensitive use cases. However, in the world of high-capacity storage,
high-density spinning disk (HDD) has been the main media for storing data over
the last decade. Easily able to manage hundreds of terabytes to multiple
petabytes of data, scale-out storage is ideal for storing massive quantities of
large documents, media files, videos for streaming and industry-specific data
like medical images. With the advent of lower-cost, higher-density flash media
in 2021, these use cases will naturally gravitate toward capacity-optimized
storage that takes advantage of this media to provide benefits in density,
scale and agility for multiple workloads.
Increasing convergence of object and file storage for unstructured data
Enterprises have incurred the pain of data
silos for too many years. As data grows, the need for data storage that scales - both in capacity and in the breadth of
applications that it supports - becomes an increasing priority.
According to IDC, 80% of worldwide data will be
unstructured by 2025. This is non-record-oriented content that we all engage
with on a daily basis - think documents, video, images and audio files. For
decades, enterprises have deployed applications that access file system
storage, and these types of applications will continue to be important for the
foreseeable future.
Cloud-native applications more naturally
consume and interact with storage over APIs, for which the de facto model is
now object storage over popular interfaces like the AWS S3 API. For this
reason, I predict storage systems that combine file and object models into
single unified systems will become dominant in the enterprise starting in 2021.
Adoption of the service mesh to connect and secure workloads
Complex, distributed applications are being
deployed across cloud regions, on-premises core data centers and edge locations
in existing virtual machines and new container-based services. Secure
communication between these services and controlled access to storage services
- for both regular operations and side-channels for troubleshooting -
become increasingly challenging. The traditional zone-based model using
network firewalls proves to be too binary and doesn't enforce secure protocols.
Furthermore, legacy network/firewall designs are no match for a sudden increase
in remote work and the use of personal mobile devices.
2021 will bring about increased adoption of
'service mesh' approaches to secure network communication (enforcing Transport
Layer Security (TLS)) and to handle authentication and access control (using,
for example, mutual TLS), both for workload connectivity and toward the edge.
Unlike traditional host-based authentication credentials that were long-lived,
we now move toward service instance identities with short-lived service
accounts, delegation and more granular authorization using frameworks like
SPIFFE. This gradually introduces 'zero-trust' networks (spearheaded by
Google's BeyondCorp approach) where network policies can be codified and
systematically enforced and deployed at fine granularity.
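As a rough illustration of what "codified" policy over short-lived service identities can look like, the sketch below authorizes a request from a SPIFFE-style identity (a URI of the form spiffe://trust-domain/path) and denies it once the identity has expired. This is a simplified assumption-laden sketch, not how any particular service mesh implements it: the `WorkloadIdentity` class, the `POLICY` table, the `example.org` trust domain and the workload paths are all hypothetical, and a real mesh would verify the identity cryptographically through mutual TLS rather than trust a plain string.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit


@dataclass(frozen=True)
class WorkloadIdentity:
    """Hypothetical stand-in for a verified, short-lived service identity."""
    spiffe_id: str
    expires_at: datetime


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE-style URI into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE-style ID: {spiffe_id!r}")
    return parts.netloc, parts.path


# Hypothetical policy: which actions each workload may perform on storage.
POLICY = {
    ("example.org", "/analytics/spark"): {"read"},
    ("example.org", "/backup/agent"): {"read", "write"},
}


def is_allowed(identity, action, now=None):
    """Deny by default; allow only unexpired identities named in the policy."""
    now = now or datetime.now(timezone.utc)
    if now >= identity.expires_at:
        return False  # short-lived credential has expired
    try:
        key = parse_spiffe_id(identity.spiffe_id)
    except ValueError:
        return False
    return action in POLICY.get(key, set())


now = datetime(2021, 1, 1, tzinfo=timezone.utc)
spark = WorkloadIdentity("spiffe://example.org/analytics/spark",
                         expires_at=now + timedelta(minutes=5))
can_read = is_allowed(spark, "read", now=now)
can_write = is_allowed(spark, "write", now=now)
```

The point of the design is that access follows the identity of the service instance and its expiry, not the network zone the request happens to come from.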
## About the Author
Paul Speciale is
Scality's Chief Product Officer. Paul leads Scality's global marketing
organization across both product and corporate marketing. Paul's
experience spans 20+ years in the industry, at Fortune 500 companies
such as IBM and Oracle as well as several successful startups. Before Scality,
he was fortunate to have been part of several exciting cloud computing and
early-stage storage companies, including Appcara, where he was focused on cloud
application automation solutions; Q-layer, one of the first cloud orchestration
companies (the last company acquired by Sun Microsystems); and Savvis, where he
led the launch of the Savvis VPDC cloud service. In the storage space,
Paul was VP of Products for Amplidata (acquired by Western Digital) focused on
object storage, and Agami Systems, building scalable, high-performance NAS
solutions.