Industry executives and experts share their predictions for 2022. Read them in this 14th annual VMblog.com series exclusive.
Data Management in 2022 - ML → DB, Data Meshes, and Data Lakes
By Anil Inamdar, the VP & Head of Data Solutions at Instaclustr
Get ready for a big year in data management. 2022 is going to be about machine
learning driving database indexing and analytics, the expanding tentacles of
data lakes, and the proliferation of the "data mesh."
Here are three data management predictions heading into the new year:
Database management will find new ML-powered paths to optimization
Machine learning and predictive analytics are coming to database management, helping
enterprises burst through traditional limitations set by inflexible data design
and data usage trends humans can't foresee. Database admins, once saddled with
the unenviable task of producing optimized and performant queries based on
imperfect knowledge, will get welcome relief from ML solutions that can intuit
where data resides using reliably predictive models. This capability will go
further, with ML creating entirely optimized data indexes and automatically
handling reindexing and storage management. Whereas AIOps (similarly ML-powered
solutions for operations and predictive maintenance) shows some signs of
sputtering as a much-anticipated technology, predictive database management
should find a brighter destiny as a crucial component of any database
operations strategy once its training sets are appropriately refined.
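To make the idea of a model that can "intuit where data resides" more concrete, here is a minimal sketch of a toy learned index. It is an illustration only, not any vendor's implementation: the class name `LearnedIndex` and its methods are hypothetical, and a simple least-squares line stands in for whatever model a real system might train. The model predicts a key's position in a sorted array, and a bounded binary search corrects the prediction.

```python
import bisect


class LearnedIndex:
    """Toy learned index (hypothetical): a linear model predicts where a key sits."""

    def __init__(self, keys):
        # Assumption: numeric keys; duplicates are dropped to keep the sketch simple.
        self.keys = sorted(set(keys))
        if not self.keys:
            raise ValueError("at least one key is required")
        n = len(self.keys)

        # Fit position = slope * key + intercept by ordinary least squares.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var_key = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (p - mean_pos) for p, k in enumerate(self.keys))
        self.slope = cov / var_key if var_key else 0.0
        self.intercept = mean_pos - self.slope * mean_key

        # Record the worst prediction error so lookups know how far to search.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(self.keys))

    def _predict(self, key):
        # Round the model output and clamp it to a valid array position.
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Binary search only inside the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LearnedIndex([3, 8, 15, 16, 23, 42, 108])
position = index.lookup(23)   # position of 23 in the sorted keys
missing = index.lookup(50)    # None, because 50 was never indexed
```

The better the model fits the key distribution, the smaller the error window and the less searching each lookup needs, which is the sense in which refined training data would pay off.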
The landscape around data lakes will see sprawling growth
Data lake adoption shows no signs of
slowing, and that growth will further contribute to the vibrant ecosystem of
data integration solutions popping up around this dominant modern data storage
technology. These lakefront properties (if you will) will flourish as enterprises
seek to ensure that data lakes can harness and provide benefits based on the
entirety of their data. Specifically, open source technologies like Apache Kafka
and Apache Pulsar will enable organizations
to integrate data from third-party solutions and production workloads featuring
real-time transactions. For services that call for active data awareness,
options such as Debezium and Kafka Connect will facilitate that necessary data
lake connectivity.
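As a rough illustration of what that connectivity involves, the sketch below applies Debezium-style change events to an in-memory copy of a table. It assumes the events have already been consumed from a Kafka topic and deserialized into dictionaries carrying the `op`, `before`, and `after` fields of Debezium's change event envelope. The function name `apply_change_event`, the `key_field` parameter, and the sample rows are hypothetical, and a real pipeline would write to data lake storage rather than a Python dict.

```python
def apply_change_event(table, event, key_field="id"):
    """Apply one Debezium-style change event to a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, and update all carry the new row state in "after"
        row = event["after"]
        table[row[key_field]] = row
    elif op == "d":
        # delete carries the last known row state in "before"
        row = event["before"]
        table.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported operation: {op!r}")
    return table


# Hypothetical events for an "orders" table, in the order they were produced.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"},
     "after": {"id": 1, "status": "shipped"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

orders = {}
for change in events:
    apply_change_event(orders, change)
```

Replaying events in order like this keeps the downstream copy in step with the source database without querying it directly.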
Data mesh will hand data management responsibilities to the data's closest users
"Distributed" isn't just for
architecture anymore. In 2022, responsibility for data itself will increasingly
become decentralized and distributed to the individual teams that understand it
best. This "data mesh" approach enables benefits such as self-service data
access and more efficient data management from an organizational perspective.
Vendor products already tout the term in sales materials, so expect data mesh
to soon become ubiquitous - both as a term and, more importantly, as a
practical data control strategy.
No doubt about it, 2022 will be an
exciting year for data management. And these three predictions don't even take
into account the 2021 trends that I expect will continue - such as open source
data technologies gaining enterprise adoption against more restrictive open
core solutions.
##
ABOUT THE AUTHOR
Anil Inamdar is the VP & Head of Data Solutions at Instaclustr, which
operates and supports customers' data infrastructure using
open source technologies. Anil has 20+ years of experience in data and
analytics roles. Joining Instaclustr in 2019, he works with organizations to
drive successful data-centric digital transformations via the right cultural,
operational, architectural, and technological roadmaps. Before Instaclustr, he
held data & analytics leadership roles at Dell EMC, Accenture, and Visa.
Anil lives and works in the Bay Area.