Industry executives and experts share their predictions for 2025. Read them in this 17th annual VMblog.com series exclusive. By Alberto Pan, CTO, Denodo
As 2025 approaches, companies must keep adapting to rapidly evolving technologies and innovations across GenAI, data, and cloud services. Below, we explore three key predictions for data managers in 2025, offering forward-looking insight into the emerging trends and challenges companies will face as they look to grow their businesses next year and beyond.
Prediction 1: By 2026, more than 50% of companies will identify data system distribution and heterogeneity as their primary challenge in developing generative AI (GenAI)-ready data products.
The 2024 Gartner Technical Architect survey (1) revealed that "data systems distribution across diverse platforms" was the second most cited challenge in making data architecture decisions, highlighted by 56% of architects.
GenAI applications must be able to access data across all enterprise systems in a secure, governed manner, even when the data is dynamic and needed in real time. However, current approaches to connecting GenAI applications with external data sources, such as retrieval-augmented generation (RAG), overlook the complexity of data distribution. Scaling GenAI applications beyond pilots and basic use cases will require solutions that directly address this challenge.
Companies should consider logical data management platforms, enabled by data virtualization, to establish a unified, governed data layer for AI-driven data products. Such platforms provide real-time, unified access to multiple data sources, a single point for enforcing consistent security and data governance policies, and the ability to present data in the language of the business.
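To make the idea concrete, here is a minimal sketch of the pattern described above: a GenAI application retrieves its prompt context through one unified, governed view rather than wiring a retriever to each source system. This is an illustration only, using an in-memory SQLite database as a stand-in for a logical data layer; the table, view, and column names are invented for the example.

```python
import sqlite3

# Stand-in for a logical data layer: two "source systems" (a CRM table
# and an ERP table) exposed behind a single unified, business-friendly
# view. In a real deployment the unified endpoint would be a data
# virtualization platform, not SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_customers (id INTEGER, name TEXT, segment TEXT);
    CREATE TABLE erp_orders (customer_id INTEGER, total REAL);
    INSERT INTO crm_customers VALUES (1, 'Acme Corp', 'enterprise');
    INSERT INTO erp_orders VALUES (1, 125000.0);
    -- The single view the GenAI application queries:
    CREATE VIEW customer_360 AS
        SELECT c.name, c.segment, SUM(o.total) AS lifetime_value
        FROM crm_customers c JOIN erp_orders o ON o.customer_id = c.id
        GROUP BY c.id;
""")

def retrieve_context(entity: str) -> str:
    """Fetch live, governed facts for the prompt, instead of relying
    on a static index of stale document snapshots."""
    rows = conn.execute(
        "SELECT name, segment, lifetime_value FROM customer_360 WHERE name = ?",
        (entity,),
    ).fetchall()
    return "\n".join(
        f"{name} ({segment}): lifetime value ${ltv:,.0f}"
        for name, segment, ltv in rows
    )

context = retrieve_context("Acme Corp")
prompt = f"Answer using only these facts:\n{context}\n\nQuestion: ..."
```

Because every retrieval passes through the one view, security and governance policies can be enforced in a single place, and the data reaching the model is current rather than a copy.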
Prediction 2: By 2026, over 80% of organizations building centralized cloud data warehouses or data lakehouse architectures will decide to migrate certain workloads to other environments, including alternative data processing systems within the same cloud provider, systems in other clouds, or even on-premises environments (data repatriation).
The drive for data democratization, combined with usage-based cloud pricing models, has led to soaring costs for many large organizations. Reflecting this trend, IDC's June 2024 report, Assessing the Scale of Workload Repatriation (2), found that around 80% of respondents anticipated some level of data repatriation in the coming 12 months. Because repatriation is complex and costly, organizations will also optimize costs by choosing, for each use case, the cloud environment and system that offers the best balance of efficiency and cost-effectiveness.
Companies should invest in technologies that simplify migrating use cases to the most appropriate environment as technology and business needs evolve. Open table formats (OTFs) represent data in a way that is compatible with multiple processing engines. Additionally, logical data management platforms shield data consumers from the nuances of individual processing engines, including SQL dialects, authentication protocols, and access control mechanisms.
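The dialect-shielding point above can be sketched in a few lines: a logical layer accepts one logical query and renders it for whichever engine currently hosts the workload, so consumers never rewrite their queries when a workload is migrated. This is a simplified illustration, not a real product API; the engine names and rewrite rules are assumptions chosen for the example.

```python
# Illustrative sketch: hiding per-engine SQL dialect differences behind
# one logical interface. Each entry renders a row-limit clause in the
# dialect of a different target engine.
DIALECT_LIMIT = {
    "postgres": lambda sql, n: f"{sql} LIMIT {n}",
    "sqlserver": lambda sql, n: sql.replace("SELECT", f"SELECT TOP {n}", 1),
    "oracle": lambda sql, n: f"{sql} FETCH FIRST {n} ROWS ONLY",
}

def render(logical_query: str, n: int, engine: str) -> str:
    """Translate one logical query into the target engine's dialect."""
    return DIALECT_LIMIT[engine](logical_query, n)

base = "SELECT name, revenue FROM sales ORDER BY revenue DESC"
for engine in DIALECT_LIMIT:
    print(engine, "->", render(base, 10, engine))
```

If the workload behind `sales` is repatriated or moved to a different engine, only the `engine` argument changes; the logical query that consumers write stays the same.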
Prediction 3: By 2026, more than 80% of organizations will create critical data products using multiple data platforms. This shift will pose challenges for enterprise-wide data democratization initiatives in organizations that initially envisioned a single-vendor approach.
Data product management initiatives are naturally distributed, as no single platform can optimize functionality, performance, and cost across all data products. Supporting this, fewer than 5% of joint Snowflake and Databricks customers plan to decommission one of these platforms, and the majority also use additional cloud and on-premises systems (3). In addition, in federated governance models, data product owners often select the platforms that best meet their specific functional and budgetary requirements. Moreover, with the pace of technological innovation accelerating, new data platforms will continue to emerge.
Given these dynamics, to ensure agility, consistency, and cost-effectiveness, enterprise data product strategies must account for data distribution and platform diversity.
Companies should consider adopting logical data management approaches to establish a unified infrastructure for publishing, securing, and accessing data products across diverse platforms. Such an approach would give data product owners the flexibility to select the most suitable system for their needs while also enabling the interoperability, reusability, and straightforward discovery of all data products at the global level.
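As a rough sketch of what "publish anywhere, discover in one place" could look like, the following hypothetical global catalog lets owners register products hosted on any platform while consumers search a single registry. All names, fields, and endpoint formats here are invented for illustration; a real implementation would sit on a data virtualization or catalog platform.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical global data product registry. Owners publish on whichever
# platform suits their functional and budget requirements; consumers
# discover every product through one catalog.
@dataclass(frozen=True)
class DataProduct:
    name: str
    owner: str
    platform: str   # e.g. "snowflake", "databricks", "on-prem"
    endpoint: str   # where consumers access the product

catalog: dict = {}

def publish(product: DataProduct) -> None:
    """Register a product in the global catalog, regardless of platform."""
    catalog[product.name] = product

def discover(platform: Optional[str] = None) -> list:
    """List product names, optionally filtered by hosting platform."""
    return sorted(p.name for p in catalog.values()
                  if platform is None or p.platform == platform)

publish(DataProduct("customer_360", "sales-ops", "snowflake",
                    "snow://prod/customer_360"))
publish(DataProduct("iot_telemetry", "manufacturing", "databricks",
                    "dbx://lake/iot_telemetry"))
```

The design choice is the one the prediction argues for: platform selection stays federated with the product owners, while discovery, and by extension security and reuse, is global.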
ABOUT THE AUTHOR
Alberto Pan is Chief Technical Officer at Denodo and Associate Professor (currently on leave of absence) at the University of A Coruña. He has led product development for all versions of the Denodo Platform, and has authored more than 50 scientific papers in areas such as data virtualization, data integration, and web automation.
Sources:
(1) Gartner, 2025 Planning Guide for Data Management. Published October 14, 2024.
(2) IDC, Assessing the Scale of Workload Repatriation: Insights from IDC's Server and Storage Workloads Surveys, 1H23 and 2H23. June 2024.
(3) Why Databricks vs. Snowflake is not a zero-sum game. SiliconANGLE. https://siliconangle.com/2024/07/27/databricks-vs-snowflake-not-zero-sum-game/