Virtualization Technology News and Information
Dremio 2023 Predictions: Data Lakehouses Offer Clear Strategy for Business Growth Amid Stormy Economy


Industry executives and experts share their predictions for 2023. Read them in this 15th annual series exclusive.

Forecast 2023: Data Lakehouses Offer Clear Strategy for Business Growth Amid Stormy Economy

By Ben Hudson, Product Manager at Dremio

In the past year, across different industries and global markets, enterprises have invested heavily in data strategies that streamline business processes, drive revenue, and enable innovation, while reducing operational costs.

The data lakehouse, which enables companies to run data warehousing workloads directly on the data lake, is a 2022 breakthrough that provides the architectural foundation for companies to achieve these goals. Fundamental to that breakthrough are table formats such as Apache Iceberg and Delta Lake, which make it easy for teams to transform and analyze data quickly inside the lake without having to worry about how data is physically optimized.

Here are some trends we can expect to see in 2023 as companies turn to data lakehouses as a new, open architecture for analytics, and look to use their data more efficiently to grow their business.

Expensive vendor lock-in will be left out in the cold

Being locked into expensive proprietary systems is less appealing than ever amid continuing economic uncertainty. No company can budget well or easily try new technologies while vendors hold its data hostage.

In 2023, more companies will seek an open lakehouse architecture that allows them to reclaim ownership of their data (where data is stored in open formats and standards in their own account) and run analytics workloads on their data using any processing engine.

Automation is on the rise

While data lakehouses have gained considerable traction as a data management architecture, lakehouse file management is still a tedious task for data engineers. For example, how do you physically organize data for optimal data access? How do you efficiently evolve table partitions over time to support various query patterns?

In 2023, we'll see more lakehouse companies automate file management processes, which will make data engineers' lives much easier.
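
To make the file management problem concrete, here is a minimal sketch of one task such automation might take over: grouping many small data files into batches that can each be rewritten as a single larger file. It is an illustration only, not how any particular lakehouse engine or table format implements compaction. The `DataFile` type, the `plan_compaction` function, and the 512 MB target are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical description of one data file in a lakehouse table."""
    path: str
    size_mb: int


def plan_compaction(files, target_mb=512):
    """Group small files into batches of at most target_mb (first-fit decreasing).

    The 512 MB default is an assumed target, not a recommendation.
    """
    # Files already at or above the target size are left alone.
    small = sorted(
        (f for f in files if f.size_mb < target_mb),
        key=lambda f: f.size_mb,
        reverse=True,
    )
    groups = []
    for f in small:
        for group in groups:
            # Place the file in the first batch that still has room for it.
            if sum(x.size_mb for x in group) + f.size_mb <= target_mb:
                group.append(f)
                break
        else:
            groups.append([f])
    # Rewriting a batch that holds a single file would gain nothing.
    return [g for g in groups if len(g) > 1]


if __name__ == "__main__":
    files = [
        DataFile("part-0.parquet", 600),
        DataFile("part-1.parquet", 300),
        DataFile("part-2.parquet", 200),
        DataFile("part-3.parquet", 100),
        DataFile("part-4.parquet", 50),
    ]
    for batch in plan_compaction(files):
        print([f.path for f in batch])
```

Today a data engineer often has to schedule and tune this kind of job by hand; the prediction is that lakehouse platforms will increasingly run it in the background.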

Semantic layers will get a facelift

Regardless of data architecture, companies still have major problems to tackle at the consumption phase of the analytics workflow. One notable issue is inconsistent cross-organizational reporting due to a lack of consensus on key business metrics. For example, what does it mean to be a paying customer? Is everybody calculating revenue the same way? Inconsistency arises because data consumers define their own business logic, metrics, and calculated fields within isolated BI tools, rather than leveraging agreed-upon definitions.

Companies aim to solve this issue by building a semantic layer, which provides a single, governed view of key business metrics so that data analysts and data scientists can deliver consistent reports regardless of consumption tool. While not a new concept, semantic layers are experiencing a resurgence that will continue throughout 2023, fueled by an increasing need to provide data consumers with fast, reliable self-service analytics.
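
As a rough illustration of the idea, the sketch below defines each business metric once in a shared registry, so every consumer that asks for "paying customers" or "net revenue" gets the same calculation. It does not depict any specific product's semantic layer. The order row shape, the metric definitions, and the names `Metric`, `SEMANTIC_LAYER`, and `query_metric` are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Metric:
    """One governed metric definition (hypothetical structure)."""
    name: str
    description: str
    compute: Callable[[list], float]


# Assumed row shape: {"customer": str, "amount": float, "refunded": bool}
def _net_revenue(orders):
    return sum(o["amount"] for o in orders if not o["refunded"])


def _paying_customers(orders):
    return len(
        {o["customer"] for o in orders if o["amount"] > 0 and not o["refunded"]}
    )


# The single place where business logic lives; consumers never redefine it.
SEMANTIC_LAYER = {
    m.name: m
    for m in (
        Metric("net_revenue", "Sum of non-refunded order amounts", _net_revenue),
        Metric(
            "paying_customers",
            "Distinct customers with at least one non-refunded, non-zero order",
            _paying_customers,
        ),
    )
}


def query_metric(name, orders):
    """Entry point that any BI tool or notebook would call."""
    return SEMANTIC_LAYER[name].compute(orders)


if __name__ == "__main__":
    orders = [
        {"customer": "a", "amount": 100.0, "refunded": False},
        {"customer": "a", "amount": 50.0, "refunded": False},
        {"customer": "b", "amount": 30.0, "refunded": True},
        {"customer": "c", "amount": 0.0, "refunded": False},
    ]
    print(query_metric("net_revenue", orders))
    print(query_metric("paying_customers", orders))
```

Because the definitions sit outside any individual BI tool, changing what counts as a paying customer means editing one entry rather than hunting down calculated fields across dashboards.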

Customer engagement platforms will be built directly on the lakehouse

Data warehouses and lakehouses aggregate a wealth of data, from web analytics and marketing engagement to purchasing patterns and customer success metrics, to form a comprehensive, 360-degree view of customers. However, in most cases, companies can't use this unified data to engage with customers unless it's moved into a separate, use-case-specific platform (which has its own storage layer). The pain of managing pipelines to move data between systems is well known.

In 2023, more operational tools and customer engagement platforms will announce direct integrations with data warehouses and lakehouses, so companies can act upon 360-degree customer data without data movement. New startups that develop these tools will be built directly on the warehouse or lakehouse, minimizing data pipelines.

Data lakehouses have emerged as the most efficient data management architecture to support analytics, and they provide the foundation for exciting innovation downstream in the analytics workflow. Staying abreast of trends like these in 2023 and using them to inform data strategies will help businesses weather economic conditions in the coming months.




Ben Hudson is a product manager at Dremio, where he leads strategic go-to-market initiatives. Prior to Dremio, Ben worked at IBM, where he led product management for their cloud data warehouse offering. He holds bachelor's and master's degrees in computer science from Wesleyan University, where he did research in programming language theory.

Published Wednesday, January 25, 2023 7:33 AM by David Marshall