Virtualization Technology News and Information
Crux 2023 Predictions: Shift to Cloud to Unlock New Opportunities for DataOps


Industry executives and experts share their predictions for 2023.  Read them in this 15th annual VMblog.com series exclusive.

Shift to Cloud to Unlock New Opportunities for DataOps

By Dan Lynn, SVP Product, Crux

A seismic event is taking place across enterprises and industries: analytical workloads are shifting from on-premises stacks to cloud data warehouses, and the trend is set to continue in 2023 and beyond. It's not exactly new; data center operators have watched this trend unfold and gather momentum. They understand how cloud data warehouses fit into the larger picture.

What may be less clear to operators is which other workloads will move to cloud data warehouses: applications, customer-facing analytics, machine learning, enterprise-wide business intelligence, and more. The answer may vary by organization and industry, and the timelines may extend months or years into the future, but DataOps professionals and data engineers should begin preparing now for what's next. These predictions may be helpful in charting a path.

1. Data center teams will deliver value by excelling in external data preparation. As enterprises shift away from building large, on-premises stacks in favor of moving analytical workloads into cloud data warehouses, DataOps teams will be tasked with moving data onto those systems, which could be a multiyear project. Replication is one aspect of the challenge; another is finding a way for virtualized workloads to participate productively in that ecosystem.

The workflow for preparing data is changing with the influx of external data. Spending on third-party data is exploding now that more enterprises are realizing the competitive edge it can provide, so onsite data managers may be tasked with standardizing data in a way that's useful for analysts. Data center teams that excel at preparation can make valuable contributions to a process that is increasingly central to business success.

2. Integrated solutions will gain traction. As data center teams get more involved in the workflow of moving data to cloud warehouses in the coming year, they'll require solutions to manage the quality, integration, replication, and other critical functions of incoming datasets. These capabilities will be necessary to manage the workflow, and there are individual point solutions that can handle designated functions.

But as teams pull in external data and use tools to ensure quality, integrate data, and replicate it, they'll have to manage disparate toolsets and also publish catalogs of the tools they use for analysts and data scientists. Using disparate tools for data quality, integration, and replication will be cumbersome to manage, which will drive a movement toward integrated solutions that enable enterprises to offer great data products in a managed catalog.

3. Eliminating data bottlenecks will enable a self-service approach. Enterprises aim to offer high-quality, reliable data products to analysts, data scientists, and key decision-makers, but first, they'll have to get their teams up to speed and put the right tools and technologies in place. Take baking a cake as an analogy: you may start with internal and external data as raw ingredients, but they don't offer much value until you follow a recipe to refine them into a finished product. Data ingestion and pattern discovery come first, which is like determining what ingredients you have and what recipe you can make with them. Then, as with baking, the data must be cleansed, formatted, and integrated to transform it into a consumable product. But arriving at this finished product can be a painstaking process, so the ideal future state is a "robotic kitchen" that automates these tasks. Similarly, as enterprises ingest larger volumes of data with the shift to cloud, tools that automate data transformation processes will allow DataOps teams to focus on creating value.

Data transformation isn't a one-off process; it's ongoing as new information is ingested because the vast majority of data streams change constantly. Data managers need a way to monitor data streams for problems, and they need the skills required to conduct root cause analysis.

For example, data may be missing, erroneously timed (e.g., data that is supposed to be sourced from yesterday but instead arrives from a week prior), or corrupted by programming errors, and any of these problems skews outcomes. Integrated solutions can help data managers identify bottlenecks and breakdowns and address them to ensure data quality over the long haul.
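To make the three failure modes above concrete, here is a minimal Python sketch of a batch check. It is an illustration only, not a description of any particular product: the `Record` type, the `check_batch` function, and the rule that a batch should carry a single expected business date are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical record shape: one dated measurement from a data feed."""
    as_of: date              # business date the record claims to describe
    value: Optional[float]   # None marks a missing measurement


def check_batch(records: List[Record], expected_date: date) -> List[str]:
    """Return a list of issues found in one delivered batch (empty if clean)."""
    if not records:
        return ["missing: batch is empty"]

    issues = []
    for i, rec in enumerate(records):
        # Erroneously timed: the record is dated differently than expected.
        if rec.as_of != expected_date:
            issues.append(
                f"mistimed: record {i} is dated {rec.as_of}, expected {expected_date}"
            )
        # Missing: no value was delivered at all.
        if rec.value is None:
            issues.append(f"missing: record {i} has no value")
        # Corrupted: a value arrived but is not a usable number (NaN or infinity).
        elif not math.isfinite(rec.value):
            issues.append(f"corrupted: record {i} has non-finite value {rec.value}")
    return issues


if __name__ == "__main__":
    today = date.today()
    yesterday = today - timedelta(days=1)
    batch = [
        Record(yesterday, 10.0),                  # clean
        Record(today - timedelta(days=7), 12.5),  # a week old
        Record(yesterday, None),                  # missing value
        Record(yesterday, float("nan")),          # corrupted value
    ]
    for issue in check_batch(batch, expected_date=yesterday):
        print(issue)
```

Returning a list of labeled issues rather than raising on the first failure is a deliberate choice here: it gives whoever conducts root cause analysis the full picture of a bad delivery in one pass.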

As analytical workloads continue to shift from on-premises assets to cloud data warehouses, it's clear that the future of DataOps is in the cloud. Cloud marketplaces are emerging to facilitate the exchange of data between organizations, and many enterprise leaders envision data as a profit center, but few are in a position to accomplish that with the tools they currently have.

Much work remains to be done to prepare a data catalog, and that work will require skilled teams and integrated solutions. Data center operators can partner with core engineering, application specialists, etc., to help their organizations reach a future state where data is a profitable product in a self-service cloud data marketplace.

##

ABOUT THE AUTHOR


Dan Lynn is a serial startup founder and a Techstars graduate, bringing two decades of experience building data-centric software businesses. Dan joined Crux from Precisely, where he led product strategy and user experience for a complex portfolio of products in the data integration, data quality, data governance, and master data management markets.

Prior to Precisely, Dan led product and engineering at Hitachi Vantara, launching several new products in the industrial Internet-of-Things and cloud-native verticals. Previously, Dan was also CEO at AgilData, a high-performance structured streaming data management company, and founding CTO at FullContact, an enterprise identity resolution platform.

On the side, Dan invests in and advises early-stage companies on product and technology issues.

Published Friday, November 18, 2022 7:33 AM by David Marshall