Virtualization Technology News and Information
VMblog's Expert Interviews: MapR Technologies Talks Data Fabric - Separating Fact from Fiction


The concept of a data fabric has emerged to describe a new approach to support agile development of data-driven applications, analytics, and AI.  There is, however, much confusion in the market created by a proliferation of different approaches, all describing themselves as data fabrics.  A true data fabric provides the foundation for the next generation of applications and supports AI, analytics, and IoT.  It must also scale, perform, and be reliable, and it must function on-premises, across clouds, and at the edge.

I recently spoke with Bill Peterson, VP Industry Solutions at MapR, to discuss the evolution of the data fabric.

VMblog:  Are enterprise organizations still struggling with taking advantage of data?

Bill Peterson:  Yes, absolutely. In fact, data silos not only limit an organization's ability to gain meaningful insights but also increase the cost of storing and processing data. It's critical for organizations to break down data silos while providing a unified view of all data - and this is a major struggle today. Simplified data management lets you seamlessly store and access all data across edge, on-premises, and cloud deployments. Several major impediments keep most organizations from taking full advantage of their data, including lack of consistency, performance degradation, and limited support for multiple data types. New technology is making it possible to create a modern global data fabric that can radically modernize an organization's data management strategy. Data fabric technology can also unlock the business value of all data - historical, operational, and streaming - to directly drive business transformation in a more compelling way. MapR has a unique vision and technology to create such a data fabric while also operationalizing the data for business impact.

VMblog:  Describe what you mean by a data fabric.

Peterson:  From our perspective, an enterprise data fabric connects all data across multiple locations at all times. It delivers uniform access to the actual data for operations and analytics, regardless of how widely that data is distributed. The data fabric is a new concept that has gradually emerged as the needs of modern analytics and AI applications have become better understood. Like the operating system, the database, the application server, and the web server, the data fabric is a layer that is forming because a wide variety of applications need it. A data fabric is the foundation that helps organizations reduce costs and drive innovation by making all data available at all locations. It allows organizations to create next-generation AI and analytics applications that increase revenue, improve efficiency, and manage risk.

VMblog:  And what is driving the need for such technology?

Peterson:  The growth in unstructured and streaming data is explosive. This growth holds great promise if it can be managed. From high-resolution sensors and smart applications to Industrial IoT devices, there are more data sources emitting more data than at any time in history. The issue is that the diversity of data types is causing new silos to appear for specific kinds of processing, creating islands of insight that are not easily applied to operational use. Without a mechanism to collect data, analyze it, and apply the results to operational systems, much of the value is lost.

VMblog:  Can you talk about and explain MapR's approach to data fabric?

Peterson:  The MapR data fabric unifies data silos under one platform while providing a unified view of all data. New data formats and delivery mechanisms can make combining all historical, operational, and streaming data in one platform very difficult. Let's look at one example of how we unified a new data format - containers. Containers have made it harder in some cases to share data across an organization because container storage is ephemeral: when a container goes away, the data stored inside it goes with it. This is where the data fabric comes in. We can scale it to manage the data regardless of where it's located, even if a container is no longer running. A data fabric architecture is designed to give an organization access to resources across different topologies. It acts as a management console that can see into what are traditionally separate data silos.

We recently extended our advanced container integration into the MapR Data Platform, providing persistent storage for containers and enabling the deployment of stateful containerized applications. The MapR Data Fabric for Kubernetes addresses the limitations of container use by providing easy and full data access from within and across clouds and on-premises deployments. Now stateful applications can easily be deployed in containers for production workloads, machine learning pipelines, and multi-tenant use cases.
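To make the persistent-storage idea concrete, here is a minimal sketch of how a stateful containerized application typically requests durable storage in Kubernetes, via a PersistentVolumeClaim. The storage class name `maprfs` is a hypothetical placeholder for a class backed by the data fabric; the interview does not specify the actual class or driver names.

```yaml
# Request a persistent volume so a stateful containerized app
# keeps its data even if the pod or container is rescheduled.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce          # mounted read-write by a single node
  resources:
    requests:
      storage: 10Gi
  storageClassName: maprfs   # hypothetical class backed by the data fabric
```

A pod then mounts the claim as a volume, so application state lives in the fabric rather than in the container's ephemeral filesystem and survives container restarts.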

VMblog:  Finally, what do organizations need to know about data fabric technology?

Peterson:  Look past marketing hype and the repackaging of old technologies. For example, ETL vendors offer integration and data federation tools that define data flows between sources and destinations, and label them data fabrics. Storage vendors market data fabrics that extend traditional storage networks. Virtualization vendors also extol data fabric solutions. Is every solution that provides a Hadoop distribution a legitimate data fabric? One way to separate the contenders from the pretenders is to review the technical capabilities a data fabric must have in order to reduce complexity and enable agility. Without these capabilities, a data fabric is limited in its ability to scale, stretch across locations, meet performance levels, and ultimately drive business value. Finally, be wary of vendors whose data fabric is defined by marrying together a number of their products - you'll end up with more silos.


William "Bill" Peterson is VP Industry Solutions for MapR. Prior to MapR, Bill was the Director of Product and Solutions Marketing for CenturyLink. Prior to CenturyLink, Bill ran Product and Solutions Marketing for NetApp's Analytics and Hadoop solutions. In addition to his marketing role at NetApp, Bill was the Marketing Co-Chair for the Analytics and Big Data committee, SNIA. Bill has also served as a research analyst at IDC and The Hurwitz Group, covering the operating environments, content management and business intelligence markets.

Bill did his undergraduate work at Bentley University, and has completed MBA coursework at Suffolk University.

Published Wednesday, August 08, 2018 8:08 AM by David Marshall