An interesting new storage company has launched out of stealth mode: Hammerspace. The company offers a SaaS solution dedicated to simplifying the availability and control of unstructured data across the hybrid cloud. It has created a hybrid cloud data control plane in which data is abstracted from storage and made available to any service, in any cloud or data center. By automating the management of data with metadata-driven machine learning, Hammerspace says it makes it easy to run more jobs faster. To find out more about the new company and its technology, I reached out to its CEO, long-time storage industry vet David Flynn.
VMblog: The slogan for your new company is "Data
Beyond Storage," what do you mean by that?
David Flynn: Customers want to move towards
data-driven decision making to improve the customer experience, create new
revenue streams, and enhance innovation.
To get there they adopt technologies like hybrid cloud and advanced analytics,
but progress stalls as they struggle to overcome the barriers of data silos.
People create data silos because they
manage data by managing storage, and the typical workaround
is to copy data to make it available to data engineers and scientists. Copying data to grant access is
tedious, slow, and expensive. IT is forced to deliver against unknown SLAs, since
data users often aren't sure what they need until they have it. The whole exercise doesn't scale.
Hammerspace takes a data-centric approach,
virtualizing data so that users can consume it free from the rigidity of
storage silos. With data abstracted from the
infrastructure, data workers can self-service their data on demand, anywhere
across the hybrid cloud.
VMblog: What does it mean to virtualize data, and how
does that make data accessible across the hybrid cloud?
Flynn: By managing metadata separately from
data, it becomes possible to make unstructured data appear virtually anywhere
without copying it. This data
virtualization is key to overcoming the challenge of storage silos, making data
appear present across the hybrid cloud through an active-active geo-spanning
namespace.
Hammerspace manages the namespace by
pooling together heterogeneous storage resources and replicating the full set
of metadata to any connected cloud or data center, creating a Hybrid Cloud Data
Control Plane.
As workloads access virtualized data,
the machine learning engine learns patterns to predictably fetch only the data
necessary to support the running job; while live-data mobility non-disruptively
moves files and objects to where they need to be, even during read/write.
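The idea of a namespace that replicates metadata everywhere while fetching file bytes lazily can be illustrated with a small sketch. This is purely hypothetical code (the class names, fields, and behavior are invented for illustration, not Hammerspace's implementation): every site sees the full metadata catalog, but a file's bytes are pulled to a site only when a workload there first reads it.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a namespace entry carries full metadata everywhere,
# while the file's bytes stay on one backing store until a job reads them.
@dataclass
class FileEntry:
    path: str
    size: int
    tags: dict
    location: str                        # backing store currently holding the bytes
    cached_sites: set = field(default_factory=set)

class Namespace:
    def __init__(self):
        self.entries = {}                # metadata, replicated to every site

    def add(self, entry: FileEntry):
        self.entries[entry.path] = entry

    def read(self, path: str, site: str) -> str:
        """Reading at a site fetches bytes only on first access there."""
        e = self.entries[path]
        if site != e.location and site not in e.cached_sites:
            e.cached_sites.add(site)     # lazy fetch; no bulk pre-copy
        return f"bytes of {path} served at {site}"

ns = Namespace()
ns.add(FileEntry("/data/model.bin", 4096, {"project": "ml"}, "on-prem"))
print(ns.read("/data/model.bin", "aws-east"))
```

The key point of the sketch is that listing or searching metadata never moves data; only an actual read triggers movement, which is what lets a job start immediately against a remote dataset.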
VMblog: Can you explain the role that
Hammerspace's AI plays in the infrastructure, and what does that mean for IT?
Flynn: AI is a co-pilot to IT, helping to manage the infrastructure by taking
care of tedious and repetitive tasks. A
machine learning engine balances performance against cost through continuous monitoring
of telemetry, so that IT can spend less time on mundane, reactive tasks and
more time on value-adding work.
Global
access to billions of files across the hybrid cloud demands a unique approach
to data orchestration. The Hammerspace
machine learning engine employs
a continuous market economy simulation between real data and available
infrastructure resources. The model
treats storage services as landlords with resources to lease, and data files as
tenants who spend limited currency to meet specific needs.
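The landlord/tenant model described above can be sketched in a few lines. This is an illustrative toy only (the tier names, prices, and budgets are invented, and the real engine is far more sophisticated): storage tiers lease capacity at a price, and each file spends a budget, earned from its access telemetry, on the fastest tier it can afford.

```python
# Illustrative sketch of a market-economy placement model: storage tiers act
# as "landlords" leasing capacity; files act as "tenants" with budgets.
tiers = [  # name, free capacity (GB), price per GB, latency score (lower = faster)
    {"name": "nvme",   "free": 100,  "price": 8, "latency": 1},
    {"name": "object", "free": 1000, "price": 1, "latency": 10},
]

files = [  # hotter files get bigger budgets, derived from access telemetry
    {"name": "hot.parquet",  "size": 50,  "budget": 500},
    {"name": "cold.archive", "size": 200, "budget": 250},
]

def place(files, tiers):
    placement = {}
    # place the "richest per GB" tenants first
    for f in sorted(files, key=lambda f: -f["budget"] / f["size"]):
        # a tenant considers tiers it both fits in and can afford
        options = [t for t in tiers
                   if t["free"] >= f["size"] and t["price"] * f["size"] <= f["budget"]]
        if not options:
            continue                     # no affordable tier; file stays put
        best = min(options, key=lambda t: t["latency"])
        best["free"] -= f["size"]
        placement[f["name"]] = best["name"]
    return placement

print(place(files, tiers))
# {'hot.parquet': 'nvme', 'cold.archive': 'object'}
```

Even this toy shows the appeal of the economic framing: hot data outbids cold data for fast storage, and cold data settles onto cheap capacity, without any hand-written tiering policy.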
Hammerspace continuously collects performance telemetry from
workloads for each file accessed, in the form of metadata. This monitoring
provides a rich understanding of how the infrastructure is performing, so
Hammerspace can automatically correct for issues before they happen. Real-time
decisions for data placement are fully automated, balancing performance and
cost across the hybrid cloud.
VMblog: Hammerspace provides "Data-as-a-Service" to
data consumers. Can you describe what
this is and how it changes the way data operations teams work?
Flynn: Hammerspace provides both data services and metadata services to
users through a self-service model across the hybrid cloud.
Metadata-as-a-Service is delivered as an on-demand global service,
scaling metadata beyond the POSIX standards of most file systems to keep
metadata consistent across cloud and storage resources. Metadata is extensible
with user-defined tags and keywords, expressed either ad-hoc or by
pre-determined, structured schema. Metadata can be inherited from directories
or managed at per-file granularity. This approach makes it easy and fast to
build data catalogs, perform index and search, and even enforce data governance at
file-level granularity. Metadata-enabled
global data visibility aids in data discovery, building file data catalogs, and
operationalizing efficient data pipelines to speed up data preparation.
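Directory inheritance with per-file overrides, as described above, is a simple resolution rule that can be sketched directly. This is a hypothetical illustration (the paths, tag names, and lookup tables are invented): a file's effective tags are gathered by walking its directory chain from the root down, with per-file tags winning over inherited ones.

```python
# Hypothetical sketch: resolve a file's effective tags by walking its
# directory chain, letting per-file tags override inherited directory tags.
from pathlib import PurePosixPath

dir_tags = {
    "/projects":          {"department": "research"},
    "/projects/genomics": {"pii": "yes"},
}
file_tags = {
    "/projects/genomics/run42.bam": {"retention": "7y"},
}

def effective_tags(path: str) -> dict:
    tags = {}
    for parent in reversed(PurePosixPath(path).parents):
        tags.update(dir_tags.get(str(parent), {}))  # root-most directory first
    tags.update(file_tags.get(path, {}))            # per-file tags win last
    return tags

print(effective_tags("/projects/genomics/run42.bam"))
# {'department': 'research', 'pii': 'yes', 'retention': '7y'}
```

A rule like this is what makes file-level governance tractable at scale: tagging one directory (say, marking it as containing PII) effectively tags millions of files beneath it, while still allowing individual exceptions.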
Data
services are available to protect and secure data while serving file and object
data from anywhere across the hybrid cloud.
On-demand access allows data jobs to start right away without waiting
for time-consuming and costly copying, moving the data as it's needed.
VMblog: It sounds like Hammerspace has sophisticated
technology to make it work. Can you describe the user-experience? Is it risky
or difficult to deploy?
Flynn: Customers deploy
Hammerspace as a software appliance in on-premises data centers, or a cloud
service in a public cloud (AWS, Azure, GCP). Either deployment takes less than
an hour to install and get going, and follows a pay-as-you-go model.
Hammerspace
only needs to observe the infrastructure to get started, and it doesn't store
data or take over storage systems. So
there is no risk in trying out and using the product.
Setup is a
semi-automated out-of-the-box experience.
You simply point it at your file data to project it across the hybrid
multi-cloud.
VMblog: Where can readers learn more about Hammerspace?
Flynn: Additional
information about Hammerspace is available at www.hammerspace.com
##