Today, we welcome the CEO of NeuroBlade, Elad Sity, as he
discusses all things data analytics and what data-rich organizations can do to
overcome what is fast becoming a significant inhibitor to growth
- the data analytics gap.
VMblog: Elad, please tell
the readers a bit more about yourself before we delve into the details of data
analytics.
Elad Sity: Thanks for having me. I'm the Founder and CEO of
NeuroBlade, and our mission is to innovate at the infrastructure layer to
create a new standard for how data analytics is done - making it faster and
cheaper through more acceleration and better performance.
My co-founder and CTO, Eliad Hillel, and I are graduates
of a military intelligence tech unit of the Israel Defense Forces. Before
establishing NeuroBlade in 2018, I was also a senior executive at SolarEdge, a
global supplier of smart energy solutions and one of the largest public
companies in Israel today. My background is in building systems that combine
hardware and software to deliver a bigger value proposition by leveraging both
worlds.
So, technology is a huge part of who we are, and it has
helped shape our vision for NeuroBlade.
VMblog: So, what is
NeuroBlade, and what is it you do?
Sity: I mentioned the data analytics gap, and that's a good
place to start.
Data is growing exponentially - that has been, and still
is, widely discussed. Let's go back to why we need this data. We need it for
insights, for better businesses, a better life and so on. But for that, we
need to process all of it. Unfortunately, there's a gap between the computing
resources required to process this data effectively and the amount of data
being collected, and that gap just keeps getting bigger.
We all own a phone. When the first iPhone was released, it
came with 16GB of storage. The most recent model ships with 1TB. So, in just
14 years, the phone's storage increased 64 times, doubling roughly every 2.3
years. Now imagine a world full of people using phones that generate these
massive amounts of data, and that's not even taking into account all the
metadata - location tracking, clickstream data, etc. - that other
organizations collect through our phone usage.
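The arithmetic behind that claim is easy to check. Here is a quick back-of-the-envelope sketch in Python, using the capacities quoted above:

```python
# Back-of-the-envelope check on the phone storage growth claim.
import math

first_iphone_gb = 16      # early iPhone storage
latest_iphone_gb = 1024   # 1TB on a recent top-end model
years = 14

growth = latest_iphone_gb / first_iphone_gb   # 64x
doublings = math.log2(growth)                 # 6 doublings
print(f"{growth:.0f}x growth over {years} years")
print(f"one doubling every {years / doublings:.1f} years")  # ~2.3 years
```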
Organizations have to extract insights from this data to
make sense of it. The aim is not only to generate more revenue but also to
make smarter decisions to innovate and grow - which requires systems capable
of analyzing data faster than ever before.
The latest research suggests that as much as two-thirds of
the data organizations have access to goes unanalyzed, and in many situations
IT's solution is just to throw more servers at the problem and hope it helps.
Guess what? It doesn't.
And this is where NeuroBlade comes in.
We optimize the
entire data analytics journey, end to end, thinking vertically from software
and compute down to the network and storage layers. We've been cloud native
from the beginning: by adding an acceleration layer between compute and
storage, you get a vertically integrated, purpose-built approach to the
analytics workload. Our approach is open to any database, any cloud, any
storage. We call this hypercompute for analytics - the next generation of
compute systems for analytics.
VMblog: Can you put this into context? Practically, how
much can you really improve the analytics cycle?
Sity: Using the
approach we've developed, we see orders-of-magnitude improvements in
analytics performance and TCO - on the order of 10-100x better
price-performance than the fragmented approach standard systems stick to.
That's with our current technology generation, and we see even better numbers
ahead. Traditional "piecemeal" approaches, by contrast, yield at best a
10%-30% incremental improvement, and in real-life scenarios, sometimes not
even that.
VMblog: So, tell us more about this. What is the
'traditional' approach to data analysis, and why can't it keep up?
Sity: The industry has
been approaching the data analysis gap from different angles.
Database software
providers are looking to optimize data analytics using smarter indexing or
querying techniques or even just better code.
Cloud and
infrastructure providers, including the hyperscalers, have been optimizing
either CPU-based or GPU-based approaches to run that database software faster.
For example, Arm-based AWS Graviton processors can now improve the
price-performance curve for database companies like Databricks and Snowflake.
On the
networking side, we see SmartNICs and DPU-style solutions that improve the
data flow through the system. And lastly, storage vendors are innovating with
faster and smarter storage solutions, trying to speed things up closer to the
data.
For example, a
smart SSD can run a specific kernel 1000x faster, but in real-life scenarios
it may not even give a 10% boost to the overall workload, because that kernel
is only a small fraction of the end-to-end data analytics pipeline and the
rest of it is untouched.
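That is essentially Amdahl's law at work. A minimal sketch of the math, using illustrative fractions rather than any vendor's measured numbers:

```python
# Amdahl's law: if a fraction p of total runtime is accelerated by a
# factor s, the overall speedup is 1 / ((1 - p) + p / s).
def overall_speedup(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

# Illustrative numbers, not measurements: a kernel that is 8% of the
# end-to-end pipeline, accelerated 1000x by a smart SSD.
print(f"{overall_speedup(p=0.08, s=1000.0):.3f}x")  # ~1.087x, under a 10% boost
```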
The bottom line
is: we're all trying to make each layer of the stack more computationally
powerful and more intelligent. But to achieve the "best of all possible
worlds", optimizing each piece by 10%-20% won't get you there. You need to
look at the entire stack end to end and make it all work together - and, most
importantly, seamlessly.
VMblog: How is NeuroBlade addressing this problem?
Sity: Our solution is an acceleration layer that sits between
compute and storage, called the hardware-enhanced query system, or HEQS. Our
vision is for it to work with any database, any cloud, any storage.
HEQS snaps into the core analytical engine software - Spark,
Trino, or any other database engine - through our API, and then it
automatically begins offloading the queries it can execute to run on our
acceleration appliance instead of on your existing infrastructure.
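NeuroBlade's actual API isn't public here, but conceptually the integration would look something like the following hypothetical Spark setup. The `com.neuroblade.heqs` class name, the `spark.heqs.*` config keys, and the endpoint are illustrative assumptions, not the real product API:

```python
# Hypothetical sketch of plugging an offload layer into Spark; the
# extension class and config keys below are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("heqs-offload-demo")
    # An accelerator typically hooks in as an extensions/plugin class that
    # rewrites eligible query plan nodes to run on the appliance.
    .config("spark.sql.extensions", "com.neuroblade.heqs.HeqsSparkExtensions")
    .config("spark.heqs.endpoint", "heqs-appliance.local:9000")
    .getOrCreate()
)

# The query itself is unchanged; supported operators (scans, filters,
# aggregations) would be offloaded transparently, the rest run as usual.
df = spark.read.parquet("s3://bucket/events/")
df.filter("country = 'US'").groupBy("device").count().show()
```

The design point is that acceleration arrives through a standard extension hook, so existing queries and tools keep working unmodified.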
The HEQS system is built around our HEQS processing unit,
a processor we invented and where a lot of the magic happens. It is tailored
for analytical workloads with high I/O and high memory needs, and when
integrated into our appliance solution, along with the networking and storage
elements, it enables the orders-of-magnitude faster processing we discussed.
Now, there are many hardware elements in the system. We
don't expect the user to know how to extract the performance from all of these.
Our software stack orchestrates all the hardware pieces together, making sure
there are no bottlenecks in the system and every component is utilized to the
fullest. And, as promised, it is all seamless for the user.
VMblog: What can
analyzing data faster yield in the future?
Sity: Let's just take a couple of examples.
During the pandemic, biotech companies used data for everything from tracking the spread of the virus to
developing new drugs. To support analytics like this, a powerful data
analytics system was built that reduced analysis time by 35%. Imagine what
would have happened if we could speed that up even more.
Another example, from a different domain where we have a
very big customer, is semiconductor manufacturing - and by the way, this
actually applies to any manufacturing facility. There is so much data
generated and gathered on the manufacturing floor, and allow me to quote the
customer here: "It takes me a full week to understand what happened today and
what I have to fix." If we could shorten those 7 days to a few hours, it would
allow for better yield, quality and, of course, quantity - all of which
translate into a bigger, more profitable business.
Simply put, hours of work are reduced to minutes. In 2023,
that will be game changing for many fields and businesses.
VMblog: And finally, what is your
vision for hypercompute for analytics?
Sity: We're at the beginning of an industry-wide recognition
that we need to address the infrastructure that supports analytical workloads.
We can't keep throwing servers at the problem - as
Einstein supposedly said, doing the same thing over and over and expecting
different results is the definition of insanity.
Imagine you could do everything in your day, 100 times
faster - the possibilities are endless.
NeuroBlade does this for data analytics, speeding up data
processing by 100 times. That opens up a new world of data exploration, and
it will be exciting to see what companies and people do with it.
##