In today's uncertain environment, it's more imperative than ever that
IT, data, and analytics departments have strong, dynamic plans to
support their businesses with powerful and immediate insights - insights
that require a data warehouse that is scalable, flexible, highly
available and dependable, and provides superior performance at an
economical price.
One of the companies competing in this space is Yellowbrick. The company describes itself as a modern data
warehouse that's purpose-built for hybrid cloud, providing ultimate flexibility
and performing with unmatched speed either in your data center or in
any cloud.
To find out more about the company, data analytics, the evolution of this industry and more, VMblog spoke with Justin
Kestelyn, Yellowbrick's VP of Product Marketing.
VMblog: What kinds of challenges are driving strategic decisions
about data analytics these days?
Justin Kestelyn: The core problem--finding a way to get actionable value from
as much data as possible efficiently and affordably--has been remarkably
consistent over the past few years, with the added stressors of growing data
volumes, more users with more interactive queries, and more interest in real-time/streamed
data. For most enterprises, data warehouse modernization and augmenting data
lakes with a fast, reliable SQL analytics layer should be top strategic
priorities for meeting that objective. But current solutions in those areas
haven't always delivered.
VMblog: Has the recent disruption caused by the global pandemic
changed those challenges in any way?
Kestelyn: It's axiomatic that uncertainty drives specific behaviors,
such as a strengthening interest in reliability and predictability. In this situation,
consumers of data analytics are also seeing disruptions and behavior patterns
they've never seen before. The capability to do analytics "at the speed of
thought" against arbitrary amounts of data has never been more important.
VMblog: How has the industry evolved to meet those needs?
Kestelyn: There's been a lot of innovation in the BI tooling industry
in recent years, with exciting new offerings becoming available.
On the processing side, the legacy on-prem data warehouse vendors have
struggled to refresh their platforms in a way that produces good
price/performance as data volumes grow and the number of concurrent users increases.
And cloud-native data warehouses, which are quite successful in meeting performance
service-level objectives for transient common-denominator workloads, often struggle
with predictable performance for mixed and complex queries, particularly when
lots of data and concurrent users are involved.
As for getting value from data lakes - well, it's been well
documented that repeated open source and commercial attempts to build SQL query
engines on top of data lakes just haven't panned out. Many data lakes have
literally become "roach motels" for data: It goes in, but it doesn't come out.
VMblog: What is Yellowbrick Data, and what is its role in that
evolution?
Kestelyn: Yellowbrick is the culmination of all the historical efforts
to give data analytics consumers the predictable and reliable price/performance
they need, at the massive scale they likely have (or will have), while
preserving existing investments in industry-standard BI and ETL tools and skill
sets. But to successfully apply all the lessons learned along the way in that
journey, some deep and persistent innovation was required.
To be specific, we had to re-think MPP data processing
architecture (and all the software design choices that go along with it) in
order to remove the bottlenecks between storage and CPU that to date have severely
constrained how much data can be "hot" at any given time. The result is that with Yellowbrick, data
bandwidth is massive, making immense amounts of it instantly query-able. That bandwidth
creates opportunities to support lightning-fast (sub-second) ANSI SQL queries--usually
orders of magnitude faster than alternatives--on top of PBs of data, by up to thousands
of concurrent users using common BI tools like Tableau, SAS, and MicroStrategy.
As a bonus, we've abstracted away a lot of the mundane operational tasks, like
tuning and indexing, that consume lots of valuable DBA time.
Furthermore, Yellowbrick was designed for hybrid and
multi-cloud deployments from the ground up, so you can easily run your
workloads wherever it makes the most economical sense. That part is really
important because many customers are waking up to the fact that deployment
flexibility is the ultimate way to de-risk migrations to the cloud--most can't
afford to make an all-or-nothing bet, either on the cloud generally or on a single
cloud specifically.
That's as "modern" as you can get in the data warehousing
world.
VMblog: What are the main use cases for Yellowbrick?
Kestelyn: Any workload that involves lots of data, lots of mixed and
complex queries, and lots of concurrent BI users is a good candidate for
Yellowbrick. To date, most of our customers have either modernized their data
warehouse with Yellowbrick, or are using it to complement a pre-existing data
lake.
VMblog: And finally, can you share what's next for Yellowbrick?
Kestelyn: We're going to continue to break down the barriers that have
frustrated data analytics consumers, widening our price/performance lead over
competitors. We're also doubling down on predictability and stability, and are
very focused on building out cloud functionality to make the platform even more
consumable that way.
Overall, we're bullish about the future, and about Yellowbrick's
ability to solve more customer problems in data analytics.