VMblog Expert Interview: Justin Kestelyn Details Yellowbrick Data, Data Analytics, Industry Evolution and More

In today's uncertain environment, it's more imperative than ever that IT, data, and analytics departments have strong, dynamic plans to support their businesses with powerful and immediate insights - insights that require a data warehouse that is scalable, flexible, highly available and dependable, and provides superior performance at an economical price.

One of the challengers in this space is Yellowbrick. The company describes itself as a modern data warehouse that's purpose-built for hybrid cloud, providing ultimate flexibility and performing with unmatched speed either in your data center or in any cloud.

To find out more about the company, data analytics, the evolution of this industry and more, VMblog spoke with Justin Kestelyn, Yellowbrick's VP of Product Marketing.

VMblog:  What kinds of challenges are driving strategic decisions about data analytics these days?

Justin Kestelyn:  The core problem (finding a way to get actionable value from as much data as possible, efficiently and affordably) has been remarkably consistent over the past few years, with the added stressors of growing data volumes, more users with more interactive queries, and more interest in real-time/streamed data. For most enterprises, data warehouse modernization and augmenting data lakes with a fast, reliable SQL analytics layer should be top strategic priorities for meeting that objective. But current solutions in those areas haven't always delivered.

VMblog:  Has the recent disruption caused by the global pandemic changed those challenges in any way?

Kestelyn:  It's axiomatic that uncertainty drives specific behaviors, such as a strengthening interest in reliability and predictability. In this situation, consumers of data analytics are also seeing disruptions and behavior patterns they've never seen before. The capability to do analytics "at the speed of thought" against arbitrary amounts of data has never been more important. 

VMblog:  How has the industry evolved to meet those needs?

Kestelyn:  There's been a lot of innovation in the BI tooling industry in recent years, with exciting new offerings becoming available. On the processing side, the legacy on-prem data warehouse vendors have struggled to refresh their platforms in a way that produces good price/performance as data volumes and concurrent user counts grow. And cloud-native data warehouses, which are quite successful in meeting performance service-level objectives for transient common-denominator workloads, often struggle with predictable performance for mixed and complex queries, particularly when lots of data and concurrent users are involved.

As for getting value from data lakes - well, it's been well documented that repeated open source and commercial attempts to build SQL query engines on top of data lakes just haven't panned out. Many data lakes have literally become "roach motels" for data: It goes in, but it doesn't come out.

VMblog:  What is Yellowbrick Data, and what is its role in that evolution?

Kestelyn:  Yellowbrick is the culmination of all the historical efforts to give data analytics consumers the predictable and reliable price/performance they need, at the massive scale they likely have (or will have), while preserving existing investments in industry-standard BI and ETL tools and skill sets. But to successfully apply all the lessons learned along the way in that journey, some deep and persistent innovation was required. 

To be specific, we had to rethink MPP data processing architecture (and all the software design choices that go along with it) in order to remove the bottlenecks between storage and CPU that to date have severely constrained how much data can be "hot" at any given time. The result is that with Yellowbrick, data bandwidth is massive, making immense amounts of data instantly queryable. That bandwidth creates opportunities to support lightning-fast (sub-second) ANSI SQL queries, usually orders of magnitude faster than alternatives, on top of petabytes of data, by up to thousands of concurrent users using common BI tools like Tableau, SAS, and MicroStrategy. As a bonus, we've abstracted away a lot of the mundane operational tasks, like tuning and indexing, that consume lots of valuable DBA time.

Furthermore, Yellowbrick was designed for hybrid and multi-cloud deployments from the ground up, so you can easily run your workloads wherever it makes the most economic sense. That part is really important because many customers are waking up to the fact that deployment flexibility is the ultimate way to de-risk migrations to the cloud: most can't afford to make an all-or-nothing bet, either on the cloud generally or on a single cloud specifically.

That's as "modern" as you can get in the data warehousing world.

VMblog:  What are the main use cases for Yellowbrick?

Kestelyn:  Any workload that involves lots of data, lots of mixed and complex queries, and lots of concurrent BI users is a good candidate for Yellowbrick. To date, most of our customers have either modernized their data warehouse with Yellowbrick, or are using it to complement a pre-existing data lake. 

VMblog:  And finally, can you share what's next for Yellowbrick?

Kestelyn:  We're going to continue to break down the barriers that have frustrated data analytics consumers, widening our price/performance lead over competitors. We're also doubling down on predictability and stability, and are very focused on building out cloud functionality to make the platform even more consumable that way.

Overall, we're bullish about the future, and about Yellowbrick's ability to solve more customer problems in data analytics.

Published Thursday, April 09, 2020 7:37 AM by David Marshall