Virtualization Technology News and Information
VMblog Expert Interview: Shahar Fogel of Rookout Talks Modern Debugging, Understandability, and Agile Flame Graphs

Rookout is a SaaS company helping businesses get the data they need in real time, in order to make better decisions.  VMblog recently connected with their CEO, Shahar Fogel, to discuss a number of important topics.  Rookout advocates for the importance of "Understandability" in software, so you'll want to hear how he explains the differences between Observability and Understandability.  You'll also want to learn more about their latest product, Agile Flame Graphs, which collect the most useful data across applications, such as CPU consumption and latency between microservices, and then visualize it in an easily accessible manner.

VMblog:  What does it mean to have a "modern" debugger or a "production" debugger?  Why are the old ways of debugging applications not suitable for cloud-native environments, and what other tools are tackling this problem?

Shahar Fogel:  The "old ways" of debugging are about either reproducing locally and debugging step-by-step, or about adding log lines and hoping that the issue will happen again. In Cloud-Native and distributed environments, and in production environments, these methods are not effective and are sometimes outright impossible to use. 

Complex environments are hard or impossible to reproduce locally, and debugging step-by-step means stopping your app or pod, which is something you can't do in production. And adding log lines means waiting anywhere from hours to days, even for the most agile and automation-driven organizations.

In addition, the old way of debugging does not really allow you to debug a distributed environment with thousands of instances. Engineers don't know where the issue is triggered, and they lack visibility into the exact point and the instance or server where the issue occurred.

Most tools tackling this problem speak of observability and mention distributed tracing, which is a fancier way of adding log lines (and still requires adding code and waiting). Some tools tackle the problem with a method similar to Rookout's and use terminology such as logpoints, tracepoints or snappoints (we prefer non-breaking breakpoints).
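
To make the idea of a non-breaking breakpoint concrete, here is a minimal sketch that captures a function's local variables while it runs, without pausing it and without adding log lines to it. This is not Rookout's implementation; it only illustrates the concept using Python's standard `sys.settrace` hook. The names `capture_locals`, `apply_discount` and `snapshots` are hypothetical and chosen for illustration.

```python
import sys


def capture_locals(func_name, snapshots):
    """Build a trace function that snapshots the locals of `func_name` on return.

    Illustrative only: the target function keeps running and is never paused.
    """
    def tracer(frame, event, arg):
        if frame.f_code.co_name != func_name:
            return None  # skip line-level tracing for every other function
        if event == "return":
            # Copy the locals so the snapshot outlives the frame.
            snapshots.append(dict(frame.f_locals))
        return tracer
    return tracer


def apply_discount(price, percent):
    # Hypothetical application code; note that it contains no logging.
    discount = price * percent / 100
    total = price - discount
    return total


snapshots = []
sys.settrace(capture_locals("apply_discount", snapshots))
try:
    result = apply_discount(200, 10)
finally:
    sys.settrace(None)  # always remove the hook again

print(result)
print(snapshots)
```

The function returns its normal result, and the snapshot holds the values of `price`, `percent`, `discount` and `total` at the moment it returned. A production tool would have to do this with far less overhead and across many instances, which this sketch does not attempt.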

VMblog:  Rookout claims that it's not enough simply to observe applications, and that developers need instant access to data in order to better understand them.  Can you talk about the differences between Observability and Understandability?

Fogel:  Exactly right. We've seen Observability take off as a category, given the importance of being able to observe the health of your systems across distributed environments. But it's still very challenging for developers to actually go in and get the data they need to make better decisions, and that's what we are calling Understandability - the ability to quickly understand and get to the root cause of an issue, without the long and cumbersome process which exists today.

VMblog:  You have a new product that looks great called Agile Flame Graphs.  Can you tell the readers a little more about it and how it helps developers better understand their applications?

Fogel:  Flame graphs are the type of graph you expect to see when tackling the complex challenge of profiling an application, in an attempt to find performance bottlenecks and identify the areas in your code that cause high latency, or the "hot spots" that get hit so frequently that they just get stuck, and cause a poor user experience and even crashes.

The purpose of the flame graph is to show you a color-coded breakdown of the time spent in different areas of your code. Traditional tools for profiling applications are too resource intensive to use broadly in production without a significant negative impact on application performance. They also come with a lot of noise, and can only really be operated effectively by operations experts. We wanted to simplify the traditional flame graph to just the most necessary information, so that it could be visualized easily and it didn't create a ton of overhead, while giving engineers the most precise information and metrics related to the performance of their application and code-base.
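
As a rough illustration of the data behind a flame graph, the sketch below aggregates sampled call stacks into per-stack counts and per-function shares of total samples. It is not how Agile Flame Graphs are built; the sample data and the function names `fold_stacks` and `inclusive_share` are hypothetical. The semicolon-joined "folded" form is a common convention in open-source flame graph tooling.

```python
from collections import Counter


def fold_stacks(samples):
    """Count identical call stacks; each sample is a tuple ordered root to leaf."""
    return Counter(";".join(stack) for stack in samples)


def inclusive_share(samples):
    """Fraction of samples in which each function appears anywhere on the stack."""
    total = len(samples)
    seen = Counter()
    for stack in samples:
        for name in set(stack):  # count a function once per sample
            seen[name] += 1
    return {name: count / total for name, count in seen.items()}


# Hypothetical stack samples, as a sampling profiler might collect them.
samples = [
    ("main", "handle_request", "query_db"),
    ("main", "handle_request", "query_db"),
    ("main", "handle_request", "render"),
    ("main", "idle"),
]

folded = fold_stacks(samples)
share = inclusive_share(samples)

for stack, count in folded.items():
    print(stack, count)
print(share["query_db"])
```

In a rendered flame graph, the width of each box corresponds to a share like the ones computed here: `query_db` appears in half of the samples, so it would span half the width of `main`.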

VMblog:  When I think of flame graphs, it's usually in an IT Ops context.  What made you bring an agile version into the debugging workflow, and do you think in general we are asking developers to care about too many things other than just writing code?

Fogel:  That's a good question. It reminds me of DevSecOps, where we ask whether it's reasonable for developers to now also care about security and become experts in that as well. And the truth is that these issues come back to the developer one way or another, whether it's after an issue has impacted a customer and you get a JIRA ticket, or you deal with it in a more productive, shift-left context. I think developers are very interested in having better visibility into the performance of their part of the application, but if it's too cumbersome they won't use it. That's why we built Agile Flame Graphs.

VMblog:  To drive home the point about the importance of resilience shifting left, I've heard you talk about bugs like they are "mini outages" for customers -- and that's a really interesting way to put it.  Can you expand on what you mean by that?

Fogel:  The conversations around reliability and resilience tend to be focused on the SRE / Ops side of the house. But the truth is that software bugs are a main cause of outages and customer issues. Even if the system in general is up, the customer's experience of running into a bug or even just the inability to use a specific feature can prevent them from doing what they want to do. Being able to resolve these issues faster for customers is a big part of resilience from my point of view.

Shahar Fogel, CEO of Rookout, has spent the last two decades leading data-driven businesses, products and R&D teams, from early-stage start-ups to government organizations. Shahar is passionate about software architecture and observability. His background includes roles as a cybersecurity team lead, product manager and VC investor, and he is a Cambridge University MBA alumnus.
Published Monday, March 22, 2021 2:45 PM by David Marshall