Virtualization Technology News and Information
Grafana Labs 2023 Predictions: The Rise of "Application Observability"


Industry executives and experts share their predictions for 2023. Read them in this 15th annual series exclusive.

The Rise of "Application Observability"

By Tom Wilkie, VP of Technology at Grafana Labs

We're all familiar with the term APM: Application Performance Monitoring appeared in the 1990s when the first solutions were offered by companies like Precise, Wily, Mercury Interactive, and Quest. Then around 2008, modern architectures ushered in the need for the next phase of APM that came from organizations like Dynatrace, AppDynamics, and New Relic. In recent years, the term APM has been owned by a series of vendors and skewed so much that many people don't even really understand what it means.

The fact is that the space has continued to evolve with the needs of our rapidly changing tech industry. We have moved beyond just APM and need to adopt a new vocabulary to accurately describe our environments. Thus, in 2023, say hello to "Application Observability." I can't take credit for the invention of this term, but I certainly support its rise in popularity to better describe this evolving area of observability.

Observability has traditionally been thought of as three (arguably four) pillars, but this approach is focused on the technology - and not the user or the challenges they're trying to solve. Over the past few years, this approach has been thoroughly debunked. We've started to talk about how people use these tools and the classes of problems they solve with them. An example here is Infrastructure Observability, where Ops folks, SREs and DevOps practitioners use these tools to better understand the behavior of our physical, virtual or software infrastructure - the computers, memory, and disks, but also the databases, schedulers, and queues.

Application observability is what happens when you consider observability through the lens of an application developer. Simply put, it's the way of using these tools that helps you understand the behavior of your application. Now this is, for all intents and purposes, still APM, but "application observability" is a term that more accurately describes what we really mean.

Observability came into our lexicon in recent years, and admittedly, it's seen a lot of hype. In a way, observability has been an evolution of how we think about monitoring. Every vendor, whether they are coming at it from monitoring, APM, logs, traces, etc., is helping to build a better way of monitoring modern software. Cindy Sridharan, who literally wrote the book on observability, once joked that the observability nomenclature came about because developers didn't like to do monitoring. While there may be some truth in that statement, it's indisputable that the way we develop, monitor, and deploy our software and internet infrastructure has completely changed.

Change is occurring in four critical areas:

  • Complexity: A decade ago, software was a lot simpler. We built monoliths, and failure modes were known. But today we build distributed systems and microservices, and the sheer number of interactions between the components of our stack makes it hard to even understand the numerous ways our applications can fail.
  • Volume: The volume of the data collected around our applications has exploded in recent years. Ten years ago, we might have had an application deployed on a few servers. Those servers evolved into a few dozen or a few hundred virtual machines, then a few thousand containers and microservices. The complexity of software has made the volume of data go through the roof.
  • Variability: Servers used to sit statically in a rack in the data center. You'd order new servers; they'd take weeks to be shipped, deployed, racked and stacked, and installed; and then those servers would sit around for years. But with the advent of things like containers, Kubernetes, and now serverless, our infrastructure is truly elastic: we're doing dynamic load balancing and auto scaling, and infrastructure is coming and going. Today our infrastructure is more variable than ever before.
  • Velocity: It used to be that we collected data every 10 minutes, but today we're increasingly moving toward real time. Now most organizations want to collect data multiple times per minute: every 10 seconds or even more frequently.
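
To put rough numbers on how volume and velocity compound, here is a back-of-the-envelope sketch. The fleet sizes (5 servers, 3,000 containers) and the 100 series per target are illustrative assumptions, not figures from any real deployment; only the 10-minute and 10-second intervals come from the points above.

```python
# Illustrative arithmetic only: the fleet sizes and series counts below are
# assumptions chosen to mirror the orders of magnitude described above.

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(targets: int, series_per_target: int, interval_seconds: int) -> int:
    """Return how many data points are collected per day."""
    collections_per_day = SECONDS_PER_DAY // interval_seconds
    return targets * series_per_target * collections_per_day


# "A few servers", collected every 10 minutes (hypothetical: 5 servers).
then = samples_per_day(targets=5, series_per_target=100, interval_seconds=600)

# "A few thousand containers", collected every 10 seconds (hypothetical: 3,000).
now = samples_per_day(targets=3000, series_per_target=100, interval_seconds=10)

print(f"then: {then:,} samples/day")   # then: 72,000 samples/day
print(f"now:  {now:,} samples/day")    # now:  2,592,000,000 samples/day
print(f"growth: {now // then:,}x")     # growth: 36,000x
```

Under these assumptions, 600 times more targets collected 60 times more often yields a 36,000-fold increase in data, before accounting for any growth in the number of series each target exposes.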

Together, these four factors forced a paradigm shift in how we think about monitoring. The fact is that most failure modes are no longer understood in advance. If we solely rely on checking the known knowns, our monitoring will quickly fall short. Monitoring has become a data analytics problem. The reality is that the advancements of our complex systems have created the need to see into the realm of unknown unknowns - and this is where observability shines.  

Industry analysts have also noted the connection between APM and observability. Last year, for the first time, Gartner added the word "observability" to the title of its APM Magic Quadrant and expanded the definition of the space to include observability.

We're creatures of habit, and I expect the term APM will stay in our vocabulary for quite some time. But we consider "application observability" the better phrase to reflect many of the modern environments and use cases users encounter today.

Currently, when we think of application observability, it does not yet include the complete platform of all the things that have been included in APM over the years. When one platform tries to do it all, it often falls short in one way or another, and the APM platforms haven't been immune to this. Compare that to something wildly popular like open source technology, which has a brilliant underpinning but - let's just say it - a slightly crappy experience, right?

There are also parallels within infrastructure observability. This is where tools like Prometheus and Grafana are more commonly used to help users understand the behavior of their infrastructure. I like to think about it like this: "Infrastructure Observability" mostly manifests in the use of logs and metrics, whereas "Application Observability" manifests in the use of distributed tracing and continuous profiling. Different techniques and technologies are needed for different jobs.

It's important to note that, despite this being my 2023 prediction, this is nothing new. People have been using metrics, logs, traces, and profiles to understand their application behaviors for ages. But having a common language and set of terms to describe what we're doing helps share knowledge and learning, and hopefully newer terms with more focused meaning will help avoid misunderstanding. And not using terms which have been gerrymandered by vendors is always a win.

And so, my prediction for 2023 is not just the rise of the term "application observability" - but also steps toward improving the user experience in the open source world.



Tom Wilkie 

Tom Wilkie is VP of Technology at Grafana Labs, a member of the Prometheus team, and one of the original authors of the Cortex and Loki projects. He serves as a member of the CNCF Governing Board. In his spare time, he builds 3D printers and makes craft beer.

Published Friday, January 27, 2023 7:36 AM by David Marshall