Complex Context - The Key to Winning the War Against Application IN-Visibility

By Chris Farrell, Observability Strategist at Instana

One of my favorite concepts made popular by the "Big Bang Theory" is Schrodinger's Cat. I have an S-Cat PC sticker AND T-Shirt, each with a different joke about Mr. Schrodinger's feline. One is "Wanted: Dead and Alive," which gets a laugh from people that know what it is, and some extremely serious double-takes and frowns from those that think I'm advocating catricide.

But as fun as the T-shirt is, it's the sticker that I want to focus on today. It's the start (and end) of a great Schrodinger's Cat joke:

Schrodinger's Cat walks into a bar...

... and doesn't.

So what does this hilarious little physics and philosophy joke have to do with Cloud-Native Applications? I'm glad you asked - or at least read this far. I think of Mr. S whenever someone wants to start a debate with me about "Observability," especially comparing "O" (as I like to call it) to "Monitoring" - or Monitoring's Irish twin, "Visibility." It's the biggest non-debate debate in IT today. Now before you decide to love me or hate me (read the full article, then you can hate me), let me first say that I am an advocate for Observability - and all the hidden meanings when someone says their strategy for delivering high performance is the Big O!

No, what I'm talking about is the constant debate that Observability advocates seem to want to have about monitoring and visibility, especially as it pertains to something near and dear to my heart - managing application performance and Application Performance Management (APM).

Here's my answer to the all-important "Observability vs. Monitoring" debate.

If someone asks me "should I strive for visibility or observability?" I say "Yes!"

When somebody asks "do I need monitoring or observability?" I say "Yes!"

And if anyone ever asks "should I use proprietary or open source observability?" I say "Absolutely!"

Maybe one day, I can write a "Who's on First" script to go with these answers. In the meantime, let me explain why I don't think those are the right questions to ask - and explain my "answers."

Let's back up to the early days of EAI applications, J2EE and the rise of Web Application Servers - also the birth of APM, just about 20 years ago. Even back then, the debate raged: "visibility" or "observability." It just wasn't widely referred to as observability at the time. Also, there was only one observability API, and while it was a community standard (ARM, Application Response Measurement), it wasn't open source. More sources of data appeared over time - even the J2EE servers got into the business, delivering object-level timing and load metrics via another API. So why were the APM solutions so successful that they spawned multiple generations of ever-growing successful companies? In a word - Complexity.

And now you're thinking "Complexity in a 3-tiered monolith? You must be joking."

Of course, I'm not joking. Yes, the Business Logic code within a monolithic application doesn't have the obvious complex interactions of microservice, multi-cloud and Cloud-Native applications. But these were applications running mission-critical processes and delivering transactional requests to end users. Code execution could be as complex as you wanted it, but in the world of Observability, we're concerned with interactive complexity. For these applications, the complexity appeared at the layer behind the app server, between and including the back-end legacy systems that the Java application relied on for executing user requests.

There are two areas where complexity is introduced into a monolithic J2EE application. The first is where the application uses App Server (and JVM) resources, from simple I/O to graphics services. The other is the interface to back-end systems, whether called directly with APIs or handled through specialty App Server (Java) connectors. Both areas were beyond the reach of those APIs - making them invisible to the "Observability" solutions of the day. While "easier" problems could be solved with simple response time, the more difficult problems - the kind that landed banks on the front page of The Wall Street Journal for failed online banking applications - required visibility into the complex areas, with an understanding of how and why different back-end systems were called. Thus, even on monoliths, the concept of CONTEXT is critical to understanding how applications execute their requests.
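To make that concrete, here's a minimal Java sketch of the idea - recording not just how long a back-end call took, but which user transaction triggered it and which back end it hit. The class, query and names are hypothetical, and real APM agents of that era captured this context automatically through instrumentation rather than hand-written logging:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class AccountDao {
        private final Connection conn;  // assume a connection from the app server's pool

        public AccountDao(Connection conn) { this.conn = conn; }

        public double getBalance(String accountId, String transactionName) throws Exception {
            long start = System.nanoTime();
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT balance FROM accounts WHERE id = ?")) {
                ps.setString(1, accountId);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return rs.getDouble(1);
                }
            } finally {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                // The context is the point: WHICH transaction hit WHICH back end, for how long
                System.out.printf("txn=%s backend=accounts-db elapsed=%dms%n",
                        transactionName, elapsedMs);
            }
        }
    }

A bare response-time number tells you a request was slow; the transaction name and back-end identifier tell you where to look.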

Let's jump ahead - skipping past Service-Oriented Architecture and getting to the heart of today's debate - how best to monitor the performance of Cloud-Native applications (microservices, containers, orchestration). For Cloud-Native applications, the architecture itself is inherently complex, spanning numerous service layers and a polyglot of languages. Nobody would argue that the layout, architecture and usage of microservice applications are anything but complex. Just the communication from one service to another can be a problem.

Analogous to the J2EE APIs (both native to the app servers and coded in by developers), there are multiple methods for getting basic performance data (and some NOT so basic data) about microservices. Many of the technologies operating in cloud-native applications have performance APIs built into the platform / infrastructure - from databases to message queues, there's a set of basic data available to anyone with access to the API.

That's in addition to the new set of monitoring AND TRACING APIs available to developers to insert observability into their application code - Prometheus is an example of metric instrumentation, while Jaeger, Zipkin and OpenTracing can help obtain distributed trace data. From Observability tools to next-generation APM and log analysis solutions, this data is part of the ability to see performance. But as we learned twenty-odd years ago, it's not JUST about being able to take measurements. To optimize performance - AND SOLVE PERFORMANCE ISSUES - we have to be able to break down the complexity and understand relationships in that part of the system.
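To illustrate the metrics side, here's a minimal sketch using the Prometheus Java client (simpleclient). The service and metric names are invented for the example; a tracing API such as OpenTracing would be wired in alongside it in much the same way:

    import io.prometheus.client.Histogram;
    import io.prometheus.client.exporter.HTTPServer;

    public class CheckoutService {
        // Latency histogram, exposed on the /metrics endpoint for Prometheus to scrape
        static final Histogram REQUEST_LATENCY = Histogram.build()
                .name("checkout_request_latency_seconds")
                .help("Latency of checkout requests.")
                .register();

        public static void main(String[] args) throws Exception {
            new HTTPServer(8080);  // serves /metrics
            while (true) {
                Histogram.Timer timer = REQUEST_LATENCY.startTimer();
                try {
                    handleRequest();          // the service's real work goes here
                } finally {
                    timer.observeDuration();  // record elapsed seconds into the histogram
                }
            }
        }

        static void handleRequest() throws InterruptedException {
            Thread.sleep(50);  // stand-in for real work
        }
    }

Useful - but notice that a histogram like this says nothing about which other services a request touched. That's the relationship problem the rest of this article is about.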

The real question to ask for proper Cloud-Native application performance monitoring is: where is the application complexity that can derail performance, create outages and hide under the guise of "there's a problem, but all lights are green"? Remember, in monoliths, that complexity lies in the code-to-back-end layer. But in Cloud-Native environments, the answer isn't just different - it's on a different scale. And that answer is - the complexity is EVERYWHERE.

Loosely connected services can take any path for transactions. Developers are optimizing the services they own by choosing specific technologies, so the need to monitor, support and debug multiple databases, messaging systems, even programming languages has become permanent.

So with complexity everywhere, how can we break down the walls that complexity creates, to understand where our bottlenecks are and what's causing any application and/or service problems?

The answer is the same as with monoliths - you have to take measurements and gather data with an understanding of the context of the actual user calls and use cases. For Cloud-Native, that means being able to do a few key things:

  • Discover changes (new services, deleted services and service updates) in real time
  • Understand all the upstream and downstream relationships (inter-dependencies) of each service
  • Correlate all the information at hand (performance metrics, individual traces and profiles if you have them), pulling in data from all sources (open source APIs, monitoring agents, traces, profiles, etc.) - see the sketch after this list
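As a sketch of the second and third items, here's how upstream/downstream relationships can be derived by correlating trace spans through their parent IDs. The Span record is illustrative, not any particular tracer's wire format:

    import java.util.*;

    public class DependencyGraph {
        record Span(String spanId, String parentId, String service) {}

        // Map each service to the set of services it calls directly
        static Map<String, Set<String>> build(List<Span> spans) {
            Map<String, Span> byId = new HashMap<>();
            for (Span s : spans) byId.put(s.spanId(), s);

            Map<String, Set<String>> calls = new HashMap<>();
            for (Span s : spans) {
                Span parent = byId.get(s.parentId());
                // A parent span in a different service = a cross-service call
                if (parent != null && !parent.service().equals(s.service())) {
                    calls.computeIfAbsent(parent.service(), k -> new TreeSet<>())
                         .add(s.service());
                }
            }
            return calls;
        }

        public static void main(String[] args) {
            List<Span> spans = List.of(
                    new Span("1", null, "frontend"),
                    new Span("2", "1", "cart"),
                    new Span("3", "2", "inventory-db"),
                    new Span("4", "1", "payments"));
            System.out.println(build(spans));
            // e.g. {frontend=[cart, payments], cart=[inventory-db]}
        }
    }

Run over a real trace stream instead of four hard-coded spans, the same fold yields a live service map - and a span arriving with a never-before-seen service name is exactly the "discover changes in real time" signal from the first item.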

Only with all the data in hand - and in the analysis engine (whatever that engine is) - can a Dev+Ops team begin to understand how their applications are doing, and how to optimize performance, resource usage and service levels.

##

***To learn more about containerized infrastructure and cloud native technologies, consider joining us at KubeCon + CloudNativeCon NA Virtual, November 17-20.

About the Author

Chris Farrell, Observability and APM Strategist 


Chris Farrell is a Technical Director and Observability Strategist at Instana. He has over 25 years of experience in technology, from development to sales and marketing. As Wily Technology's first Product Management Director, Chris helped launch the APM industry about twenty years ago. Since then, Chris has led Marketing or Product strategy for 5 other APM / Systems Management ventures.

Chris's diverse experience runs the technology gamut, from manufacturing engineering and development for IBM ThinkPad to managing global sales and marketing teams.

Chris lives in Raleigh, North Carolina with his wife and two Siamese cats. He enjoys both watching and playing basketball in his spare time - USUALLY. He has a BSEE and MBA from Duke University.

Published Thursday, November 05, 2020 7:26 AM by David Marshall