By Richard Whitehead, Chief Evangelist at Moogsoft
Observability capabilities, such as finding and prioritizing outages in seconds, have exploded in demand over the past year. While unified monitoring, log management and event management vendors have reoriented their technology portfolios toward observability, this new concept has created considerable confusion in the market for those stuck on legacy tools and processes.
IT Operations and Service Management (ITOSM)
professionals often wonder whether the new terminology signifies something
genuinely innovative that responds to new requirements, or whether it is just
another attempt to use marketing buzzwords to stay relevant without going
through the hoops of actual technological change. DevOps-oriented Software
Engineers (SWEs) and Site Reliability Engineers (SREs), who have historically
been at the forefront of the demand for observability, are even more skeptical.
Naturally, questions arise. Is it feasible to repurpose or modify traditional
monitoring technology to meet the demands of observability? What actions
should vendors take to ensure long-term success?
Observability is a critical component of legacy
modernization, but let's dig in more to get the skeptics on board.
## What is Observability in context?
The term observability originates from Control
Theory. A system is considered observable if its state can be inferred from its
inputs and outputs. So what does this mean for DevOps, and why the increased
interest in observability within log management, event management and
monitoring?
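For readers who want the formal version, here is a minimal sketch of that Control Theory test, written in Python with NumPy purely as an illustration: a linear system is observable when the stacked observability matrix has full rank, meaning its hidden internal state can be reconstructed from outputs alone.

```python
import numpy as np

# A linear system  x' = A x,  y = C x  is observable when the
# observability matrix [C; CA; ...; CA^(n-1)] has rank n, i.e. the
# internal state can be inferred from the outputs.
def is_observable(A: np.ndarray, C: np.ndarray) -> bool:
    n = A.shape[0]
    blocks = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return np.linalg.matrix_rank(np.vstack(blocks)) == n

# Example: a two-state system whose single sensor sees enough
# to reconstruct both states.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
print(is_observable(A, C))  # True
```

The DevOps version of the question is the same one: can you infer what your system is actually doing purely from the telemetry it emits?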
The Application Performance Monitoring (APM)
industry took off between 2013 and 2015. Digitalization and the "software eats
the world" era elevated IT, making these teams critical to business success,
and applications were the primary link between the business, its customers
and IT. As a result, the tools created to monitor these applications also took
off in importance. Unfortunately, in tandem, pressure rose to increase agility
and move quickly while still monitoring application performance effectively.
And all of it fell on developers.
These changes led the DevOps community to
evaluate the APM tools their ITOSM teams were using, and they found that the
scales at which these products operated were much too slow for the systems
SREs were managing. APM technologies were failing to make DevOps applications
observable.
## Observability to the rescue
Not surprisingly, the DevOps community was right.
The world had changed, and observability had to become a reality. Step one was
to recognize that the data feeding these tools (metrics and logs) needed to be
supplemented, with ingestion rates coming as close as possible to matching the
pace of change within the system. Data feeds had to evolve to be much more
granular and low-level, coming directly from underlying telemetry without any
layers of structure intervening. This was achieved by adding tracing to the
mix (forming the "three pillars" of observability), along with enhanced
self-reporting from services. Finally, events were added, creating the
observability MELT (metrics, events, logs and traces) stack. While this made
systems more observable, there was still a missing piece: artificial
intelligence.
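To make the MELT stack concrete, here is a purely illustrative Python sketch of the four telemetry types side by side; the record shapes and field names are my own stand-ins, not any vendor's schema.

```python
from dataclasses import dataclass, field
import time, uuid

@dataclass
class Metric:        # numeric sample, e.g. a gauge or counter
    name: str
    value: float
    ts: float = field(default_factory=time.time)

@dataclass
class Event:         # discrete state change, e.g. "deploy finished"
    source: str
    message: str
    severity: str = "info"
    ts: float = field(default_factory=time.time)

@dataclass
class LogLine:       # unstructured or semi-structured text
    service: str
    line: str
    ts: float = field(default_factory=time.time)

@dataclass
class Span:          # one hop of a distributed trace
    trace_id: str
    name: str
    duration_ms: float

# A single checkout request might emit all four at once:
trace_id = uuid.uuid4().hex
telemetry = [
    Metric("checkout.latency_ms", 184.0),
    Event("deployer", "checkout v2.3.1 rolled out"),
    LogLine("checkout", "payment authorized for order 1234"),
    Span(trace_id, "POST /checkout", 184.0),
]
```

Note how a single request can emit all four types at once; the trace span is what ties the others back to one user-visible interaction.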
The only way insightful patterns can be mapped in
these data feeds is if AI or machine learning is deployed. Data moves at
microsecond time scales, and insights must move just as fast. Modern
observability tools must have two features: they must rely on low-level,
granular data feeds, and they must deploy AI to identify significant patterns
in those data sets. If those two components are not present, turn the vendor
down, because they are not offering observable monitoring.
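As a toy illustration of that second feature, the sketch below flags samples in a metric feed that deviate sharply from a rolling baseline. A simple z-score is only a statistical stand-in for the AI the article describes; production observability tools apply far richer machine learning, but the goal of surfacing significant patterns without hand-authored thresholds is the same.

```python
import math
from collections import deque

# Toy stand-in for the AI layer: flag metric samples that deviate
# sharply from a rolling baseline. Real observability products use far
# richer ML; this only illustrates "find significant patterns in the feed".
def anomalies(samples, window=60, threshold=3.0):
    recent = deque(maxlen=window)
    for ts, value in samples:
        if len(recent) >= window // 2:            # wait for a baseline
            mean = sum(recent) / len(recent)
            var = sum((x - mean) ** 2 for x in recent) / len(recent)
            std = math.sqrt(var) or 1e-9          # guard flat baselines
            if abs(value - mean) / std > threshold:
                yield ts, value                   # significant deviation
        recent.append(value)

# A flat feed with one spike at t=100:
stream = [(t, 100.0) for t in range(100)] + [(100, 450.0)]
print(list(anomalies(stream)))                    # -> [(100, 450.0)]
```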
Monitoring cannot operate effectively while rooted
in legacy tooling. SREs and DevOps teams have to move rapidly, and the only
way to achieve this efficiently is by leveraging real observability with AI.
Together, the pair can bring laggard systems into the future.
## ABOUT THE AUTHOR
As
Moogsoft's chief evangelist, Richard Whitehead brings a keen
sense of what is required to build transformational solutions. A former CTO and
technology VP, Richard brought new technologies to market and was responsible
for strategy, partnerships and product research. Richard served on Splunk's
Technology Advisory Board through their Series A, providing product and market
guidance. He served on the advisory boards of RedSeal and Meriton Networks, was
a charter member of the TMF NGOSS architecture committee, chaired a DMTF Working
Group, and recently co-chaired the ONUG Monitoring & Observability Working
Group. Richard holds three patents and is considered dangerous with JavaScript.